4 results for Motifs

in Bucknell University Digital Commons - Pennsylvania - USA


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features actually contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class.
Results: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function first harvests the entire set of 4- to 8-grams from the protein sequences of the different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution resulted in a large increase in the number of discriminatory n-grams harvested. Because the dataset is unbalanced, the frequencies of the n-grams are normalized using a dampening factor, which gives more weight to the n-grams that appear in fewer classes and vice versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is generic and can therefore be widely applied to detect class-specific motifs in many protein sequence classification tasks.
Conclusion: The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The use of amino acid substitution scores for similarity detection and of the dampening factor to normalize the unbalanced datasets has a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes, with a potential application to detecting proteome-specific motifs of different organisms.
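The pipeline described in the abstract above (harvest n-grams, pool similar ones, dampen by class spread, apply a threshold) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the scoring formula, so the dampening term `log(1 + N / classes_containing)` is an assumed inverse-class-frequency weight, `POSITIVE_PAIRS` is only a small hand-picked excerpt of residue pairs with positive BLOSUM62 scores, and all function names are hypothetical.

```python
from collections import Counter
from math import log

# Small illustrative excerpt of residue pairs with positive BLOSUM62 scores.
# A real run would use the full matrix; this subset is an assumption.
POSITIVE_PAIRS = {frozenset(p) for p in ("IV", "IL", "LM", "KR", "DE", "FY", "ST")}


def similar(a, b):
    """True if two same-length n-grams match or substitute positively at every position."""
    return len(a) == len(b) and all(
        x == y or frozenset((x, y)) in POSITIVE_PAIRS for x, y in zip(a, b)
    )


def ngrams(seq, n_min=4, n_max=8):
    """Yield every contiguous n-gram of length n_min..n_max in seq."""
    for n in range(n_min, n_max + 1):
        for i in range(len(seq) - n + 1):
            yield seq[i:i + n]


def class_counts(sequences, n_min=4, n_max=8):
    """Count n-gram occurrences over all sequences of one class."""
    counts = Counter()
    for seq in sequences:
        counts.update(ngrams(seq, n_min, n_max))
    return counts


def discriminative_ngrams(classes, threshold, n_min=4, n_max=8):
    """Return {class: {ngram: score}} for n-grams scoring at or above threshold.

    Assumed score: (pooled relative frequency in the class) times a dampening
    factor that shrinks as the n-gram (or a similar one) appears in more classes.
    """
    counts = {c: class_counts(seqs, n_min, n_max) for c, seqs in classes.items()}
    result = {}
    for c, grams in counts.items():
        total = sum(grams.values())
        result[c] = {}
        for g in grams:
            # Pool the counts of all similar n-grams within the same class.
            pooled = sum(k for h, k in grams.items() if similar(g, h))
            # Number of classes containing this n-gram or a similar one (>= 1).
            spread = sum(
                any(similar(g, h) for h in other) for other in counts.values()
            )
            score = (pooled / total) * log(1 + len(counts) / spread)
            if score >= threshold:
                result[c][g] = score
    return result


# Toy data, invented purely for illustration.
toy = {
    "nuclear": ["AKRKRA", "GKKRKG"],
    "other": ["AGAGAG", "GAGAGA"],
}
hits = discriminative_ngrams(toy, threshold=0.3, n_min=4, n_max=4)
print(sorted(hits["nuclear"]))
```

In this toy run the 4-grams KRKR and KKRK count as similar through K/R substitution, so their counts are pooled and both clear the threshold, while the singleton 4-grams in the same class do not. The pairwise similarity scan is quadratic in the number of distinct n-grams and is only meant to make the idea concrete.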

Relevance:

10.00% 10.00%

Publisher:

Abstract:

An efficient mixed molecular dynamics/quantum mechanics model has been applied to the water cluster system. The use of the MP2 method and correlation consistent basis sets, with appropriate correction for BSSE, allows for the accurate calculation of electronic and free energies for the formation of clusters of 2−10 water molecules. This approach reveals new low energy conformers for (H2O)n=7,9,10. The water heptamer conformers comprise five different structural motifs ranging from a three-dimensional prism to a quasi-planar book structure. A prism-like structure is favored energetically at low temperatures, but a chair-like structure is the global Gibbs free energy minimum past 200 K. The water nonamers exhibit less complexity with all the low energy structures shaped like a prism. The decamer has 30 conformers that are within 2 kcal/mol of the Gibbs free energy minimum structure at 298 K. These structures are categorized into four conformer classes, and a pentagonal prism is the most stable structure from 0 to 320 K. Results can be used as benchmark values for empirical water models and density functionals, and the method can be applied to larger water clusters.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The PM3 quantum-mechanical method has been used to study large water clusters ranging from 8 to 42 water molecules. These large clusters are built from smaller building blocks. The building blocks include cyclic tetramers, pentamers, octamers, and a pentagonal dodecahedron cage. The correlations between the strain energy resulting from bending of the hydrogen bonds formed by different cluster motifs and the number of waters involved in the cluster are discussed. The PM3 results are compared with TIP4P potential and ab initio results. The number of net hydrogen bonds per water increases with the cluster size. This places a limit on the size of clusters that would fit the Benson model of liquid water. Many of the 20-mer clusters fit the Benson model well. Calculations of the ion cluster (H2O)40(H3O+)2 reveal that the m/e ratio obtainable by mass spectrometry experiments can uniquely indicate the conformation of the 20 water pentagonal dodecahedron cage present in the larger clusters.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

An efficient mixed molecular dynamics/quantum mechanics model has been applied to the water cluster system. The use of the MP2 method and correlation consistent basis sets, with appropriate correction for BSSE, allows for the accurate calculation of electronic and free energies for the formation of clusters of 2−10 water molecules. This approach reveals new low energy conformers for (H2O)n=7,9,10. The water heptamer conformers comprise five different structural motifs ranging from a three-dimensional prism to a quasi-planar book structure. A prism-like structure is favored energetically at low temperatures, but a chair-like structure is the global Gibbs free energy minimum past 200 K. The water nonamers exhibit less complexity with all the low energy structures shaped like a prism. The decamer has 30 conformers that are within 2 kcal/mol of the Gibbs free energy minimum structure at 298 K. These structures are categorized into four conformer classes, and a pentagonal prism is the most stable structure from 0 to 320 K. Results can be used as benchmark values for empirical water models and density functionals, and the method can be applied to larger water clusters.