945 results for Protein structures
Clustering of Protein Structures Using Hydrophobic Free Energy And Solvent Accessibility of Proteins
Abstract:
Background Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information including Gene Ontology annotations and kernel fusion have respective limitations. The aim of this work is twofold: one is to propose a novel individual feature extraction method; another is to develop an ensemble method to improve prediction performance using comprehensive information represented in the form of high dimensional feature vector obtained by 11 feature extraction methods. Methodology/Principal Findings A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It only considers those feature extraction methods based on amino acid classifications and physicochemical properties. In order to speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: Lei dataset, multi-localization dataset, SNL9 dataset and a new independent dataset. The overall accuracy of prediction for 6 localizations on Lei dataset is 75.2% and that for 9 localizations on SNL9 dataset is 72.1% in the leave-one-out cross validation, 71.7% for the multi-localization dataset and 69.8% for the new independent dataset, respectively. Comparisons with those existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis. Conclusions It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method. 
It is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.
Abstract:
Recognizing similarities and deriving relationships among protein molecules is a fundamental requirement in present-day biology. Similarities can be present at various levels and can be detected through comparison of protein sequences or their structural folds. In some cases, similarities that are obscure at these levels may be present only in the substructures at the binding sites. Inferring functional similarities between protein molecules by comparing their binding sites is still largely exploratory and not yet a routine protocol. One of the main reasons for this is the limited choice of analytical tools that can compare binding sites with high sensitivity. To benefit from the enormous amount of structural data that is rapidly accumulating, it is essential to have high-throughput tools that enable large-scale binding site comparison. Results: Here we present a new algorithm, PocketMatch, for comparison of binding sites in a frame-invariant manner. Each binding site is represented by 90 lists of sorted distances capturing the shape and chemical nature of the site. The sorted arrays are then aligned using an incremental alignment method and scored to obtain PMScores for pairs of sites. A comprehensive sensitivity analysis and an extensive validation of the algorithm have been carried out. A comparison with other site matching algorithms is also presented. Perturbation studies, in which the geometry of a given site was retained but the residue types were changed randomly, indicated that chance similarities were virtually non-existent. Our analysis also demonstrates that shape information alone is insufficient to discriminate between diverse binding sites unless it is combined with the chemical nature of the amino acids. Conclusion: A new algorithm has been developed to compare binding sites in an accurate, efficient and high-throughput manner.
Though the representation used is conceptually simplistic, we demonstrate that, along with the new alignment strategy used, it is sufficient to enable binding comparison with high sensitivity. Novel methodology has also been presented for validating the algorithm for accuracy and sensitivity with respect to geometry and chemical nature of the site. The method is also fast and takes about 1/250th of a second for one comparison on a single processor. A parallel version on BlueGene has also been implemented.
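The idea of aligning sorted distance lists can be illustrated with a minimal sketch. This is not the PocketMatch implementation: the tolerance `tol`, the dictionary-of-lists layout, and the normalization by the longer list are assumptions made purely for illustration, since the abstract gives neither the actual thresholds nor the PMScore definition.

```python
def count_matches(a, b, tol=0.5):
    """Count paired distances in two ascending lists using a two-pointer sweep.

    `tol` is a hypothetical tolerance; it is not taken from the abstract.
    """
    i = j = matches = 0
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= tol:
            # Close enough: pair the two distances and advance both lists.
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # a[i] is too small to match anything left in b
        else:
            j += 1  # b[j] is too small to match anything left in a
    return matches


def similarity(site_a, site_b, tol=0.5):
    """Score two sites, each given as {list_label: [distances]} (assumed layout)."""
    matched = total = 0
    for key in set(site_a) | set(site_b):
        a = sorted(site_a.get(key, []))
        b = sorted(site_b.get(key, []))
        matched += count_matches(a, b, tol)
        total += max(len(a), len(b))
    return matched / total if total else 0.0
```

Because only sorted inter-point distances are compared, a score of this kind does not depend on the coordinate frame of either site, which is the property the abstract describes as frame invariance.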
Abstract:
Amino acid sequences are known to constantly mutate and diverge unless there is a limiting condition that makes such a change deleterious. However, closer examination of the sequence and structure reveals that a few large, cryptic repeats are nevertheless sequentially conserved. This leads to the question of why only certain repeats are conserved at the sequence level. It would be interesting to find out if these sequences maintain their conservation at the three-dimensional structure level. They can play an active role in protein and nucleotide stability, thus not only ensuring proper functioning but also potentiating malfunction and disease. Therefore, insights into any aspect of the repeats - be it structure, function or evolution - would prove to be of some importance. This study aims to address the relationship between protein sequence and its three-dimensional structure, by examining if large cryptic sequence repeats have the same structure.
Abstract:
The situation normally encountered in the high-resolution refinement of protein structures is one in which the inaccurate positions of P out of a total of N atoms are known whereas those of the remaining atoms are unknown. Fourier maps with coefficients $(F_N - F'_P)\exp(i\alpha'_P)$ and $(mF_N - nF'_P)\exp(i\alpha'_P)$, where $F_N$ is the observed structure factor and $F'_P$ and $\alpha'_P$ are the magnitude and the phase angle of the calculated structure factor corresponding to the inaccurate atomic positions, are often used to correct the positions of the P atoms and to determine those of the Q unknown atoms. A general theoretical approach is presented to elucidate the effect of errors in the positions of the known atoms on the corrected positions of the known atoms and the positions of the unknown atoms derived from such maps. The theory also leads to the optimal choice of parameters used in the different syntheses. When the errors in the positions of the input atoms are systematic, their effects are not taken care of automatically by the syntheses.
Abstract:
Comparative studies on protein structures form an integral part of protein crystallography. Here, a fast method of comparing protein structures is presented. Protein structures are represented as a set of secondary structural elements. The method also provides information regarding preferred packing arrangements and evolutionary dynamics of secondary structural elements. This information is not easily obtained from previous methods. In contrast to those methods, the present one can be used only for proteins with some secondary structure. The method is illustrated with globin folds, cytochromes and dehydrogenases as examples.
Abstract:
Geometric and structural constraints greatly restrict the selection of folds adopted by protein backbones, and yet, folded proteins show an astounding diversity in functionality. For structure to have any bearing on function, it is thus imperative that, apart from the protein backbone, other tunable degrees of freedom be accountable. Here, we focus on side-chain interactions, which non-covalently link amino acids in folded proteins to form a network structure. At a coarse-grained level, we show that the network conforms remarkably well to realizations of random graphs and displays associated percolation behavior. Thus, within the rigid framework of the protein backbone that restricts the structure space, the side-chain interactions exhibit an element of randomness, which accounts for the functional flexibility and diversity shown by proteins. However, at a finer level, the network exhibits deviations from these random graphs which, as we demonstrate for a few specific examples, reflect the intrinsic uniqueness in the structure and stability, and perhaps specificity in the functioning of biological proteins.
Abstract:
MIPS (metal interactions in protein structures) is a database of metals in the three-dimensional macromolecular structures available in the Protein Data Bank. Bound metal ions in proteins have both catalytic and structural functions. The proposed database serves as an open resource for the analysis and visualization of all metals and their interactions with macromolecular (protein and nucleic acid) structures. MIPS can be searched via a user-friendly interface, and the interactions between metals and protein molecules, and the geometric parameters, can be viewed in both textual and graphical format using the freely available graphics plug-in Jmol. MIPS is updated regularly by means of programmed scripts that find metal-containing proteins among newly released protein structures. The database is useful for studying the properties of coordination between metals and protein molecules. It also helps to improve understanding of the relationship between macromolecular structure and function. This database is intended to serve the scientific community working in the areas of chemical and structural biology, and is freely available to all users, around the clock, at http://dicsoft2.physics.iisc.ernet.in/mips/.
Abstract:
Ion pairs contribute to several functions including the activity of catalytic triads, fusion of viral membranes, stability in thermophilic proteins and solvent-protein interactions. Furthermore, they have the ability to affect the stability of protein structures and are also a part of the forces that act to hold monomers together. This paper deals with the possible ion pair combinations and networks in 25% and 90% non-redundant protein chains. Different types of ion pairs present in various secondary structural elements are analysed. The ion pairs existing between different subunits of multisubunit protein structures are also computed and the results of various analyses are presented in detail. The protein structures used in the analysis were solved by X-ray crystallography at a resolution of 1.5 angstrom or better and with an R-factor of 20% or better. This study can, therefore, be useful for analyses of many protein functions. It also provides insights into the architecture of protein structure.
Abstract:
This study views each protein structure as a network of noncovalent connections between amino acid side chains. Each amino acid in a protein structure is a node, and the strength of the noncovalent interactions between two amino acids is evaluated for edge determination. The protein structure graphs (PSGs) for 232 proteins have been constructed as a function of the cutoff of the amino acid interaction strength at a few carefully chosen values. Analysis of such PSGs constructed on the basis of edge weights has shown the following: 1) The PSGs exhibit a complex topological network behavior, which is dependent on the interaction cutoff chosen for PSG construction. 2) A transition is observed at a critical interaction cutoff, in all the proteins, as monitored by the size of the largest cluster (giant component) in the graph. Notably, this transition occurs within a narrow range of interaction cutoff for all the proteins, irrespective of the size or the fold topology. 3) The amino acid preferences to be highly connected (hub frequency) have been evaluated as a function of the interaction cutoff. We observe that the aromatic residues along with arginine, histidine, and methionine act as strong hubs at high interaction cutoffs, whereas the hydrophobic leucine and isoleucine residues get added to these hubs at low interaction cutoffs, forming weak hubs. The hubs identified are found to play a role in bringing together different secondary structural elements in the tertiary structure of the proteins. They are also found to contribute to the additional stability of the thermophilic proteins when compared to their mesophilic counterparts and hence could be crucial for the folding and stability of the unique three-dimensional structure of proteins. Based on these results, we also predict a few residues in the thermophilic and mesophilic proteins that can be mutated to alter their thermal stability.
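The giant-component transition described above can be monitored with a short sketch. It assumes residues are indexed 0 to n-1 and that an edge is kept when its interaction strength is at least the cutoff. The function name, the edge-list layout, and the example weights are hypothetical; the abstract does not specify how interaction strength is computed.

```python
from collections import Counter


def largest_cluster_size(n_nodes, weighted_edges, cutoff):
    """Size of the largest connected component after thresholding edges.

    weighted_edges: iterable of (u, v, strength) tuples (assumed layout).
    An edge is kept only if strength >= cutoff.
    """
    parent = list(range(n_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, strength in weighted_edges:
        if strength >= cutoff:
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[root_u] = root_v

    sizes = Counter(find(i) for i in range(n_nodes))
    return max(sizes.values()) if sizes else 0


# Made-up toy graph: scanning the cutoff shows the largest cluster shrinking.
toy_edges = [(0, 1, 8.0), (1, 2, 5.0), (2, 3, 2.0), (3, 4, 9.0)]
profile = [largest_cluster_size(5, toy_edges, c) for c in (1, 4, 6, 10)]
```

Plotting such a profile against the cutoff for a real protein graph is one way to locate the narrow range in which the largest cluster collapses.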
Abstract:
An analysis of the nature and distribution of disallowed Ramachandran conformations of amino acid residues observed in high resolution protein crystal structures has been carried out. A data set consisting of 110 high resolution, non-homologous, protein crystal structures from the Brookhaven Protein Data Bank was examined. The data set consisted of a total of 18,708 non-Gly residues, which were characterized on the basis of their backbone dihedral angles (φ, ψ). Residues falling outside the defined “broad allowed limits” on the Ramachandran map were chosen and the reported B-factor value of the α-carbon atom was used to further select well defined disallowed conformations. The conformations of the selected 66 disallowed residues clustered in distinct regions of the Ramachandran map indicating that specific φ, ψ angle distortions are preferred under compulsions imposed by local constraints. The distribution of various amino acid residues in the disallowed residue data set showed a predominance of small polar/charged residues, with bulky hydrophobic residues being infrequent. As a further check, for all the 66 cases non-hydrogen van der Waals short contacts in the protein structures were evaluated and compared with the ideal “Ala-dipeptide” constructed using disallowed dihedral angle (φ, ψ) values. The analysis reveals that short contacts are eliminated in most cases by local distortions of bond angles. An analysis of the conformation of the identified disallowed residues in related protein structures reveals instances of conservation of unusual stereochemistry.
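As a minimal sketch of how a backbone dihedral angle such as φ or ψ can be obtained from atomic coordinates, the function below computes the signed torsion defined by four points. It is the standard atan2 formulation, not code from the study, and the helper names are illustrative. For residue i, φ uses the atoms C(i-1), N(i), CA(i), C(i), and ψ uses N(i), CA(i), C(i), N(i+1). The "broad allowed limits" used by the authors are not given in the abstract, so no region test is included.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (x, y, z)."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)  # central bond, the rotation axis
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical usage with coordinates read from a structure file:
# phi = dihedral(c_prev, n, ca, c)
# psi = dihedral(n, ca, c, n_next)
```

Using atan2 rather than acos keeps the sign of the angle and remains numerically stable near 0 and 180 degrees.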
Abstract:
Protein structure validation is an important step in computational modeling and structure determination. Stereochemical assessment of protein structures examines internal parameters such as bond lengths and Ramachandran (phi, psi) angles. Gross structure prediction methods such as the inverse folding procedure, and structure determination especially at low resolution, can sometimes give rise to models that are incorrect due to assignment of misfolds or mistracing of electron density maps. Such errors are not reflected as strain in internal parameters. HARMONY is a procedure that examines the compatibility between the sequence and the structure of a protein by assigning scores to individual residues and their amino acid exchange patterns after considering their local environments. Local environments are described by the backbone conformation, solvent accessibility and hydrogen bonding patterns. We are now providing HARMONY through a web server such that users can submit their protein structure files and, if required, the alignment of homologous sequences. Scores are mapped onto the structure for subsequent examination, which is also useful for recognizing regions of possible local errors in protein structures. The HARMONY server is located at http://caps.ncbs.res.in/harmony/
Abstract:
The interaction of the protein atoms with the surrounding water oxygen atoms has been computed for 392 protein chains from 369 protein structures belonging to 90% non-homologous high resolution (<= 1.5 angstrom) protein structures with a crystallographic R-factor <= 20%. The percentage composition of the polar atoms is found to be 36.3%. An average of 82.55% of water oxygen atoms are found to be in the primary hydration shell and 15.12% in the secondary hydration shell. The average percentages of interactions of water oxygen atoms with the polar atoms of the main chain and side chain are 54% and 46%, respectively. The acidic residues, aspartate and glutamate, interact with the water oxygen atoms more than the other residues do.
Abstract:
Ligand-induced conformational changes in proteins are of immense functional relevance. It is a major challenge to elucidate the network of amino acids that are responsible for the percolation of ligand-induced conformational changes to distal regions in the protein from a global perspective. Functionally important subtle conformational changes (at the level of side-chain noncovalent interactions) upon ligand binding or as a result of environmental variations are also elusive in conventional studies such as those using root-mean-square deviations (r.m.s.d.s). In this article, the network representation of protein structures and their analyses provides an efficient tool to capture these variations (both drastic and subtle) in atomistic detail in a global milieu. A generalized graph theoretical metric, using network parameters such as cliques and/or communities, is used to determine similarities or differences between structures in a rigorous manner. The ligand-induced global rewiring in the protein structures is also quantified in terms of network parameters. Thus, a judicious use of graph theory in the context of protein structures can provide meaningful insights into global structural reorganizations upon perturbation and can also be helpful for rigorous structural comparison. Data sets for the present study include high-resolution crystal structures of serine proteases from the S1A family and are probed to quantify the ligand-induced subtle structural variations.
Abstract:
It is well known that water molecules play an indispensable role in the structure and function of biological macromolecules. The water-mediated ionic interactions between the charged residues provide stability and plasticity and in turn address the function of the protein structures. Thus, this study specifically addresses the number of possible water-mediated ionic interactions, their occurrence, distribution and nature found in 90% non-redundant protein chains. Further, it provides a statistical report of the different charged residue pairs whose interactions are mediated by surface or buried water molecules. It also discusses the contribution of these interactions to stabilizing various secondary structural elements of the protein. Thus, the present study shows the ubiquitous nature of the interactions that impart plasticity and flexibility to a protein molecule.