65 results for Structure Prediction Servers
at Indian Institute of Science - Bangalore - India
Abstract:
Sequence-structure correlation studies are important in deciphering the relationships between various structural aspects, which may shed light on the protein-folding problem. The first step of this process is the prediction of secondary structure for a protein sequence of unknown three-dimensional structure. To this end, a web server has been created to predict the consensus secondary structure using well known algorithms from the literature. Furthermore, the server allows users to see the occurrence of predicted secondary structural elements in other structure and sequence databases and to visualize predicted helices as a helical wheel plot. The web server is accessible at http://bioserver1.physics.iisc.ernet.in/cssp/.
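As an illustration of what a consensus prediction can look like, the following is a minimal sketch that takes a per-residue majority vote over the outputs of several predictors. The voting rule, the tie-break to coil, the H/E/C alphabet and the example strings are all assumptions made for this sketch; the abstract does not say how the server combines its algorithms.

```python
from collections import Counter


def consensus_secondary_structure(predictions):
    """Return the per-residue majority vote over equal-length prediction strings.

    Each string uses H (helix), E (strand) and C (coil). Ties are resolved
    as coil, which is an arbitrary choice made for this sketch.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    for column in zip(*predictions):
        counts = Counter(column).most_common()
        top_state, top_count = counts[0]
        # A tie between the two most frequent states falls back to coil.
        if len(counts) > 1 and counts[1][1] == top_count:
            top_state = "C"
        consensus.append(top_state)
    return "".join(consensus)


# Hypothetical outputs of three predictors for a nine-residue sequence.
example_predictions = ["HHHHCCEEE", "HHHCCCEEC", "CHHHCEEEE"]
example_consensus = consensus_secondary_structure(example_predictions)
```

A weighted vote, in which each predictor contributes according to its known accuracy, would be a natural variation on the same idea.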
Abstract:
The notion of structure is central to the subject of chemistry. This review traces the development of the idea of crystal structure since the time when a crystal structure could be determined from a three-dimensional diffraction pattern and assesses the feasibility of computationally predicting an unknown crystal structure of a given molecule. Crystal structure prediction is of considerable fundamental and applied importance, and its successful execution is by no means a solved problem. The ease of crystal structure determination today has resulted in the availability of large numbers of crystal structures of higher-energy polymorphs and pseudopolymorphs. These structural libraries lead to the concept of a crystal structure landscape. A crystal structure of a compound may accordingly be taken as a data point in such a landscape.
Abstract:
Acta Crystallographica Section A: Foundations of Crystallography covers theoretical and fundamental aspects of the structure of matter. The journal is the prime forum for research in diffraction physics and the theory of crystallographic structure determination by diffraction methods using X-rays, neutrons and electrons. The structures include periodic and aperiodic crystals, and non-periodic disordered materials, and the corresponding Bragg, satellite and diffuse scattering, thermal motion and symmetry aspects. Spatial resolutions range from the subatomic domain in charge-density studies to nanodimensional imperfections such as dislocations and twin walls. The chemistry encompasses metals, alloys, and inorganic, organic and biological materials. Structure prediction and properties such as the theory of phase transformations are also covered.
Abstract:
The rapidly growing structure databases enhance the probability of finding identical sequences sharing structural similarity. Structure prediction methods are being used extensively to bridge the gap between known protein sequences and solved structures, which is essential to understanding the specific biochemical and cellular functions of proteins. In this work, we plan to study the ambiguity in sequence-structure relationships and examine whether sequentially identical peptide fragments adopt similar three-dimensional structures. Fragments of varying lengths (five to ten residues) were used to observe the behavior of sequences and their three-dimensional structures. The STAMP program was used to superpose the three-dimensional structures, and two parameters (the Sequence Structure Similarity Score (Sc) and the Root Mean Square Deviation value) were employed to classify them into three categories: similar, intermediate and dissimilar structures. Furthermore, the same approach was applied to all the three-dimensional protein structures solved for two organisms, Mycobacterium tuberculosis and Plasmodium falciparum, to validate our results.
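The first step of such a study, collecting sequentially identical fragments from different chains, can be sketched as below. The function name, the chain names and the sequences are hypothetical, and the sketch covers only the sequence side; superposition and scoring (done with STAMP in the abstract) are not reproduced here.

```python
from collections import defaultdict


def shared_fragments(sequences, length):
    """Map each fragment of the given length to the (name, start) pairs where it occurs.

    Only fragments found in at least two different sequences are returned.
    Start positions are zero-based.
    """
    occurrences = defaultdict(list)
    for name, sequence in sequences.items():
        for start in range(len(sequence) - length + 1):
            occurrences[sequence[start:start + length]].append((name, start))
    return {
        fragment: hits
        for fragment, hits in occurrences.items()
        if len({name for name, _ in hits}) > 1
    }


# Hypothetical chains; real input would come from a structure database.
chains = {
    "chainA": "MKVLAAGIVG",
    "chainB": "TTVLAAGPQR",
}
pentapeptides = shared_fragments(chains, 5)
```

Running the same function for lengths five through ten would give the candidate fragment pairs whose coordinates could then be superposed and compared.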
Abstract:
The availability of the genome sequence of Mycobacterium tuberculosis H37Rv has encouraged determination of large numbers of protein structures and detailed definition of the biological information encoded therein; yet, the functions of many proteins in M. tuberculosis remain unknown. The emergence of multidrug-resistant strains makes it a priority to exploit recent advances in homology recognition and structure prediction to re-analyse its gene products. Here we report the structural and functional characterization of gene products encoded in the M. tuberculosis genome, with the help of sensitive profile-based remote homology search and fold recognition algorithms, resulting in an enhanced annotation of the proteome in which 95% of the M. tuberculosis proteins were identified wholly or partly with information on structure or function. New information includes association of 244 proteins with 205 domain families and a separate set of new associations of folds to 64 proteins. Extending structural information across uncharacterized protein families represented in the M. tuberculosis proteome, by determining superfamily relationships between families of known and unknown structures, has contributed to an enhancement in the knowledge of structural content. In retrospect, such superfamily relationships have facilitated recognition of probable structure and/or function for several uncharacterized protein families, eventually aiding recognition of probable functions for homologous proteins corresponding to such families. There are 183 gene products unique to mycobacteria for which no functions could be identified. Of these, 18 were determined to be M. tuberculosis specific. Such pathogen-specific proteins are speculated to harbour virulence factors required for pathogenesis. A re-annotated proteome of M. tuberculosis, with greater completeness of annotated proteins and domain assigned regions, provides a valuable basis for experimental endeavours designed to obtain a better understanding of pathogenesis and to accelerate the process of drug target discovery. (C) 2014 Elsevier Ltd. All rights reserved.
Pi-turns in proteins and peptides: Classification, conformation, occurrence, hydration and sequence.
Abstract:
The i + 5 → i hydrogen bonded turn conformation (pi-turn) with the fifth residue adopting alpha L conformation is frequently found at the C-terminus of helices in proteins and hence is speculated to be a "helix termination signal." An analysis of the occurrence of i + 5 → i hydrogen bonded turn conformation at any general position in proteins (not specifically at the helix C-terminus), using coordinates of 228 protein crystal structures determined by X-ray crystallography to better than 2.5 Å resolution, is reported in this paper. Of 486 detected pi-turn conformations, 367 have the (i + 4)th residue in alpha L conformation, generally occurring at the C-terminus of alpha-helices, consistent with previous observations. However, a significant number (111) of pi-turn conformations occur with the (i + 4)th residue in alpha R conformation also, generally occurring in alpha-helices as distortions either at the termini or at the middle, a novel finding. These two sets of pi-turn conformations are referred to by the names pi alpha L and pi alpha R-turns, respectively, depending upon whether the (i + 4)th residue adopts alpha L or alpha R conformations. Four pi-turns, named pi alpha L'-turns, were noticed to be mirror images of pi alpha L-turns, and four more pi-turns, which have the (i + 4)th residue in beta conformation and are denoted as pi beta-turns, occur as a part of a hairpin bend connecting twisted beta-strands. Consecutive pi-turns occur, but only with pi alpha R-turns. The preference for amino acid residues is different in pi alpha L and pi alpha R-turns. However, both show a preference for Pro after the C-termini. Hydrophilic residues are preferred at positions i + 1, i + 2, and i + 3 of pi alpha L-turns, whereas positions i and i + 5 prefer hydrophobic residues. Residue i + 4 in pi alpha L-turns is mainly Gly and less often Asn. Although pi alpha R-turns generally occur as distortions in helices, their amino acid preference is different from that of helices. Poor helix formers, such as His, Tyr, and Asn, also were found to be preferred for pi alpha R-turns, whereas good helix former Ala is not preferred. pi-Turns in peptides provide a picture of the pi-turn at atomic resolution. Only nine peptide-based pi-turns are reported so far, and all of them belong to pi alpha L-turn type with an achiral residue in position i + 4. The results are of importance for structure prediction, modeling, and de novo design of proteins.
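The naming scheme above depends on the backbone conformation of residue i + 4, so a classifier can be sketched as a lookup on that residue's (phi, psi) angles. The rectangular region boundaries below are rough assumptions chosen for illustration, not the definitions used in the paper, and the mirror-image pi alpha L'-turns are not handled.

```python
def classify_pi_turn(phi, psi):
    """Label a pi-turn from the (phi, psi) angles, in degrees, of residue i + 4.

    The region boundaries are assumed values for this sketch only.
    """
    # Right-handed helical region: negative phi, psi around -45.
    if -160.0 <= phi <= -20.0 and -120.0 <= psi <= 50.0:
        return "pi alpha R"
    # Left-handed helical region: positive phi, psi around +45.
    if 20.0 <= phi <= 125.0 and -45.0 <= psi <= 90.0:
        return "pi alpha L"
    # Extended (beta) region: negative phi, large positive psi.
    if -180.0 <= phi <= -45.0 and 90.0 <= psi <= 180.0:
        return "pi beta"
    return "unclassified"
```

A fuller treatment would first confirm the i + 5 → i hydrogen bond from atomic coordinates and only then apply the conformational label.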
Abstract:
We present a new computationally efficient method for large-scale polypeptide folding using coarse-grained elastic networks and gradient-based continuous optimization techniques. The folding is governed by minimization of energy based on Miyazawa–Jernigan contact potentials. Using this method we are able to substantially reduce the computation time on ordinary desktop computers for simulation of polypeptide folding starting from a fully unfolded state. We compare our results with available native state structures from the Protein Data Bank (PDB) for a few de novo proteins and two natural proteins, Ubiquitin and Lysozyme. Based on our simulations we are able to draw the energy landscape for a small de novo protein, Chignolin. We also use two well-known protein structure prediction software packages, MODELLER and GROMACS, to compare our results. In the end, we show how a modification of the normal elastic network model can lead to higher accuracy and lower time required for simulation.
Abstract:
The TCP transcription factors control multiple developmental traits in diverse plant species. Members of this family share an ~60-residue-long TCP domain that binds to DNA. The TCP domain is predicted to form a basic helix-loop-helix (bHLH) structure but shares little sequence similarity with the canonical bHLH domain. This classifies the TCP domain as a novel class of DNA binding domain specific to the plant kingdom. Little is known about how the TCP domain interacts with its target DNA. We report biochemical characterization and DNA binding properties of a TCP member in Arabidopsis thaliana, TCP4. We have shown that the 58-residue domain of TCP4 is essential and sufficient for binding to DNA and possesses DNA binding parameters comparable to canonical bHLH proteins. Using a yeast-based random mutagenesis screen and site-directed mutants, we identified the residues important for DNA binding and dimer formation. Mutants defective in binding and dimerization failed to rescue the phenotype of an Arabidopsis line lacking the endogenous TCP4 activity. By combining structure prediction, functional characterization of the mutants, and molecular modeling, we suggest a possible DNA binding mechanism for this class of transcription factors.
Abstract:
Protein structure validation is an important step in computational modeling and structure determination. Stereochemical assessment of protein structures examines internal parameters such as bond lengths and Ramachandran (phi, psi) angles. Gross structure prediction methods such as the inverse folding procedure, and structure determination especially at low resolution, can sometimes give rise to models that are incorrect due to assignment of misfolds or mistracing of electron density maps. Such errors are not reflected as strain in internal parameters. HARMONY is a procedure that examines the compatibility between the sequence and the structure of a protein by assigning scores to individual residues and their amino acid exchange patterns after considering their local environments. Local environments are described by the backbone conformation, solvent accessibility and hydrogen bonding patterns. We are now providing HARMONY through a web server such that users can submit their protein structure files and, if required, the alignment of homologous sequences. Scores are mapped on the structure for subsequent examination, which is also useful for recognizing regions of possible local errors in protein structures. The HARMONY server is located at http://caps.ncbs.res.in/harmony/
Abstract:
A successful protein-protein docking study culminates in identification of decoys at top ranks with near-native quaternary structures. However, this task remains enigmatic because no generalized scoring functions exist that effectively infer decoys according to the similarity to near-native quaternary structures. Difficulties arise because of the highly irregular nature of the protein surface and the significant variation of the nonbonding and solvation energies based on the chemical composition of the protein-protein interface. In this work, we describe a novel method combining an interface-size filter, a regression model for geometric compatibility (based on two correlated surface and packing parameters), and normalized interaction energy (calculated from correlated nonbonded and solvation energies), to effectively rank decoys from a set of 10,000 decoys. Tests on 30 unbound binary protein-protein complexes show that in 16 cases we can identify at least one decoy in the top three ranks having ≤ 10 Å backbone root mean square deviation from the true binding geometry. Comparisons with other state-of-the-art methods confirm the improved ranking power of our method without the use of any experiment-guided restraints, evolutionary information, statistical propensities, or modified interaction energy equations. Tests on 118 less-difficult bound binary protein-protein complexes with ≤ 35% sequence redundancy at the interface showed that in 77% of cases, at least 1 in 10,000 decoys was identified with ≤ 5 Å backbone root mean square deviation from the true geometry at first rank. The work will promote the use of new concepts where correlations among parameters provide more robust scoring models. It will facilitate studies involving molecular interactions, including modeling of large macromolecular assemblies and protein structure prediction. (C) 2010 Wiley Periodicals, Inc. J Comput Chem 32: 787-796, 2011.
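The success criterion quoted above (a near-native decoy within the top ranks) can be checked with a few lines of code. This is a minimal sketch: the function name, the (score, RMSD) tuples and the assumption that a lower score means a better rank are all illustrative, and the scoring model itself is not reproduced.

```python
def has_near_native_in_top(decoys, top_n=3, rmsd_cutoff=10.0):
    """Return True if any of the top_n ranked decoys is within rmsd_cutoff.

    decoys is a list of (score, rmsd) pairs, with RMSD in angstroms.
    Lower scores are assumed to rank better in this sketch.
    """
    ranked = sorted(decoys, key=lambda decoy: decoy[0])
    return any(rmsd <= rmsd_cutoff for _, rmsd in ranked[:top_n])


# Hypothetical decoy set for one complex: (score, backbone RMSD).
example_decoys = [(-12.0, 18.5), (-15.2, 25.1), (-14.1, 7.9), (-9.3, 3.2)]
hit_in_top_three = has_near_native_in_top(example_decoys, top_n=3, rmsd_cutoff=10.0)
hit_at_first_rank = has_near_native_in_top(example_decoys, top_n=1, rmsd_cutoff=5.0)
```

Counting how many complexes in a benchmark return True gives success rates of the kind reported in the abstract.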
Abstract:
Repeats are two or more contiguous segments of amino acid residues that are believed to have arisen as a result of intragenic duplication, recombination and mutation events. These repeats can be utilized for protein structure prediction and can provide insights into the protein evolution and phylogenetic relationship. Therefore, to aid structural biologists and phylogeneticists in their research, a computing resource (a web server and a database), Repeats in Protein Sequences (RPS), has been created. Using RPS, users can obtain useful information regarding identical, similar and distant repeats (of varying lengths) in protein sequences. In addition, users can check the frequency of occurrence of the repeats in sequence databases such as the Genome Database, PIR and SWISS-PROT and among the protein sequences available in the Protein Data Bank archive. Furthermore, users can view the three-dimensional structure of the repeats using the Java visualization plug-in Jmol. The proposed computing resource can be accessed over the World Wide Web at http://bioserver1.physics.iisc.ernet.in/rps/.
Abstract:
A major bottleneck in protein structure prediction is the selection of correct models from a pool of decoys. Relative activities of ~1,200 individual single-site mutants in a saturation library of the bacterial toxin CcdB were estimated by determining their relative populations using deep sequencing. This phenotypic information was used to define an empirical score for each residue (RankScore), which correlated with the residue depth, and to identify active-site residues. Using these correlations, ~98% of correct models of CcdB (RMSD ≤ 4 Å) were identified from a large set of decoys. The model-discrimination methodology was further validated on eleven different monomeric proteins using simulated RankScore values. The methodology is also a rapid, accurate way to obtain relative activities of each mutant in a large pool and derive sequence-structure-function relationships without protein isolation or characterization. It can be applied to any system in which mutational effects can be monitored by a phenotypic readout.
Abstract:
Convergence of the vast sequence space of proteins into a highly restricted fold/conformational space suggests a simple yet unique underlying mechanism of protein folding that has been the subject of much debate in the last several decades. One of the major challenges related to the understanding of protein folding or in silico protein structure prediction is the discrimination of non-native structures/decoys from the native structure. Applications of knowledge-based potentials to attain this goal have been extensively reported in the literature. Also, scoring functions based on accessible surface area and amino acid neighbourhood considerations have been used in discriminating the decoys from native structures. In this article, we have explored the potential of protein structure network (PSN) parameters to validate the native proteins against a large number of decoy structures generated by diverse methods. We are guided by two principles: (a) the PSNs capture the local properties from a global perspective and (b) inclusion of non-covalent interactions, at all-atom level, including the side-chain atoms, in the network construction accommodates the sequence dependent features. Several network parameters, such as the size of the largest cluster, community size, and clustering coefficient, are evaluated and scored on the basis of the rank of the native structures and the Z-scores. The network analysis of decoy structures highlights the importance of the global properties contributing to the uniqueness of native structures. The analysis also shows that the network parameters can be used as metrics to identify the native structures and filter out non-native structures/decoys in a large number of data-sets; the approach thus also has the potential to be used in the protein 'structure prediction' problem.
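Two of the network parameters named above, the size of the largest cluster and the clustering coefficient, can be computed from an edge list as in the following sketch. The edge list is hypothetical; in a protein structure network each edge would stand for a non-covalent interaction between two residues above some strength cutoff, and how that cutoff is chosen is not covered here.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def largest_cluster_size(adjacency):
    """Return the node count of the largest connected component."""
    seen, best = set(), 0
    for node in adjacency:
        if node in seen:
            continue
        stack, size = [node], 0
        seen.add(node)
        while stack:
            current = stack.pop()
            size += 1
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


def average_clustering(adjacency):
    """Return the mean local clustering coefficient over all nodes."""
    if not adjacency:
        return 0.0
    total = 0.0
    for neighbours in adjacency.values():
        k = len(neighbours)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute zero
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adjacency[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adjacency)


# Hypothetical residue-residue contacts (residue numbers as node labels).
example_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (5, 6)]
example_network = build_adjacency(example_edges)
```

Comparing such values for a native structure against those of its decoys is one way to obtain the ranks and Z-scores mentioned in the abstract.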
Abstract:
Protein structure space is believed to consist of a finite set of discrete folds, unlike the protein sequence space which is astronomically large, indicating that proteins from the available sequence space are likely to adopt one of the many folds already observed. In spite of extensive sequence-structure correlation data, protein structure prediction still remains an open question, with researchers having tried different approaches (experimental as well as computational). One of the challenges of protein structure prediction is to identify the native protein structures from a milieu of decoys/models. In this work, a rigorous investigation of Protein Structure Networks (PSNs) has been performed to detect native structures from decoys/models. Ninety-four parameters obtained from network studies have been optimally combined with Support Vector Machines (SVM) to derive a general metric to distinguish decoys/models from the native protein structures with an accuracy of 94.11%. Recently, we showed for the first time in the literature that PSN has the capability to distinguish native proteins from decoys. A major difference between the present work and the previous study is the exploration of the transition profiles at different strengths of non-covalent interactions, and SVM has indeed identified this as an important parameter. Additionally, the SVM-trained algorithm is also applied to the recent CASP10 predicted models. The novelty of the network approach is that it is based on general network properties of native protein structures and that a given model can be assessed independent of any reference structure. Thus, the approach presented in this paper can be valuable in validating the predicted structures. A web-server has been developed for this purpose and is freely available at http://vishgraph.mbu.iisc.ernet.in/GraProStr/PSN-QA.html.
Abstract:
A regular secondary structure is described by a well-defined set of values for the backbone dihedral angles (phi, psi and omega) in a polypeptide chain. However, in real protein structures small local variations give rise to distortions from the ideal structures, which can lead to considerable variation in higher order organization. Protein structure analysis and accurate assignment of various structural elements, especially their termini, are an important first step in protein structure prediction and design. Various algorithms are available for assigning secondary structure elements in proteins but some lacunae still exist. In this study, results of a recently developed in-house program, ASSP, have been compared with those from STRIDE in the identification of alpha-helical regions in both globular and membrane proteins. It is found that, while a combination of hydrogen bond patterns and backbone torsional angles (phi-psi) is generally used to define secondary structure elements, the geometry of the C-alpha atom trace by itself is sufficient to define the parameters of helical structures in proteins. It is also possible to differentiate the various helical structures by their C-alpha trace and identify the deviations occurring both at mid-positions and at the termini of alpha-helices, which often lead to the occurrence of 3(10) and pi-helical fragments in both globular and membrane proteins.
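One geometric quantity that can be read directly from a C-alpha trace is the virtual dihedral angle defined by four consecutive C-alpha positions. The sketch below computes it with the standard four-point dihedral formula; it is an illustration only and is not the ASSP algorithm, whose parameters the abstract does not give. The function names and test coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def virtual_dihedral(p0, p1, p2, p3):
    """Return the signed dihedral angle, in degrees, defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer vectors onto the plane perpendicular to the central axis.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))
```

Sliding this function along a chain, four C-alpha atoms at a time, yields a per-residue profile in which stretches of nearly constant angle suggest regular helical or extended segments.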