13 results for Structure Prediction Servers
in National Center for Biotechnology Information - NCBI
Abstract:
In this study, we estimate the statistical significance of structure prediction by threading. We introduce a single parameter ɛ that serves as a universal measure determining the probability that the best alignment is indeed a native-like analog. Parameter ɛ takes into account both length and composition of the query sequence and the number of decoys in threading simulation. It can be computed directly from the query sequence and potential of interactions, eliminating the need for sequence reshuffling and realignment. Although our theoretical analysis is general, here we compare its predictions with the results of gapless threading. Finally we estimate the number of decoys from which the native structure can be found by existing potentials of interactions. We discuss how this analysis can be extended to determine the optimal gap penalties for any sequence-structure alignment (threading) method, thus optimizing it to maximum possible performance.
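The gapless threading mentioned in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical two-letter (H/P) contact potential and a toy contact list, neither of which comes from the study, and it reports a conventional Z-score against the other placements rather than the parameter ɛ, whose definition the abstract does not give.

```python
from statistics import mean, pstdev

# Hypothetical contact energies for a two-letter alphabet
# (H = hydrophobic, P = polar); real potentials use a 20 x 20 table.
CONTACT_ENERGY = {
    ("H", "H"): -1.0,
    ("H", "P"): 0.0,
    ("P", "H"): 0.0,
    ("P", "P"): 0.2,
}


def placement_energy(query, contacts, offset):
    """Energy of the query placed, without gaps, at `offset` on the template."""
    end = offset + len(query)
    total = 0.0
    for i, j in contacts:
        # Only contacts with both template positions covered by the query count.
        if offset <= i < end and offset <= j < end:
            total += CONTACT_ENERGY[(query[i - offset], query[j - offset])]
    return total


def gapless_threading(query, template_length, contacts):
    """Score every gapless placement; return (energies, best_offset, z_score)."""
    n_offsets = template_length - len(query) + 1
    energies = [placement_energy(query, contacts, k) for k in range(n_offsets)]
    best = min(range(n_offsets), key=energies.__getitem__)
    # All other placements serve as decoys for the Z-score.
    decoys = energies[:best] + energies[best + 1:]
    spread = pstdev(decoys)
    z = (energies[best] - mean(decoys)) / spread if spread > 0 else float("-inf")
    return energies, best, z
```

For example, `gapless_threading("HPPH", 6, [(1, 4), (0, 2), (3, 4)])` scores three placements and picks the one that brings the two H residues into contact. With only a handful of decoys the Z-score is not statistically meaningful; the sketch only shows the bookkeeping.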
Abstract:
Progress in homology modeling and protein design has generated considerable interest in methods for predicting side-chain packing in the hydrophobic cores of proteins. Present techniques are not practically useful, however, because they are unable to model protein main-chain flexibility. Parameterization of backbone motions may represent a general and efficient method to incorporate backbone relaxation into such fixed main-chain models. To test this notion, we introduce a method for treating explicitly the backbone motions of alpha-helical bundles based on an algebraic parameterization proposed by Francis Crick in 1953 [Crick, F. H. C. (1953) Acta Crystallogr. 6, 685-689]. Given only the core amino acid sequence, a simple calculation can rapidly reproduce the crystallographic main-chain and core side-chain structures of three coiled coils (one dimer, one trimer, and one tetramer) to within 0.6-Å root-mean-square deviations. The speed of the predictive method [approximately 3 min per rotamer choice on a Silicon Graphics (Mountain View, CA) 4D/35 computer] permits it to be used as a design tool.
Abstract:
Recent improvements of a hierarchical ab initio or de novo approach for predicting both α and β structures of proteins are described. The united-residue energy function used in this procedure includes multibody interactions from a cumulant expansion of the free energy of polypeptide chains, with their relative weights determined by Z-score optimization. The critical initial stage of the hierarchical procedure involves a search of conformational space by the conformational space annealing (CSA) method, followed by optimization of an all-atom model. The procedure was assessed in a recent blind test of protein structure prediction (CASP4). The resulting lowest-energy structures of the target proteins (ranging in size from 70 to 244 residues) agreed with the experimental structures in many respects. The entire experimental structure of a cyclic α-helical protein of 70 residues was predicted to within 4.3 Å α-carbon (Cα) rms deviation (rmsd) whereas, for other α-helical proteins, fragments of roughly 60 residues were predicted to within 6.0 Å Cα rmsd. Whereas β structures can now be predicted with the new procedure, the success rate for α/β- and β-proteins is lower than that for α-proteins at present. For the β portions of α/β structures, the Cα rmsd's are less than 6.0 Å for contiguous fragments of 30–40 residues; for one target, three fragments (of length 10, 23, and 28 residues, respectively) formed a compact part of the tertiary structure with a Cα rmsd less than 6.0 Å. Overall, these results constitute an important step toward the ab initio prediction of protein structure solely from the amino acid sequence.
Abstract:
Local protein structure prediction efforts have consistently failed to exceed approximately 70% accuracy. We characterize the degeneracy of the mapping from local sequence to local structure responsible for this failure by investigating the extent to which similar sequence segments found in different proteins adopt similar three-dimensional structures. Sequence segments 3-15 residues in length from 154 different protein families are partitioned into neighborhoods containing segments with similar sequences using cluster analysis. The consistency of the sequence-to-structure mapping is assessed by comparing the local structures adopted by sequence segments in the same neighborhood in proteins of known structure. In the 154 families, 45% and 28% of the positions occur in neighborhoods in which one and two local structures predominate, respectively. The sequence patterns that characterize the neighborhoods in the first class probably include virtually all of the short sequence motifs in proteins that consistently occur in a particular local structure. These patterns, many of which occur in transitions between secondary structural elements, are an interesting combination of previously studied and novel motifs. The identification of sequence patterns that consistently occur in one or a small number of local structures in proteins should contribute to the prediction of protein structure from sequence.
Abstract:
The hierarchical properties of potential energy landscapes have been used to gain insight into thermodynamic and kinetic properties of protein ensembles. It also may be possible to use them to direct computational searches for thermodynamically stable macroscopic states, i.e., computational protein folding. To this end, we have developed a top-down search procedure in which conformation space is recursively dissected according to the intrinsic hierarchical structure of a landscape's effective-energy barriers. This procedure generates an inverted tree similar to the disconnectivity graphs generated by local minima-clustering methods, but it fundamentally differs in the manner in which the portion of the tree that is to be computationally explored is selected. A key ingredient is a branch-selection algorithm that takes advantage of statistically predictive properties of the landscape to guide searches down the tree branches that are most likely to lead to the physically relevant macroscopic states. Using the computational folding of a β-hairpin-forming peptide as an example, we show that such predictive properties indeed exist and can be used for structure prediction by free-energy global minimization.
Abstract:
Single-stranded regions in RNA secondary structure are important for RNA–RNA and RNA–protein interactions. We present a probability profile approach for the prediction of these regions based on a statistical algorithm for sampling RNA secondary structures. For the prediction of phylogenetically-determined single-stranded regions in secondary structures of representative RNA sequences, the probability profile offers substantial improvement over the minimum free energy structure. In designing antisense oligonucleotides, a practical problem is how to select a secondary structure for the target mRNA from the optimal structure(s) and many suboptimal structures with similar free energies. By summarizing the information from a statistical sample of probable secondary structures in a single plot, the probability profile not only presents a solution to this dilemma, but also reveals ‘well-determined’ single-stranded regions through the assignment of probabilities as measures of confidence in predictions. In antisense application to the rabbit β-globin mRNA, a significant correlation between hybridization potential predicted by the probability profile and the degree of inhibition of in vitro translation suggests that the probability profile approach is valuable for the identification of effective antisense target sites. Coupling computational design with DNA–RNA array technique provides a rational, efficient framework for antisense oligonucleotide screening. This framework has the potential for high-throughput applications to functional genomics and drug target validation.
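The idea of a probability profile can be sketched as follows, assuming the sampled secondary structures are already available as dot-bracket strings (the sampling algorithm itself is not shown, and the function name and `width` parameter are illustrative choices rather than the authors' interface). For each start position, the profile is the fraction of sampled structures in which a window of consecutive bases is entirely unpaired.

```python
def unpaired_profile(structures, width=1):
    """Fraction of sampled structures in which bases i..i+width-1 are all unpaired.

    `structures` is a list of equal-length dot-bracket strings, where '.'
    marks an unpaired base. Returns one value per window start position.
    """
    if not structures:
        raise ValueError("need at least one sampled structure")
    if width < 1:
        raise ValueError("width must be at least 1")
    length = len(structures[0])
    if any(len(s) != length for s in structures):
        raise ValueError("all structures must have the same length")

    profile = []
    for start in range(length - width + 1):
        # Count the structures whose whole window consists of unpaired bases.
        hits = sum(
            1 for s in structures if set(s[start:start + width]) == {"."}
        )
        profile.append(hits / len(structures))
    return profile
```

With the toy sample `["((...))", "(.....)", ".(...)."]` and `width=3`, only the window starting at the third base is unpaired in every structure, so it is the one 'well-determined' single-stranded region in this sample.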
Abstract:
A method for the quantitative estimation of instability with respect to deamidation of the asparaginyl (Asn) residues in proteins is described. The procedure involves the observation of several simple aspects of the three-dimensional environment of each Asn residue in the protein and a calculation that includes these observations, the primary amino acid residue sequence, and the previously reported complete set of sequence-dependent rates of deamidation for Asn pentapeptides. This method is demonstrated and evaluated for 23 proteins in which 31 unstable and 167 stable Asn residues have been reported and for 7 unstable and 63 stable Asn residues that have been reported in 61 human hemoglobin variants. The relative importance of primary structure and three-dimensional structure in Asn deamidation is estimated.
Abstract:
The diffusion equation method of global minimization is applied to compute the crystal structure of S6, with no a priori knowledge about the system. The experimental lattice parameters and positions and orientations of the molecules in the unit cell are predicted correctly.
Abstract:
Operon structure is an important organization feature of bacterial genomes. Many sets of genes occur in the same order on multiple genomes; these conserved gene groupings represent candidate operons. This study describes a computational method to estimate the likelihood that such conserved gene sets form operons. The method was used to analyze 34 bacterial and archaeal genomes, and yielded more than 7600 pairs of genes that are highly likely (P ≥ 0.98) to belong to the same operon. The sensitivity of our method is 30–50% for the Escherichia coli genome. The predicted gene pairs are available from our World Wide Web site http://www.tigr.org/tigr-scripts/operons/operons.cgi.
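The first step described here, finding gene pairs that occur in the same order on multiple genomes, can be sketched as below. This is only the candidate-collection step under simplifying assumptions: genomes are given as ordered lists of hypothetical ortholog identifiers, strand and intergenic distance are ignored, and the likelihood estimate itself is not reproduced because the abstract does not specify it.

```python
from collections import Counter


def conserved_adjacent_pairs(genomes, min_genomes=2):
    """Count ordered adjacent gene pairs that recur across genomes.

    `genomes` maps a genome name to its gene order (a list of ortholog IDs).
    Returns {(gene_a, gene_b): number_of_genomes} for pairs found in at
    least `min_genomes` genomes.
    """
    counts = Counter()
    for gene_order in genomes.values():
        # Use a set so a pair repeated within one genome is counted once.
        pairs = set(zip(gene_order, gene_order[1:]))
        counts.update(pairs)
    return {pair: n for pair, n in counts.items() if n >= min_genomes}
```

For instance, with gene orders `a b c d`, `x a b d`, and `b c a b`, the pair `(a, b)` is adjacent in all three genomes and `(b, c)` in two, so both would be passed on as candidates to a scoring step.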
Abstract:
The small viscosity asymptotics of the inertial range of local structure and of the wall region in wall-bounded turbulent shear flow are compared. The comparison leads to a sharpening of the dichotomy between Reynolds number dependent scaling (power-type) laws and the universal Reynolds number independent logarithmic law in wall turbulence. It further leads to a quantitative prediction of an essential difference between them, which is confirmed by the results of a recent experimental investigation. These results lend support to recent work on the zero viscosity limit of the inertial range in turbulence.
Abstract:
We present a method for predicting protein folding class based on global protein chain description and a voting process. Selection of the best descriptors was achieved by a computer-simulated neural network trained on a database consisting of 83 folding classes. Protein-chain descriptors include overall composition, transition, and distribution of amino acid attributes, such as relative hydrophobicity, predicted secondary structure, and predicted solvent exposure. Cross-validation testing was performed on 15 of the largest classes. The test shows that proteins were assigned to the correct class (correct positive prediction) with an average accuracy of 71.7%, whereas the inverse prediction of proteins as not belonging to a particular class (correct negative prediction) was 90-95% accurate. When tested on 254 structures used in this study, the top two predictions contained the correct class in 91% of the cases.
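Composition, transition, and distribution descriptors of this kind can be sketched for a single attribute, hydrophobicity. The three-class residue grouping and the percentile convention used for the distribution are assumptions made for illustration, not values taken from the abstract, and the neural network and voting steps are not shown.

```python
import math

# Assumed three-class hydrophobicity grouping (illustrative only).
GROUPS = {"polar": "RKEDQN", "neutral": "GASTPHY", "hydrophobic": "CLVIMFW"}
CLASS_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def composition(seq):
    """Fraction of residues falling in each class."""
    classes = [CLASS_OF[aa] for aa in seq]
    return {name: classes.count(name) / len(classes) for name in GROUPS}


def transition(seq):
    """Frequency of class changes between adjacent residues, per class pair."""
    classes = [CLASS_OF[aa] for aa in seq]
    pairs = list(zip(classes, classes[1:]))
    keys = [("polar", "neutral"), ("polar", "hydrophobic"),
            ("neutral", "hydrophobic")]
    return {k: sum(1 for a, b in pairs if {a, b} == set(k)) / len(pairs)
            for k in keys}


def distribution(seq, name):
    """Relative chain positions of the first, 25%, 50%, 75%, and last residue
    of class `name` (assumed convention: the ceil(f * n)-th occurrence)."""
    positions = [i + 1 for i, aa in enumerate(seq) if CLASS_OF[aa] == name]
    if not positions:
        return [0.0] * 5
    n = len(positions)
    marks = [1] + [max(1, math.ceil(f * n)) for f in (0.25, 0.5, 0.75, 1.0)]
    return [positions[k - 1] / len(seq) for k in marks]
```

For the toy sequence `RKGALL`, each class makes up one third of the chain, and one of the five adjacent pairs is a polar/neutral transition. Concatenating such values over several attributes would give a fixed-length vector regardless of chain length, which is what makes a global description usable as classifier input.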
Abstract:
The regions surrounding the catalytic amino acids previously identified in a few "retaining" O-glycosyl hydrolases (EC 3.2.1) have been analyzed by hydrophobic cluster analysis and have been used to define sequence motifs. These motifs have been found in more than 150 glycosyl hydrolase sequences representing at least eight established protein families that act on a large variety of substrates. This allows the localization and the precise role of the catalytic residues (nucleophile and acid catalyst) to be predicted for each of these enzymes, including several lysosomal glycosidases. An identical arrangement of the catalytic nucleophile was also found for S-glycosyl hydrolases (myrosinases; EC 3.2.3.1) for which the acid catalyst is lacking. A (beta/alpha)8 barrel structure has been reported for two of the eight families of proteins that have been grouped. It is suggested that the six other families also share this fold at their catalytic domain. These enzymes illustrate how evolutionary events led to a wide diversification of substrate specificity with a similar disposition of identical catalytic residues onto the same ancestral (beta/alpha)8 barrel structure.
Abstract:
Sequence analysis of peptides naturally presented by major histocompatibility complex (MHC) class I molecules has revealed allele-specific motifs in which the peptide length and the residues observed at certain positions are restricted. Nevertheless, peptides containing the standard motif often fail to bind with high affinity or form physiologically stable complexes. Here we present the crystal structure of a well-characterized antigenic peptide from ovalbumin [OVA-8, ovalbumin-(257-264), SIINFEKL] in complex with the murine MHC class I H-2Kb molecule at 2.5-Å resolution. Hydrophobic peptide residues Ile-P2 and Phe-P5 are packed closely together into binding pockets B and C, suggesting that the interplay of peptide anchor (P5) and secondary anchor (P2) residues can couple the preferred sequences at these positions. Comparison with the crystal structures of H-2Kb in complex with peptides VSV-8 (RGYVYQGL) and SEV-9 (FAPGNYPAL), where a Tyr residue is used as the C pocket anchor, reveals that the conserved water molecule that binds into the B pocket and mediates hydrogen bonding from the buried anchor hydroxyl group could not be likewise positioned if the P2 side chain were of significant size. Based on this structural evidence, H-2Kb has at least two submotifs: one with Tyr at P5 (or P6 for nonamer peptides) and a small residue at P2 (i.e., Ala or Gly) and another with Phe at P5 and a medium-sized hydrophobic residue at P2 (i.e., Ile). Deciphering of these secondary submotifs from both crystallographic and immunological studies of MHC peptide binding should increase the accuracy of T-cell epitope prediction.
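The two submotifs stated in this abstract can be written as a small rule-based check. This is a sketch of the stated rules only, not a binding predictor: the function name and labels are illustrative, the residue sets contain just the residues the abstract names, and the Phe submotif is applied to octamers only because the abstract does not describe it for nonamers.

```python
SMALL_P2 = set("AG")              # small P2 residues named in the abstract
MEDIUM_HYDROPHOBIC_P2 = set("I")  # only Ile is named in the abstract


def h2kb_submotif(peptide):
    """Return a label for the matching H-2Kb submotif, or None.

    Positions are 1-based in the text (P2, P5, P6), hence the index offsets.
    """
    if len(peptide) not in (8, 9):
        return None
    p2 = peptide[1]
    # Tyr anchor: P5 in octamers, P6 in nonamers.
    tyr_anchor = peptide[4] if len(peptide) == 8 else peptide[5]
    if tyr_anchor == "Y" and p2 in SMALL_P2:
        return "Tyr anchor with small P2"
    # Phe anchor: P5, checked for octamers only (assumption stated above).
    if len(peptide) == 8 and peptide[4] == "F" and p2 in MEDIUM_HYDROPHOBIC_P2:
        return "Phe anchor with medium hydrophobic P2"
    return None
```

Applied to the three peptides in the abstract, SIINFEKL falls in the Phe-anchor submotif, while RGYVYQGL and FAPGNYPAL fall in the Tyr-anchor submotif.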