974 resultados para Structure Prediction
Resumo:
Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
Resumo:
Protein structure prediction is a cornerstone of bioinformatics research. Membrane proteins require their own prediction methods due to their intrinsically different composition. A variety of tools exist for topology prediction of membrane proteins, many of them available on the Internet. The server described in this paper, BPROMPT (Bayesian PRediction Of Membrane Protein Topology), uses a Bayesian Belief Network to combine the results of other prediction methods, providing a more accurate consensus prediction. Topology predictions with accuracies of 70% for prokaryotes and 53% for eukaryotes were achieved. BPROMPT can be accessed at http://www.jenner.ac.uk/BPROMPT.
Resumo:
Phospholipases A(2) (PLA(2)) are enzymes commonly found in snake venoms from Viperidae and Elaphidae families, which are major components thereof. Many plants are used in traditional medicine its active agents against various effects induced by snakebite. This article presents the PLA(2) BthTX-I structure prediction based on homology modeling. In addition, we have performed virtual screening in a large database yielding a set of potential bioactive inhibitors. A flexible docking program was used to investigate the interactions between the receptor and the new ligands. We have performed molecular interaction fields (MIFs) calculations with the phospholipase model. Results confirm the important role of Lys49 for binding ligands and suggest three additional residues as well. We have proposed a theoretically nontoxic, drug-like, and potential novel BthTX-I inhibitor. These calculations have been used to guide the design of novel phospholipase inhibitors as potential lead compounds that may be optimized for future treatment of snakebite victims as well as other human diseases in which PLA(2) enzymes are involved.
Resumo:
We present a fast method for finding optimal parameters for a low-resolution (threading) force field intended to distinguish correct from incorrect folds for a given protein sequence. In contrast to other methods, the parameterization uses information from >10(7) misfolded structures as well as a set of native sequence-structure pairs. In addition to testing the resulting force field's performance on the protein sequence threading problem, results are shown that characterize the number of parameters necessary for effective structure recognition.
Resumo:
We describe two ways of optimizing score functions for protein sequence to structure threading. The first method adjusts parameters to improve sequence to structure alignment. The second adjusts parameters so as to improve a score function's ability to rank alignments calculated in the first score function. Unlike those functions known as knowledge-based force fields, the resulting parameter sets do not rely on Boltzmann statistics, have no claim to representing free energies and are purely constructions for recognizing protein folds. The methods give a small improvement, but suggest that functions can be profitably optimized for very specific aspects of protein fold recognition, Proteins 1999;36:454-461. (C) 1999 Wiley-Liss, Inc.
Resumo:
Conventionally, protein structure prediction via threading relies on some nonoptimal method to align a protein sequence to each member of a library of known structures. We show how a score function (force field) can be modified so as to allow the direct application of a dynamic programming algorithm to the problem. This involves an approximation whose damage can be minimized by an optimization process during score function parameter determination. The method is compared to sequence to structure alignments using a more conventional pair-wise score function and the frozen approximation. The new method produces results comparable to the frozen approximation, but is faster and has fewer adjustable parameters. It is also free of memory of the template's original amino acid sequence, and does not suffer from a problem of nonconvergence, which can be shown to occur with the frozen approximation. Alignments generated by the simplified score function can then be ranked using a second score function with the approximations removed. (C) 1999 John Wiley & Sons, Inc.
Resumo:
Sausage is a protein sequence threading program, but with remarkable run-time flexibility. Using different scripts, it can calculate protein sequence-structure alignments, search structure libraries, swap force fields, create models form alignments, convert file formats and analyse results. There are several different force fields which might be classed as knowledge-based, although they do not rely on Boltzmann statistics. Different force fields are used for alignment calculations and subsequent ranking of calculated models.
Resumo:
Dissertação apresentada para obtenção de Grau de Doutor em Bioquímica,Bioquímica Estrutural, pela Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia
Resumo:
BACKGROUND: The availability of the P. falciparum genome has led to novel ways to identify potential vaccine candidates. A new approach for antigen discovery based on the bioinformatic selection of heptad repeat motifs corresponding to alpha-helical coiled coil structures yielded promising results. To elucidate the question about the relationship between the coiled coil motifs and their sequence conservation, we have assessed the extent of polymorphism in putative alpha-helical coiled coil domains in culture strains, in natural populations and in the single nucleotide polymorphism data available at PlasmoDB. METHODOLOGY/PRINCIPAL FINDINGS: 14 alpha-helical coiled coil domains were selected based on preclinical experimental evaluation. They were tested by PCR amplification and sequencing of different P. falciparum culture strains and field isolates. We found that only 3 out of 14 alpha-helical coiled coils showed point mutations and/or length polymorphisms. Based on promising immunological results 5 of these peptides were selected for further analysis. Direct sequencing of field samples from Papua New Guinea and Tanzania showed that 3 out of these 5 peptides were completely conserved. An in silico analysis of polymorphism was performed for all 166 putative alpha-helical coiled coil domains originally identified in the P. falciparum genome. We found that 82% (137/166) of these peptides were conserved, and for one peptide only the detected SNPs decreased substantially the probability score for alpha-helical coiled coil formation. More SNPs were found in arrays of almost perfect tandem repeats. In summary, the coiled coil structure prediction was rarely modified by SNPs. The analysis revealed a number of peptides with strictly conserved alpha-helical coiled coil motifs. CONCLUSION/SIGNIFICANCE: We conclude that the selection of alpha-helical coiled coil structural motifs is a valuable approach to identify potential vaccine targets showing a high degree of conservation.
Resumo:
In a number of programs for gene structure prediction in higher eukaryotic genomic sequences, exon prediction is decoupled from gene assembly: a large pool of candidate exons is predicted and scored from features located in the query DNA sequence, and candidate genes are assembled from such a pool as sequences of nonoverlapping frame-compatible exons. Genes are scored as a function of the scores of the assembled exons, and the highest scoring candidate gene is assumed to be the most likely gene encoded by the query DNA sequence. Considering additive gene scoring functions, currently available algorithms to determine such a highest scoring candidate gene run in time proportional to the square of the number of predicted exons. Here, we present an algorithm whose running time grows only linearly with the size of the set of predicted exons. Polynomial algorithms rely on the fact that, while scanning the set of predicted exons, the highest scoring gene ending in a given exon can be obtained by appending the exon to the highest scoring among the highest scoring genes ending at each compatible preceding exon. The algorithm here relies on the simple fact that such highest scoring gene can be stored and updated. This requires scanning the set of predicted exons simultaneously by increasing acceptor and donor position. On the other hand, the algorithm described here does not assume an underlying gene structure model. Indeed, the definition of valid gene structures is externally defined in the so-called Gene Model. The Gene Model specifies simply which gene features are allowed immediately upstream which other gene features in valid gene structures. This allows for great flexibility in formulating the gene identification problem. In particular it allows for multiple-gene two-strand predictions and for considering gene features other than coding exons (such as promoter elements) in valid gene structures.
Resumo:
Summary The specific CD8+ T cell immune response against tumors relies on the recognition by the T cell receptor (TCR) on cytotoxic T lymphocytes (CTL) of antigenic peptides bound to the class I major histocompatibility complex (MHC) molecule. Such tumor associated antigenic peptides are the focus of tumor immunotherapy with peptide vaccines. The strategy for obtaining an improved immune response often involves the design of modified tumor associated antigenic peptides. Such modifications aim at creating higher affinity and/or degradation resistant peptides and require precise structures of the peptide-MHC class I complex. In addition, the modified peptide must be cross-recognized by CTLs specific for the parental peptide, i.e. preserve the structure of the epitope. Detailed structural information on the modified peptide in complex with MHC is necessary for such predictions. In this thesis, the main focus is the development of theoretical in silico methods for prediction of both structure and cross-reactivity of peptide-MHC class I complexes. Applications of these methods in the context of immunotherapy are also presented. First, a theoretical method for structure prediction of peptide-MHC class I complexes is developed and validated. The approach is based on a molecular dynamics protocol to sample the conformational space of the peptide in its MHC environment. The sampled conformers are evaluated using conformational free energy calculations. The method, which is evaluated for its ability to reproduce 41 X-ray crystallographic structures of different peptide-MHC class I complexes, shows an overall prediction success of 83%. Importantly, in the clinically highly relevant subset of peptide-HLAA*0201 complexes, the prediction success is 100%. Based on these structure predictions, a theoretical approach for prediction of cross-reactivity is developed and validated. This method involves the generation of quantitative structure-activity relationships using three-dimensional molecular descriptors and a genetic neural network. The generated relationships are highly predictive as proved by high cross-validated correlation coefficients (0.78-0.79). Together, the here developed theoretical methods open the door for efficient rational design of improved peptides to be used in immunotherapy. Résumé La réponse immunitaire spécifique contre des tumeurs dépend de la reconnaissance par les récepteurs des cellules T CD8+ de peptides antigéniques présentés par les complexes majeurs d'histocompatibilité (CMH) de classe I. Ces peptides sont utilisés comme cible dans l'immunothérapie par vaccins peptidiques. Afin d'augmenter la réponse immunitaire, les peptides sont modifiés de façon à améliorer l'affinité et/ou la résistance à la dégradation. Ceci nécessite de connaître la structure tridimensionnelle des complexes peptide-CMH. De plus, les peptides modifiés doivent être reconnus par des cellules T spécifiques du peptide natif. La structure de l'épitope doit donc être préservée et des structures détaillées des complexes peptide-CMH sont nécessaires. Dans cette thèse, le thème central est le développement des méthodes computationnelles de prédiction des structures des complexes peptide-CMH classe I et de la reconnaissance croisée. Des applications de ces méthodes de prédiction à l'immunothérapie sont également présentées. Premièrement, une méthode théorique de prédiction des structures des complexes peptide-CMH classe I est développée et validée. Cette méthode est basée sur un échantillonnage de l'espace conformationnel du peptide dans le contexte du récepteur CMH classe I par dynamique moléculaire. Les conformations sont évaluées par leurs énergies libres conformationnelles. La méthode est validée par sa capacité à reproduire 41 structures des complexes peptide-CMH classe I obtenues par cristallographie aux rayons X. Le succès prédictif général est de 83%. Pour le sous-groupe HLA-A*0201 de complexes de grande importance pour l'immunothérapie, ce succès est de 100%. Deuxièmement, à partir de ces structures prédites in silico, une méthode théorique de prédiction de la reconnaissance croisée est développée et validée. Celle-ci consiste à générer des relations structure-activité quantitatives en utilisant des descripteurs moléculaires tridimensionnels et un réseau de neurones couplé à un algorithme génétique. Les relations générées montrent une capacité de prédiction remarquable avec des valeurs de coefficients de corrélation de validation croisée élevées (0.78-0.79). Les méthodes théoriques développées dans le cadre de cette thèse ouvrent la voie du design de vaccins peptidiques améliorés.
Resumo:
TCRep 3D is an automated systematic approach for TCR-peptide-MHC class I structure prediction, based on homology and ab initio modeling. It has been considerably generalized from former studies to be applicable to large repertoires of TCR. First, the location of the complementary determining regions of the target sequences are automatically identified by a sequence alignment strategy against a database of TCR Vα and Vβ chains. A structure-based alignment ensures automated identification of CDR3 loops. The CDR are then modeled in the environment of the complex, in an ab initio approach based on a simulated annealing protocol. During this step, dihedral restraints are applied to drive the CDR1 and CDR2 loops towards their canonical conformations, described by Al-Lazikani et. al. We developed a new automated algorithm that determines additional restraints to iteratively converge towards TCR conformations making frequent hydrogen bonds with the pMHC. We demonstrated that our approach outperforms popular scoring methods (Anolea, Dope and Modeller) in predicting relevant CDR conformations. Finally, this modeling approach has been successfully applied to experimentally determined sequences of TCR that recognize the NY-ESO-1 cancer testis antigen. This analysis revealed a mechanism of selection of TCR through the presence of a single conserved amino acid in all CDR3β sequences. The important structural modifications predicted in silico and the associated dramatic loss of experimental binding affinity upon mutation of this amino acid show the good correspondence between the predicted structures and their biological activities. To our knowledge, this is the first systematic approach that was developed for large TCR repertoire structural modeling.
Resumo:
Modeling methods to derive 3D-structure of proteins have been recently developed. Protein homology-modeling, also known as comparative protein modeling, is nowadays the most accurate protein modeling method. This technique can produce useful models for about an order of magnitude more protein sequences than there have been structures determined by experiment in the same amount of time. All current protein homology-modeling methods consist of four sequential steps: fold assignment and template selection, template-target alignment, model building, and model evaluation. In this paper we discuss in some detail the protein-homology paradigm, its predictive power and its limitations.
Resumo:
Affiliation: Claudia Kleinman, Nicolas Rodrigue & Hervé Philippe : Département de biochimie, Faculté de médecine, Université de Montréal
Resumo:
De récentes découvertes montrent le rôle important que joue l’acide ribonucléique (ARN) au sein des cellules, que ce soit le contrôle de l’expression génétique, la régulation de plusieurs processus homéostasiques, en plus de la transcription et la traduction de l’acide désoxyribonucléique (ADN) en protéine. Si l’on veut comprendre comment la cellule fonctionne, nous devons d’abords comprendre ses composantes et comment ils interagissent, et en particulier chez l’ARN. La fonction d’une molécule est tributaire de sa structure tridimensionnelle (3D). Or, déterminer expérimentalement la structure 3D d’un ARN s’avère fort coûteux. Les méthodes courantes de prédiction par ordinateur de la structure d’un ARN ne tiennent compte que des appariements classiques ou canoniques, similaires à ceux de la fameuse structure en double-hélice de l’ADN. Ici, nous avons amélioré la prédiction de structures d’ARN en tenant compte de tous les types possibles d’appariements, dont ceux dits non-canoniques. Cela est rendu possible dans le contexte d’un nouveau paradigme pour le repliement des ARN, basé sur les motifs cycliques de nucléotides ; des blocs de bases pour la construction des ARN. De plus, nous avons dévelopées de nouvelles métriques pour quantifier la précision des méthodes de prédiction des structures 3D des ARN, vue l’introduction récente de plusieurs de ces méthodes. Enfin, nous avons évalué le pouvoir prédictif des nouvelles techniques de sondage de basse résolution des structures d’ARN.