972 results for Loop structure prediction
Abstract:
We present a fast method for finding optimal parameters for a low-resolution (threading) force field intended to distinguish correct from incorrect folds for a given protein sequence. In contrast to other methods, the parameterization uses information from >10^7 misfolded structures as well as a set of native sequence-structure pairs. In addition to testing the resulting force field's performance on the protein sequence threading problem, results are shown that characterize the number of parameters necessary for effective structure recognition.
Abstract:
We describe two ways of optimizing score functions for protein sequence to structure threading. The first method adjusts parameters to improve sequence to structure alignment. The second adjusts parameters so as to improve a score function's ability to rank alignments calculated with the first score function. Unlike those functions known as knowledge-based force fields, the resulting parameter sets do not rely on Boltzmann statistics, have no claim to representing free energies and are purely constructions for recognizing protein folds. The methods give a small improvement, but suggest that functions can be profitably optimized for very specific aspects of protein fold recognition. Proteins 1999;36:454-461. © 1999 Wiley-Liss, Inc.
Abstract:
Conventionally, protein structure prediction via threading relies on some nonoptimal method to align a protein sequence to each member of a library of known structures. We show how a score function (force field) can be modified so as to allow the direct application of a dynamic programming algorithm to the problem. This involves an approximation whose damage can be minimized by an optimization process during score function parameter determination. The method is compared to sequence to structure alignments using a more conventional pair-wise score function and the frozen approximation. The new method produces results comparable to the frozen approximation, but is faster and has fewer adjustable parameters. It is also free of memory of the template's original amino acid sequence, and does not suffer from a problem of nonconvergence, which can be shown to occur with the frozen approximation. Alignments generated by the simplified score function can then be ranked using a second score function with the approximations removed. (C) 1999 John Wiley & Sons, Inc.
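The idea in the abstract above, that a score which depends only on the residue type placed at each template site can be optimized exactly by dynamic programming, can be illustrated with a minimal Python sketch. This is not the authors' score function: the names `thread`, `site_scores` and `gap`, the linear gap penalty, and the toy scores are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def thread(seq: str,
           site_scores: List[Dict[str, float]],
           gap: float = -1.0) -> Tuple[float, List[Tuple[int, int]]]:
    """Globally align a sequence to template sites (Needleman-Wunsch style).

    site_scores[j][aa] is a hypothetical score for placing residue type aa at
    template site j. It depends on the site only, not on which residues occupy
    neighbouring sites, which is what makes exact dynamic programming possible.
    """
    n, m = len(seq), len(site_scores)
    # best[i][j]: best score for aligning seq[:i] to the first j template sites
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = best[i - 1][j - 1] + site_scores[j - 1].get(seq[i - 1], 0.0)
            best[i][j] = max(match,
                             best[i - 1][j] + gap,   # residue i-1 left unaligned
                             best[i][j - 1] + gap)   # site j-1 left empty
    # Trace back to recover the aligned (residue index, site index) pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        match = best[i - 1][j - 1] + site_scores[j - 1].get(seq[i - 1], 0.0)
        if best[i][j] == match:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


score, pairs = thread("AVL", [{"A": 2.0}, {"L": 2.0}])
```

With a pair-wise score function the value of placing a residue at a site would depend on its not-yet-placed partners, so this recurrence would no longer be exact; that is the situation approximations such as the frozen approximation address.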
Abstract:
Sausage is a protein sequence threading program with remarkable run-time flexibility. Using different scripts, it can calculate protein sequence-structure alignments, search structure libraries, swap force fields, create models from alignments, convert file formats and analyse results. There are several different force fields which might be classed as knowledge-based, although they do not rely on Boltzmann statistics. Different force fields are used for alignment calculations and subsequent ranking of calculated models.
Abstract:
A precise, reproducible deletion made during in vitro reverse transcription of RNA2 from the icosahedral positive-stranded Helicoverpa armigera stunt virus (Tetraviridae) is described. The deletion, located between two hexamer repeats, is a 50-base sequence that includes one copy of the hexamer repeat. Only the Moloney murine leukemia virus reverse transcriptase and its derivative Superscript I, carrying a deletion of the carboxy-terminal RNase H region, showed this response, indicating a template-switching mechanism different from one proposed that involves an RNase H-dependent strand transfer. Superscript II, however, which carries point mutations to reduce RNase H activity, does not cause a deletion. A possible mechanism involves the enzyme pausing at the 3' side of a stem-loop structure and the 3' end of the nascent DNA strand separating from the template and reannealing to the upstream hexamer repeat.
Abstract:
Dissertation submitted for the degree of Doctor in Biochemistry, Structural Biochemistry, at the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia
Abstract:
BACKGROUND: The availability of the P. falciparum genome has led to novel ways to identify potential vaccine candidates. A new approach for antigen discovery based on the bioinformatic selection of heptad repeat motifs corresponding to alpha-helical coiled coil structures yielded promising results. To clarify the relationship between the coiled coil motifs and their sequence conservation, we have assessed the extent of polymorphism in putative alpha-helical coiled coil domains in culture strains, in natural populations and in the single nucleotide polymorphism data available at PlasmoDB. METHODOLOGY/PRINCIPAL FINDINGS: 14 alpha-helical coiled coil domains were selected based on preclinical experimental evaluation. They were tested by PCR amplification and sequencing of different P. falciparum culture strains and field isolates. We found that only 3 out of 14 alpha-helical coiled coils showed point mutations and/or length polymorphisms. Based on promising immunological results, 5 of these peptides were selected for further analysis. Direct sequencing of field samples from Papua New Guinea and Tanzania showed that 3 out of these 5 peptides were completely conserved. An in silico analysis of polymorphism was performed for all 166 putative alpha-helical coiled coil domains originally identified in the P. falciparum genome. We found that 82% (137/166) of these peptides were conserved, and for only one peptide did the detected SNPs substantially decrease the probability score for alpha-helical coiled coil formation. More SNPs were found in arrays of almost perfect tandem repeats. In summary, the coiled coil structure prediction was rarely modified by SNPs. The analysis revealed a number of peptides with strictly conserved alpha-helical coiled coil motifs. CONCLUSION/SIGNIFICANCE: We conclude that the selection of alpha-helical coiled coil structural motifs is a valuable approach to identify potential vaccine targets showing a high degree of conservation.
Abstract:
In a number of programs for gene structure prediction in higher eukaryotic genomic sequences, exon prediction is decoupled from gene assembly: a large pool of candidate exons is predicted and scored from features located in the query DNA sequence, and candidate genes are assembled from such a pool as sequences of nonoverlapping frame-compatible exons. Genes are scored as a function of the scores of the assembled exons, and the highest scoring candidate gene is assumed to be the most likely gene encoded by the query DNA sequence. Considering additive gene scoring functions, currently available algorithms to determine such a highest scoring candidate gene run in time proportional to the square of the number of predicted exons. Here, we present an algorithm whose running time grows only linearly with the size of the set of predicted exons. Polynomial algorithms rely on the fact that, while scanning the set of predicted exons, the highest scoring gene ending in a given exon can be obtained by appending the exon to the highest scoring among the highest scoring genes ending at each compatible preceding exon. The algorithm here relies on the simple fact that such a highest scoring gene can be stored and updated. This requires scanning the set of predicted exons simultaneously by increasing acceptor and donor position. In addition, the algorithm described here does not assume an underlying gene structure model. Instead, the definition of valid gene structures is specified externally in the so-called Gene Model. The Gene Model simply specifies which gene features are allowed immediately upstream of which other gene features in valid gene structures. This allows for great flexibility in formulating the gene identification problem. In particular, it allows for multiple-gene two-strand predictions and for considering gene features other than coding exons (such as promoter elements) in valid gene structures.
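The "store and update" idea in the abstract above can be sketched in Python under strong simplifying assumptions: exons are hypothetical `(acceptor, donor, score)` tuples, one exon may follow another only if it starts strictly downstream of the other's donor site, and reading frames and the Gene Model are ignored. The function name `best_gene` is invented for this illustration and is not taken from the published program.

```python
from typing import List, Tuple

Exon = Tuple[int, int, float]  # (acceptor position, donor position, score)


def best_gene(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Return the highest-scoring chain of non-overlapping exons."""
    n = len(exons)
    if n == 0:
        return 0.0, []
    # Two orderings of the same exons. The sorts cost O(n log n); the scan
    # below is linear once both orderings are available.
    by_acceptor = sorted(range(n), key=lambda i: exons[i][0])
    by_donor = sorted(range(n), key=lambda i: exons[i][1])

    chain = [0.0] * n   # chain[i]: best score of a gene ending at exon i
    prev = [-1] * n     # predecessor of exon i in that gene, -1 if none
    stored, stored_idx = 0.0, -1  # best gene among exons already passed
    d = 0                         # cursor into by_donor

    for i in by_acceptor:
        acceptor = exons[i][0]
        # Update the stored best gene with every exon whose donor site lies
        # strictly upstream of the current acceptor site.
        while d < n and exons[by_donor[d]][1] < acceptor:
            j = by_donor[d]
            if chain[j] > stored:
                stored, stored_idx = chain[j], j
            d += 1
        chain[i] = exons[i][2] + stored
        prev[i] = stored_idx

    end = max(range(n), key=lambda i: chain[i])
    total = chain[end]
    gene: List[Exon] = []
    while end != -1:
        gene.append(exons[end])
        end = prev[end]
    gene.reverse()
    return total, gene


candidates = [(1, 10, 5.0), (5, 20, 7.0), (12, 30, 4.0), (25, 40, 3.0), (35, 50, 6.0)]
total, gene = best_gene(candidates)
```

Each exon is visited once in each ordering, so no exon has to be compared against every preceding exon, which is where the quadratic cost of the simpler recurrence comes from.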
Abstract:
In the Gac/Rsm signal transduction pathway of Pseudomonas fluorescens CHA0, the dimeric RNA-binding proteins RsmA and RsmE, which belong to the vast bacterial RsmA/CsrA family, effectively repress translation of target mRNAs containing a typical recognition sequence near the translation start site. Three small RNAs (RsmX, RsmY, RsmZ) with clustered recognition sequences can sequester RsmA and RsmE and thereby relieve translational repression. According to a previously established structural model, the RsmE protein makes optimal contacts with an RNA sequence 5'-(A/U)CANGGANG(U/A)-3', in which the central ribonucleotides form a hexaloop. Here, we questioned the relevance of the hexaloop structure in target RNAs. We found that two predicted pentaloop structures, AGGGA (in pltA mRNA encoding a pyoluteorin biosynthetic enzyme) and AAGGA (in mutated pltA mRNA), allowed effective interaction with the RsmE protein in vivo. By contrast, ACGGA and AUGGA were poor targets. Isothermal titration calorimetry measurements confirmed the strong binding of RsmE to the AGGGA pentaloop structure in an RNA oligomer. Modeling studies highlighted the crucial role of the second ribonucleotide in the loop structure. In conclusion, a refined structural model of RsmE-RNA interaction accommodates certain pentaloop RNAs among the preferred hexaloop RNAs.
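As a rough illustration only, the consensus quoted in the abstract above can be read literally as a regular expression, taking N to mean any ribonucleotide. The names `RSME_CONSENSUS` and `find_consensus_sites` are hypothetical, and a plain sequence match of this kind ignores secondary structure, so it would not capture the pentaloop targets the abstract reports.

```python
import re
from typing import List

# Literal reading of 5'-(A/U)CANGGANG(U/A)-3', with N = A, C, G or U.
# The lookahead reports overlapping candidate sites as well.
RSME_CONSENSUS = re.compile(r"(?=([AU]CA[ACGU]GGA[ACGU]G[AU]))")


def find_consensus_sites(rna: str) -> List[int]:
    """Return 0-based start positions of consensus matches in an RNA string.

    DNA-style input is accepted by mapping T to U before matching.
    """
    rna = rna.upper().replace("T", "U")
    return [m.start() for m in RSME_CONSENSUS.finditer(rna)]


sites = find_consensus_sites("GGUCAUGGAUGACC")
```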
Abstract:
Summary: The specific CD8+ T cell immune response against tumors relies on the recognition by the T cell receptor (TCR) on cytotoxic T lymphocytes (CTL) of antigenic peptides bound to the class I major histocompatibility complex (MHC) molecule. Such tumor associated antigenic peptides are the focus of tumor immunotherapy with peptide vaccines. The strategy for obtaining an improved immune response often involves the design of modified tumor associated antigenic peptides. Such modifications aim at creating higher affinity and/or degradation resistant peptides and require precise structures of the peptide-MHC class I complex. In addition, the modified peptide must be cross-recognized by CTLs specific for the parental peptide, i.e. preserve the structure of the epitope. Detailed structural information on the modified peptide in complex with MHC is necessary for such predictions. In this thesis, the main focus is the development of theoretical in silico methods for prediction of both structure and cross-reactivity of peptide-MHC class I complexes. Applications of these methods in the context of immunotherapy are also presented. First, a theoretical method for structure prediction of peptide-MHC class I complexes is developed and validated. The approach is based on a molecular dynamics protocol to sample the conformational space of the peptide in its MHC environment. The sampled conformers are evaluated using conformational free energy calculations. The method, which is evaluated for its ability to reproduce 41 X-ray crystallographic structures of different peptide-MHC class I complexes, shows an overall prediction success of 83%. Importantly, in the clinically highly relevant subset of peptide-HLA-A*0201 complexes, the prediction success is 100%. Based on these structure predictions, a theoretical approach for prediction of cross-reactivity is developed and validated.
This method involves the generation of quantitative structure-activity relationships using three-dimensional molecular descriptors and a genetic neural network. The generated relationships are highly predictive, as shown by high cross-validated correlation coefficients (0.78-0.79). Together, the theoretical methods developed here open the door to efficient rational design of improved peptides to be used in immunotherapy. Summary: The specific immune response against tumors depends on the recognition, by the receptors of CD8+ T cells, of antigenic peptides presented by class I major histocompatibility complexes (MHC). These peptides are used as targets in immunotherapy with peptide vaccines. To increase the immune response, the peptides are modified to improve affinity and/or resistance to degradation. This requires knowledge of the three-dimensional structure of the peptide-MHC complexes. In addition, the modified peptides must be recognized by T cells specific for the native peptide. The structure of the epitope must therefore be preserved, and detailed structures of the peptide-MHC complexes are necessary. In this thesis, the central theme is the development of computational methods for predicting the structures of peptide-MHC class I complexes and cross-recognition. Applications of these prediction methods to immunotherapy are also presented. First, a theoretical method for predicting the structures of peptide-MHC class I complexes is developed and validated. This method is based on sampling the conformational space of the peptide in the context of the MHC class I receptor by molecular dynamics. The conformations are evaluated by their conformational free energies. The method is validated by its ability to reproduce 41 structures of peptide-MHC class I complexes obtained by X-ray crystallography.
The overall predictive success is 83%. For the HLA-A*0201 subgroup of complexes, which is of great importance for immunotherapy, this success is 100%. Second, starting from these structures predicted in silico, a theoretical method for predicting cross-recognition is developed and validated. It consists of generating quantitative structure-activity relationships using three-dimensional molecular descriptors and a neural network coupled to a genetic algorithm. The generated relationships show remarkable predictive ability, with high cross-validated correlation coefficients (0.78-0.79). The theoretical methods developed in this thesis pave the way for the design of improved peptide vaccines.
Abstract:
The amiloride-sensitive epithelial Na channel (ENaC) is a heteromultimeric channel made of three alpha beta gamma subunits. The structures involved in the ion permeation pathway have only been partially identified, and the respective contributions of each subunit in the formation of the conduction pore have not yet been established. Using a site-directed mutagenesis approach, we have identified in a short segment preceding the second membrane-spanning domain (the pre-M2 segment) amino acid residues involved in ion permeation and critical for channel block by amiloride. Cys substitutions of Gly residues in beta and gamma subunits at position beta G525 and gamma G537 increased the apparent inhibitory constant (Ki) for amiloride by > 1,000-fold and decreased channel unitary current without affecting ion selectivity. The corresponding mutation S583 to C in the alpha subunit increased amiloride Ki by 20-fold, without changing channel conducting properties. Coexpression of these mutated alpha beta gamma subunits resulted in a non-conducting channel expressed at the cell surface. Finally, these Cys substitutions increased channel affinity for block by external Zn2+ ions, in particular the alpha S583C mutant showing a Ki for Zn2+ of 29 microM. Mutations of residues alpha W582L, or beta G522D also increased amiloride Ki, the latter mutation generating a Ca2+ blocking site located 15% within the membrane electric field. These experiments provide strong evidence that alpha beta gamma ENaCs are pore-forming subunits involved in ion permeation through the channel. The pre-M2 segment of alpha beta gamma subunits may form a pore loop structure at the extracellular face of the channel, where amiloride binds within the channel lumen. We propose that amiloride interacts with Na+ ions at an external Na+ binding site preventing ion permeation through the channel pore.
Abstract:
TCRep 3D is an automated systematic approach for TCR-peptide-MHC class I structure prediction, based on homology and ab initio modeling. It has been considerably generalized from former studies to be applicable to large repertoires of TCR. First, the locations of the complementarity determining regions (CDRs) of the target sequences are automatically identified by a sequence alignment strategy against a database of TCR Vα and Vβ chains. A structure-based alignment ensures automated identification of CDR3 loops. The CDRs are then modeled in the environment of the complex, in an ab initio approach based on a simulated annealing protocol. During this step, dihedral restraints are applied to drive the CDR1 and CDR2 loops towards their canonical conformations, described by Al-Lazikani et al. We developed a new automated algorithm that determines additional restraints to iteratively converge towards TCR conformations making frequent hydrogen bonds with the pMHC. We demonstrated that our approach outperforms popular scoring methods (Anolea, Dope and Modeller) in predicting relevant CDR conformations. Finally, this modeling approach has been successfully applied to experimentally determined sequences of TCR that recognize the NY-ESO-1 cancer testis antigen. This analysis revealed a mechanism of selection of TCR through the presence of a single conserved amino acid in all CDR3β sequences. The important structural modifications predicted in silico and the associated dramatic loss of experimental binding affinity upon mutation of this amino acid show the good correspondence between the predicted structures and their biological activities. To our knowledge, this is the first systematic approach that was developed for large TCR repertoire structural modeling.
Abstract:
The AU-rich elements (AREs) consisting of repeated AUUUA motifs confer rapid degradation to many cellular mRNAs when present in the 3' untranslated region (3'UTR). We have studied the instability of interleukin-6 mRNA by grafting its 3' untranslated region to a stable green fluorescent protein mRNA. Subsequent scanning mutagenesis identified two conserved elements, which taken together account for most of the instability. The first corresponds to a short non-canonical AU-rich element. The other comprises a sequence predicted to form a stem-loop structure. Both elements need to be present in order to confer full instability (Paschoud et al. 2006). Destabilization of ARE-containing mRNAs is thought to involve ARE-binding proteins such as AUF1. We tested whether AUF1 binding to interleukin-6 mRNA correlates with decreased mRNA stability. Overexpression of myc-tagged p37AUF1 and p42AUF1 as well as suppression of all four AUF1 isoforms by RNA interference stabilized the interleukin-6 mRNA. Furthermore, the interleukin-6 mRNA co-immunoprecipitated specifically with myc-tagged p37AUF1 and p42AUF1 in cell extracts. Both the stabilization and AUF1-binding required the non-canonical AU-rich sequence. These results indicate that AUF1 binds to the AU-rich element in vivo and promotes interleukin-6 mRNA degradation. The combination of mRNA co-immunoprecipitation with microarray technology revealed that at least 500 cellular mRNAs associate with AUF1. Summary: The presence of A- and U-rich elements (AREs), in particular repeated AUUUA motifs in the 3' untranslated region, confers rapid degradation on many cellular RNAs. We studied the instability of the RNA encoding interleukin-6 by grafting its 3' untranslated region onto a stable RNA encoding green fluorescent protein. Systematic mutagenesis of the untranslated sequences allowed the identification of two conserved elements that confer instability on the RNA.
The first corresponds to a short non-canonical AU-rich element. The second comprises a 'hairpin' structure. Both elements must be present to confer full instability (Paschoud et al. 2006). Proteins such as AUF1, which can bind to ARE elements, are thought to be involved in the degradation of messenger RNAs. We examined whether the binding of AUF1 to interleukin-6 RNA correlates with decreased stability. Overexpression of the myc-tagged p37AUF1 and p42AUF1 proteins, as well as suppression of each of the four AUF1 isoforms by RNA interference, stabilized the interleukin-6 messenger RNA. Furthermore, this RNA co-immunoprecipitates specifically with p37AUF1 and p42AUF1 in cell extracts. The presence of the non-canonical AU-rich element is necessary for stabilization of the RNA and for its binding to AUF1. These results indicate that AUF1 binds to the AU-rich element in vivo and promotes degradation of the interleukin-6 messenger RNA. Combining messenger RNA co-immunoprecipitation with microarray analyses indicates that at least 500 cellular RNAs associate with AUF1.
Abstract:
We describe the use of dynamic combinatorial chemistry (DCC) to identify ligands for the stem-loop structure located at the exon 10-5'-intron junction of Tau pre-mRNA, which is involved in the onset of several tauopathies including frontotemporal dementia with Parkinsonism linked to chromosome 17 (FTDP-17). A series of ligands that combine the small aminoglycoside neamine and heteroaromatic moieties (azaquinolone and two acridines) have been identified by using DCC. These compounds effectively bind the stem-loop RNA target (concentration required for 50% RNA response, EC50: 2-58 μM), as determined by fluorescence titration experiments. Importantly, most of them are able to stabilize both the wild-type and the +3 and +14 mutated sequences associated with the development of FTDP-17 without producing a significant change in the overall structure of the RNA (as analyzed by circular dichroism (CD) spectroscopy), which is a key factor for recognition by the splicing regulatory machinery. A good correlation has been found between the affinity of the ligands for the target and their ability to stabilize the RNA secondary structure.
Abstract:
The complete mitochondrial DNA (mtDNA) control region was amplified and directly sequenced in two species of shrew, Crocidura russula and Sorex araneus (Insectivora, Mammalia). The general organization is similar to that found in other mammals: a central conserved region surrounded by two more variable domains. However, we have found in shrews the simultaneous presence of arrays of tandem repeats in potential locations where repeats tend to occur separately in other mammalian species. These locations correspond to regions which are associated with a possible interruption of the replication processes, either at the end of the three-stranded D-loop structure or toward the end of the heavy-strand replication. In the left domain the repeated sequences (R1 repeats) are 78 bp long, whereas in the right domain the repeats are 12 bp long in C. russula and 14 bp long in S. araneus (R2 repeats). Variation in the copy number of these repeated sequences results in mtDNA control region length differences. Southern blot analysis indicates that the level of heteroplasmy (more than one mtDNA form within an individual) differs between species. A comparative study of the R2 repeats in 12 additional species representing three shrew subfamilies provides useful indications for the understanding of the origin and the evolution of these homologous tandemly repeated sequences. An asymmetry in the distribution of variants within the arrays, as well as the constant occurrence of shorter repeated sequences flanking only one side of the R2 arrays, could be related to asymmetry in the replication of each strand of the mtDNA molecule. The pattern of sequence and length variation within and between species, together with the capability of the arrays to form stable secondary structures, suggests that the dominant mechanism involved in the evolution of these arrays is unidirectional replication slippage.