974 results for structure prediction
Abstract:
Flexibility is an intrinsic property of proteins, which must, from the moment of their synthesis, pass from a linear-chain state to a folded, enzymatically active three-dimensional structure. Some proteins remain flexible once folded and undergo large-amplitude conformational changes during their enzymatic cycle. Others contain segments so flexible that their structure cannot be resolved by experimental methods. In this thesis, we present our application of in silico methods for analysing protein flexibility:
• Using steered molecular dynamics and umbrella sampling, we characterized the binding trajectories of the inhibitor Z-pro-prolinal to the protein prolyl oligopeptidase and identified the most probable trajectory. Our simulations also identified a probable mode of ligand recruitment that uses a flexible loop of 19 amino acids at the interface of the protein's two domains.
• Using conventional and steered molecular dynamics, we examined the stability of the protein SAV1866 in its closed form inserted in a lipid membrane and studied one of its possible opening modes, through the separation of its nucleotide-binding domains.
• We adapted the activation-relaxation method ART-nouveau, previously used to study protein folding and aggregation, to the problem of predicting the structure of long flexible loops. Applied to the folding of loops of 8 to 20 amino acids, the method shows a quadratic dependence of execution time on loop length, which makes the study of even longer loops feasible.
Abstract:
Recent experimental evidence indicates that RNAs change structure over time, sometimes very rapidly, and that these changes are necessary for their biochemical activities. The structure of these RNAs is therefore dynamic. The same evidence also indicates that the key structures involved are predicted by the secondary structure prediction software MC-Fold. By comparing MC-Fold structure predictions, we found a clear link between near-optimal structures (in terms of the stability predicted by the software) and the variations in biochemical activity that result from point changes in the sequence. We compared RNA sequences from the standpoint of their dynamic structures in order to investigate the similarity of their biological functions. This required a substantial acceleration of MC-Fold. The algorithmic approach is described in Chapter 1. In Chapter 2, we classify the impacts of slight sequence variations in microRNAs on their natural function. In Chapter 3, we identify windows in long RNAs whose dynamic structures possibly play roles in autism spectrum disorders and in the polarization of the eggs of certain amphibians (Xenopus spp.).
Abstract:
Proteins are at the heart of life. They are incredible molecular nanomachines, specialized and refined by millions of years of evolution for well-defined functions in the cell. The structure of proteins, that is, the three-dimensional arrangement of their atoms, is intimately linked to their functions. The apparent absence of structure in some proteins is also increasingly recognized as being just as crucial. Amyloid proteins are a striking example: they adopt an ensemble of varied structures that are difficult to observe experimentally and that are associated with neurodegenerative diseases. The first part of this thesis is a structural study of the amyloid proteins amyloid-beta (Alzheimer's) and huntingtin (Huntington's) during their folding and self-assembly. The results describe, at atomic resolution, the interactions within the structural ensembles of these two proteins. For amyloid-beta (AB), our results identify significant structural differences between three of its physiological forms during the first steps of self-assembly in an aqueous environment. We then compared these results with those obtained in recent years by other research groups using a variety of experimental and simulation protocols. Clear trends emerge from our comparison regarding the influence of the physiological form of AB on its structural ensemble during the first steps of self-assembly. Identifying the structural properties that differ rationalizes the origin of their distinct aggregation properties. Identifying the structural properties they share, in turn, offers potential targets for therapeutic agents that prevent the formation of the oligomers responsible for neurotoxicity.
For huntingtin, we elucidated the structural ensemble of its functional region, located at its N-terminus, in aqueous and membrane environments. In agreement with the available experimental data, our results on its folding in an aqueous environment reveal the dominant interactions and how the regions adjacent to the functional region influence them. We also characterized the stability and growth of nanotubular structures that are potential candidates for the self-assembly pathways of the amyloid region of huntingtin. In addition, together with a group of experimentalists, we developed a detailed model illustrating the main interactions responsible for the membrane-anchoring role of the N-terminal region, which serves to control the localization of huntingtin in the cell. The second part of this thesis concerns the refinement of a coarse-grained model (sOPEP) and the development of a new all-atom model (aaOPEP), both based on the coarse-grained force field OPEP, which is commonly used to study protein folding and the aggregation of amyloid proteins. These models were optimized with the aim of improving de novo predictions of peptide structure by the PEP-FOLD method. Furthermore, the OPEP, sOPEP and aaOPEP models were included in a new, highly flexible molecular dynamics code in order to greatly simplify their future development.
Characterization and Pathogenicity of Vibrio cholerae and Vibrio vulnificus from Marine environments
Abstract:
The genus Vibrio of the family Vibrionaceae comprises Gram-negative, oxidase-positive, rod- or curved-rod-shaped facultative anaerobes that are widespread in marine and estuarine environments. Vibrio species are opportunistic human pathogens responsible for diarrhoeal disease, gastroenteritis, septicaemia and wound infections, and are also pathogens of aquatic organisms, causing infections in crustaceans, bivalves and fishes. In the present study, marine environmental samples such as seafood, and water and sediment samples from aquafarms and mangroves, were screened for the presence of Vibrio species. Of the 134 isolates obtained from the various samples, 45 were assigned to the genus Vibrio on the basis of phenotypic characterization, including Gram staining, oxidase test, MoF test and salinity tolerance. Partial 16S rDNA sequence analysis was used for species-level identification of the isolates, and the strains were identified as V. cholerae (N=21), V. vulnificus (N=18), V. parahaemolyticus (N=3), V. alginolyticus (N=2) and V. azureus (N=1). The genetic relatedness and variation among the 45 Vibrio isolates were elucidated based on 16S rDNA sequences. Phenotypic characterization of the isolates was based on their response to 12 biochemical tests, namely the Voges-Proskauer (VP) test, arginine dihydrolase, tolerance to 3% NaCl, the ONPG test that detects β-galactosidase activity, and tests for utilization of citrate, ornithine, mannitol, arabinose, sucrose, glucose, salicin and cellobiose. The isolates exhibited diverse biochemical patterns, some specific to the species and others indicative of their environmental source. The antibiogram of the isolates was determined by testing their susceptibility to 12 antibiotics by the disc diffusion method. Varying degrees of resistance to gentamycin (2.22%), ampicillin (62.22%), nalidixic acid (4.44%), vancomycin (86.66%), cefixime (17.77%), rifampicin (20%), tetracycline (42.22%) and chloramphenicol (2.22%) were exhibited.
All the isolates were susceptible to streptomycin, co-trimoxazole, trimethoprim and azithromycin. Isolates from all three marine environments exhibited multiple antibiotic resistance, with high MAR index values. Molecular typing methods such as ERIC PCR and BOX PCR revealed intraspecies relatedness and genetic heterogeneity within the environmental isolates of V. cholerae and V. vulnificus. The 21 strains of V. cholerae were serogrouped as non-O1/non-O139 by screening for the presence of the O1 rfb and O139 rfb marker genes by PCR. The virulence and virulence-associated genes ctxA, ctxB, ace, VPI, hlyA, ompU, rtxA, toxR, zot, nagst, tcpA, nin and nan were screened in the V. cholerae and V. vulnificus strains. The V. vulnificus strains were also screened for three species-specific genes, viz. cps, vvh and viu. In the V. cholerae strains, the virulence-associated genes VPI, hlyA, rtxA, ompU and toxR were confirmed by PCR. All the isolates, except for strain BTOS6, harbored at least one or a combination of the tested genes, and V. cholerae strain BTPR5, isolated from prawn, hosted the highest number of virulence-associated genes. Among the V. vulnificus strains, only 3 virulence genes, VPI, toxR and cps, were confirmed out of the 16 tested, and only 7 of the isolates had these genes in one or more combinations. Strain BTPS6 from aquafarm and strain BTVE4 from mangrove samples yielded positive amplification for the three genes. The toxR gene from 9 strains of V. cholerae and 3 strains of V. vulnificus was cloned and sequenced for phylogenetic analysis based on nucleotide and amino acid sequences. Multiple sequence alignment of the nucleotide and amino acid sequences of the environmental strains of V. cholerae revealed that the toxR gene in the environmental strains is 100% homologous among the strains and to the V. cholerae toxR gene sequence available in the GenBank database. The 3 strains of V. vulnificus displayed high nucleotide and amino acid sequence similarity among themselves and to the sequences of V. cholerae and V. harveyi obtained from the GenBank database, but exhibited only 72% homology to the sequences of its close relative V. vulnificus. The structure of the ToxR protein of Vibrio cholerae strain BTMA5 was predicted with the PHYRE2 software. The deduced amino acid sequence showed maximum resemblance to the structure of the DNA-binding domain of response regulator 2 from Escherichia coli K-12. Template-based homology modelling in PHYRE2 successfully modelled the predicted protein and its secondary structure based on Protein Data Bank (PDB) template c3zq7A. The pathogenicity studies were performed using the nematode Caenorhabditis elegans as a model system. The pathogenicity of environmental V. cholerae was assessed using four food sources: E. coli strain OP50 in control plates; environmental V. cholerae strain BTOS6, negative for all tested virulence genes, to check the suitability of Vibrio sp. as a food source for the nematode; V. cholerae Co 366 El Tor, a clinical pathogenic strain; and V. cholerae strain BTPR5, isolated from seafood (prawn) and positive for the tested virulence genes VPI, hlyA, ompU, rtxA and toxR. It was found that V. cholerae strain BTOS6 could serve as a food source in place of E. coli strain OP50, but worms fed on the V. cholerae Co 366 El Tor strain and on environmental strain BTPR5 exhibited behavioral aberrations such as sluggish movement and lawn avoidance, and morphological abnormalities such as pharyngeal and intestinal distensions and bagging, indicating the pathogenicity of these strains to the nematode. The pathogenicity of the environmental strains of V. vulnificus was assessed with V. vulnificus strain BTPS6, which tested positive for 3 virulence genes, namely cps, toxR and VPI, and V. vulnificus strain BTMM7, which did not possess any of the tested virulence genes.
A reduction was observed in the life span of worms fed on the environmental V. vulnificus strain BTMM7 rather than on the ordinary laboratory food source, E. coli OP50. Behavioral abnormalities such as sluggish movement, lawn avoidance and bagging were also observed in the worms fed with strain BTPS6, but the pharynx and the intestine were intact. Multidrug-resistant environmental Vibrio strains, which constitute a major reservoir of diverse virulence genes, are to be dealt with caution, as they play a decisive role in pathogenicity and horizontal gene transfer in marine environments.
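The MAR index referred to in this abstract is conventionally the number of antibiotics an isolate resists divided by the number of antibiotics tested. The following is a minimal sketch under that assumption; the function name `mar_index` and the example resistance profile are hypothetical and are not data from the study, which reports only population-level resistance percentages.

```python
def mar_index(resistant, tested):
    """Return the MAR index: antibiotics resisted / antibiotics tested."""
    tested = set(tested)
    if not tested:
        raise ValueError("at least one antibiotic must be tested")
    # Only count resistance to antibiotics that were actually in the panel.
    resisted = set(resistant) & tested
    return len(resisted) / len(tested)


# The 12 antibiotics named in the abstract.
PANEL = [
    "gentamycin", "ampicillin", "nalidixic acid", "vancomycin",
    "cefixime", "rifampicin", "tetracycline", "chloramphenicol",
    "streptomycin", "co-trimoxazole", "trimethoprim", "azithromycin",
]

# Hypothetical isolate, for illustration only (not data from the study).
example_resistant = ["ampicillin", "vancomycin", "tetracycline"]
example_mar = mar_index(example_resistant, PANEL)
print(f"MAR index = {example_mar:.2f}")
```

An isolate resistant to 3 of the 12 antibiotics would have an index of 3/12 under this definition.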
Abstract:
The resurgence of the enteric pathogen Vibrio cholerae, the causative organism of epidemic cholera, remains a major health problem in many developing countries such as India. Cholera is endemic in the southern Indian state of Kerala. Outbreaks of cholera follow a seasonal pattern in regions of endemicity. Marine aquaculture settings and mangrove environments of Kerala serve as reservoirs for V. cholerae. The non-O1/non-O139 environmental isolates of V. cholerae with an incomplete ‘virulence cassette’ are to be dealt with caution, as they constitute a major reservoir of diverse virulence genes in the marine environment and play a crucial role in pathogenicity and horizontal gene transfer. The genes coding for cholera toxin are borne on, and can be infectiously transmitted by, CTXΦ, a filamentous lysogenic vibriophage. Temperate phages can provide crucial virulence and fitness factors affecting cell metabolism, bacterial adhesion, colonization, immunity, antibiotic resistance and serum resistance. The present study was an attempt to screen marine environments such as aquafarms and mangroves in the coastal areas of Alappuzha and Cochin, Kerala, for the presence of lysogenic V. cholerae, and to study their pathogenicity and gene transfer potential. Phenotypic and molecular methods were used to identify isolates as V. cholerae. The thirty-one isolates that were Gram-negative, oxidase-positive and fermentative, with or without gas production on MOF media, and that showed yellow colonies on TCBS (Thiosulfate Citrate Bile salt Sucrose) agar were segregated as vibrios. Twenty-two environmental V. cholerae strains of both O1 and non-O1/non-O139 serogroups showed the presence of lysogenic phages on induction with mitomycin C. They produced characteristic turbid plaques in the double agar overlay assay using the indicator strain V. cholerae El Tor MAK 757.
PCR-based molecular typing with primers targeting specific conserved sequences in the bacterial genome demonstrated genetic diversity among these lysogen-containing non-O1 V. cholerae. Polymerase chain reaction was also employed as a rapid screening method to verify the presence of 9 virulence genes, namely ctxA, ctxB, ace, hlyA, toxR, zot, tcpA, ninT and nanH, using gene-specific primers. The presence of the tcpA gene in ALPVC3 was alarming, as it indicates the possibility of an epidemic by accepting the cholera. Differential induction studies used ΦALPVC3, ΦALPVC11, ΦALPVC12 and ΦEKM14, underlining the possibility of prophage induction in natural ecosystems due to abiotic factors such as antibiotics, pollutants, temperature and UV. The efficiency of prophage induction varied considerably in response to the different induction agents. The growth curve of the lysogenic V. cholerae used in the study varied drastically in the presence of strong prophage inducers such as antibiotics and UV. Bacterial cell lysis was directly proportional to the increase in phage number due to induction. Morphological characterization of the vibriophages by transmission electron microscopy revealed hexagonal heads for all four phages. Vibriophage ΦALPVC3 exhibited isometric and contractile tails characteristic of the family Myoviridae, while phages ΦALPVC11 and ΦALPVC12 demonstrated the typical hexagonal head and non-contractile tail of the family Siphoviridae. ΦEKM14, the podophage, was distinguished by a short non-contractile tail and an icosahedral head. This work demonstrated that environmental parameters can influence the viability and cell adsorption rates of V. cholerae phages. Adsorption studies showed 100% adsorption of ΦALPVC3, ΦALPVC11, ΦALPVC12 and ΦEKM14 after 25, 30, 40 and 35 minutes, respectively. Exposure to high temperatures ranging from 50°C to 100°C drastically reduced phage viability.
The optimum concentration of NaCl required for survival of the vibriophages was 0.5 M, except for ΦEKM14, for which it was 1 M. Survival of phage particles was maximal at pH 7-8. V. cholerae is assumed to have existed long before its human host, so the pathogenic clones may have evolved from aquatic forms that later colonized the human intestine by progressive acquisition of genes. This is supported by the fact that the vast majority of V. cholerae strains are still part of the natural aquatic environment. CTXΦ has played a critical role in the evolution of the pathogenicity of V. cholerae, as it can transmit the ctxAB gene. The unusual transformation of V. cholerae strains associated with epidemics and the emergence of V. cholerae O139 demonstrate the evolutionary success of the organism in attaining greater fitness. Genetic changes in pathogenic V. cholerae constitute a natural process for developing immunity within an endemically infected population. The alternative hosts and lysogenic environmental V. cholerae strains may potentially act as cofactors in promoting cholera phage ‘‘blooms’’ within aquatic environments, thereby influencing transmission of phage-sensitive, pathogenic V. cholerae strains by aquatic vehicles. Differential induction of the phages is a clear indication of the impact of environmental pollution and global changes on phage induction. The development of molecular biology techniques has offered an accessible gateway for investigating the molecular events leading to genetic diversity in the marine environment. Using nucleic acids as targets, fingerprinting methods such as ERIC PCR and BOX PCR revealed that the marine environment harbours a potentially pathogenic group of bacteria with genetic diversity. The distribution of virulence-associated genes in the environmental isolates of V. cholerae provides tangible material for further investigation.
Nucleotide and protein sequence analysis, along with protein structure prediction, aids in better understanding the variation in alleles of the same gene in different ecological niches and its impact on protein structure for attaining greater fitness of pathogens. The evidence of the co-evolution of virulence genes in toxigenic V. cholerae O1 from different lineages of environmental non-O1 strains is alarming. Transduction studies would indicate that the acquisition of these virulence genes by lateral gene transfer, although rare, is not altogether uncommon amongst non-O1/non-O139 V. cholerae and that it has a key role in diversification. All these considerations justify the need for an integrated approach towards the development of an effective surveillance system to monitor the evolution of V. cholerae strains with epidemic potential. The results presented in this study, if considered together with the mechanism proposed above, would strongly suggest that the bacteriophage also intervenes as a variable in shaping the cholera bacterium, which cannot be ignored and which hints at imminent future epidemics.
Abstract:
Human butyrylcholinesterase (BChE; EC 3.1.1.8) is a polymorphic enzyme synthesized in the liver and in adipose tissue and widely distributed in the body. It hydrolyses certain choline esters such as procaine, aliphatic esters such as acetylsalicylic acid, medications such as methylprednisolone, mivacurium and succinylcholine, and drugs of use and/or abuse such as heroin and cocaine. It is encoded by the BCHE gene (OMIM 177400); more than 100 variants have been identified, some not fully studied, in addition to the most frequent form, called usual or wild type. Different polymorphisms of the BCHE gene have been linked to the synthesis of enzymes with varying levels of catalytic activity. The molecular bases of some of these genetic variants have been reported, including the Atypical (A), fluoride-resistant type 1 and 2 (F-1 and F-2), silent (S), Kalow (K), James (J) and Hammersmith (H) variants. In this study, the validated instrument Lifetime Severity Index for Cocaine Use Disorder (LSI-C) was applied to a group of patients to assess the severity of “cocaine” use over the lifetime. In addition, single nucleotide polymorphisms (SNPs) in the BCHE gene known to be responsible for adverse reactions in patients who use “cocaine” were determined by sequencing the gene, and the effect of the SNPs on the function and structure of the protein was predicted using bioinformatics tools. The LSI-C instrument gave results in four dimensions: lifetime use, recent use, psychological dependence and attempts to quit.
Molecular analysis revealed two non-synonymous coding SNPs (cSNPs) in 27.3% of the sample, c.293A>G (p.Asp98Gly) and c.1699G>A (p.Ala567Thr), located in exons 2 and 4, which correspond functionally to the Atypical (A) variant [dbSNP: rs1799807] and the Kalow (K) variant [dbSNP: rs1803274] of the BChE enzyme, respectively. In silico prediction studies assigned a pathogenic character to the SNP p.Asp98Gly, whereas the SNP p.Ala567Thr showed neutral behaviour. Analysis of the results makes it possible to propose a relationship between polymorphisms or genetic variants responsible for low catalytic activity and/or low plasma concentration of the BChE enzyme and some of the adverse reactions that occurred in patients who use cocaine.
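The correspondence between the coding-level and protein-level names of the two SNPs can be checked with simple codon arithmetic. The sketch below assumes that coding positions are numbered from the first base of the start codon, so that position n falls in codon (n - 1) // 3 + 1; the actual reference codons of BCHE are not given in the abstract, so the code only enumerates codons of the standard genetic code that would be consistent with each variant. The helper names `codon_of` and `consistent` are hypothetical.

```python
# Standard genetic code, restricted to the codons needed here.
CODON_TABLE = {
    "GAT": "Asp", "GAC": "Asp",
    "GGT": "Gly", "GGC": "Gly",
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "ACT": "Thr", "ACC": "Thr", "ACA": "Thr", "ACG": "Thr",
}


def codon_of(cdna_pos):
    """Map a 1-based coding position to (codon number, position in codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def consistent(cdna_pos, ref_base, alt_base, ref_aa, alt_aa):
    """List (reference codon, mutated codon) pairs compatible with a variant.

    A reference codon qualifies if it encodes ref_aa, carries ref_base at the
    affected position, and encodes alt_aa once alt_base is substituted.
    """
    _, offset = codon_of(cdna_pos)
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa or codon[offset - 1] != ref_base:
            continue
        mutated = codon[:offset - 1] + alt_base + codon[offset:]
        if CODON_TABLE.get(mutated) == alt_aa:
            hits.append((codon, mutated))
    return hits


# c.293A>G (p.Asp98Gly) and c.1699G>A (p.Ala567Thr), as named in the abstract.
print(codon_of(293), consistent(293, "A", "G", "Asp", "Gly"))
print(codon_of(1699), consistent(1699, "G", "A", "Ala", "Thr"))
```

Under this numbering, position 293 is the second base of codon 98 and position 1699 is the first base of codon 567, which agrees with the residue numbers in the protein-level names.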
Abstract:
Most newly sequenced proteins are likely to adopt a similar structure to one which has already been experimentally determined. For this reason, the most successful approaches to protein structure prediction have been template-based methods. Such prediction methods attempt to identify and model the folds of unknown structures by aligning the target sequences to a set of representative template structures within a fold library. In this chapter, I discuss the development of template-based approaches to fold prediction, from the traditional techniques to the recent state-of-the-art methods. I also discuss the recent development of structural annotation databases, which contain models built by aligning the sequences from entire proteomes against known structures. Finally, I run through a practical step-by-step guide for aligning target sequences to known structures and contemplate the future direction of template-based structure prediction.
Abstract:
Statistical approaches have been applied to examine amino acid pairing preferences within parallel beta-sheets. The main chain hydrogen bonding pattern in parallel beta-sheets means that, for each residue pair, only one of the residues is involved in main chain hydrogen bonding with the strand containing the partner residue. We call this the hydrogen bonded (HB) residue and the partner residue the non-hydrogen bonded (nHB) residue, and differentiate between the favourability of a pair and that of its reverse pair, e.g. Asn(HB)-Thr(nHB) versus Thr(HB)-Asn(nHB). Significantly (p <= 0.000001) favoured pairings were rationalised using stereochemical arguments. For instance, Asn(HB)-Thr(nHB) and Arg(HB)-Thr(nHB) were favoured pairs, where the residues adopted favoured chi1 rotamer positions that allowed side-chain interactions to occur. In contrast, Thr(HB)-Asn(nHB) and Thr(HB)-Arg(nHB) were not significantly favoured, and could only form side-chain interactions if the residues involved adopted less favourable chi1 conformations. The favourability of hydrophobic pairs e.g. Ile(HB)-Ile(nHB), Val(HB)-Val(nHB) and Leu(HB)-Ile(nHB) was explained by the residues adopting their most preferred chi1 and chi2 conformations, which enabled them to form nested arrangements. Cysteine-cysteine pairs are significantly favoured, although these do not form intrasheet disulphide bridges. Interactions between positively and negatively charged residues were asymmetrically preferred: those with the negatively charged residue at the HB position were more favoured. This trend was accounted for by the presence of general electrostatic interactions, which, based on analysis of distances between charged atoms, were likely to be stronger when the negatively charged residue is the HB partner. The Arg(HB)-Asp(nHB) interaction was an exception to this trend and its favourability was rationalised by the formation of specific side-chain interactions.
This research provides rules that could be applied to protein structure prediction, comparative modelling and protein engineering and design. The methods used to analyse the pairing preferences are automated and detailed results are available (http://www.rubic.rdg.ac.uk/betapairprefsparallel/).
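The abstract does not state which statistic was used to score pairing preferences. One common way to express an ordered preference is the ratio of observed to expected counts, where the expected count assumes that the HB and nHB residues are chosen independently. The sketch below illustrates that idea only; the function `pair_propensities` and the toy counts are hypothetical, and no significance test is performed.

```python
from collections import Counter


def pair_propensities(pairs):
    """Return observed/expected ratios for ordered (HB, nHB) residue pairs.

    Expected counts assume the HB and nHB residues are independent, so a
    ratio above 1 means the ordered pair occurs more often than expected.
    """
    pairs = list(pairs)
    total = len(pairs)
    if total == 0:
        return {}
    pair_counts = Counter(pairs)
    hb_counts = Counter(hb for hb, _ in pairs)
    nhb_counts = Counter(nhb for _, nhb in pairs)
    ratios = {}
    for (hb, nhb), observed in pair_counts.items():
        expected = hb_counts[hb] * nhb_counts[nhb] / total
        ratios[(hb, nhb)] = observed / expected
    return ratios


# Hypothetical toy counts, not taken from the study.
toy_pairs = (
    [("Asn", "Thr")] * 4 + [("Thr", "Asn")] * 1 + [("Asn", "Ile")] * 1
    + [("Ile", "Thr")] * 1 + [("Thr", "Ile")] * 3 + [("Ile", "Asn")] * 3
    + [("Ile", "Ile")] * 3
)
toy_ratios = pair_propensities(toy_pairs)
for pair in [("Asn", "Thr"), ("Thr", "Asn")]:
    print(pair, round(toy_ratios[pair], 2))
```

Because the pairs are ordered, a pair and its reverse are counted separately, which is what allows an asymmetry such as Asn(HB)-Thr(nHB) versus Thr(HB)-Asn(nHB) to show up.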
Abstract:
Statistical approaches have been applied to examine amino acid pairing preferences within parallel beta-sheets. The main chain hydrogen bonding pattern in parallel beta-sheets means that, for each residue pair, only one of the residues is involved in main chain hydrogen bonding with the strand containing the partner residue. We call this the hydrogen bonded (HB) residue and the partner residue the non-hydrogen bonded (nHB) residue, and differentiate between the favourability of a pair and that of its reverse pair, e.g. Asn(HB)-Thr(nHB) versus Thr(HB)-Asn(nHB). Significantly (p <= 0.000001) favoured pairings were rationalised using stereochemical arguments. For instance, Asn(HB)-Thr(nHB) and Arg(HB)-Thr(nHB) were favoured pairs, where the residues adopted favoured chi(1) rotamer positions that allowed side-chain interactions to occur. In contrast, Thr(HB)-Asn(nHB) and Thr(HB)-Arg(nHB) were not significantly favoured, and could only form side-chain interactions if the residues involved adopted less favourable chi(1) conformations. The favourability of hydrophobic pairs e.g. Ile(HB)-Ile(nHB), Val(HB)-Val(nHB) and Leu(HB)-Ile(nHB) was explained by the residues adopting their most preferred chi(1) and chi(2) conformations, which enabled them to form nested arrangements. Cysteine-cysteine pairs are significantly favoured, although these do not form intrasheet disulphide bridges. Interactions between positively and negatively charged residues were asymmetrically preferred: those with the negatively charged residue at the HB position were more favoured. This trend was accounted for by the presence of general electrostatic interactions, which, based on analysis of distances between charged atoms, were likely to be stronger when the negatively charged residue is the HB partner. The Arg(HB)-Asp(nHB) interaction was an exception to this trend and its favourability was rationalised by the formation of specific side-chain interactions. 
This research provides rules that could be applied to protein structure prediction, comparative modelling and protein engineering and design. The methods used to analyse the pairing preferences are automated and detailed results are available (http://www.rubic.rdg.ac.uk/betapairprefsparallel/). (c) 2005 Elsevier Ltd. All rights reserved.
Abstract:
The accurate prediction of the biochemical function of a protein is becoming increasingly important, given the unprecedented growth of both structural and sequence databanks. Consequently, computational methods are required to analyse such data in an automated manner to ensure genomes are annotated accurately. Protein structure prediction methods, for example, are capable of generating approximate structural models on a genome-wide scale. However, the detection of functionally important regions in such crude models, as well as structural genomics targets, remains an extremely important problem. The method described in the current study, MetSite, represents a fully automatic approach for the detection of metal-binding residue clusters applicable to protein models of moderate quality. The method involves using sequence profile information in combination with approximate structural data. Several neural network classifiers are shown to be able to distinguish metal sites from non-sites with a mean accuracy of 94.5%. The method was demonstrated to identify metal-binding sites correctly in LiveBench targets where no obvious metal-binding sequence motifs were detectable using InterPro. Accurate detection of metal sites was shown to be feasible for low-resolution predicted structures generated using mGenTHREADER where no side-chain information was available. High-scoring predictions were observed for a recently solved hypothetical protein from Haemophilus influenzae, indicating a putative metal-binding site.
Abstract:
The results of applying a fragment-based protein tertiary structure prediction method to the prediction of 14 CASP5 target domains are described. The method is based on the assembly of supersecondary structural fragments taken from highly resolved protein structures using a simulated annealing algorithm. A number of good predictions for proteins with novel folds were produced, although not always as the first model. For two fold recognition targets, FRAGFOLD produced the most accurate model in both cases, despite the fact that the predictions were not based on a template structure. Although clear progress has been made in improving FRAGFOLD since CASP4, the ranking of final models still seems to be the main problem that needs to be addressed before the next CASP experiment.
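The abstract names simulated annealing as the search strategy but gives no details of the move set, energy function or cooling schedule. The sketch below shows only the generic technique on a toy problem: it is not FRAGFOLD's implementation, and the energy function, the proposal move, the geometric cooling schedule and all names are assumptions made for illustration.

```python
import math
import random


def simulated_annealing(energy, initial, propose, steps=5000,
                        t_start=2.0, t_end=0.01, seed=0):
    """Minimise energy(state) with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    state, e = initial, energy(initial)
    best, best_e = state, e
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = propose(state, rng)
        e_new = energy(candidate)
        # Metropolis criterion: always accept downhill, sometimes uphill.
        if e_new <= e or rng.random() < math.exp((e - e_new) / t):
            state, e = candidate, e_new
            if e < best_e:
                best, best_e = state, e
    return best, best_e


# Toy problem (hypothetical): pick one "fragment" index per position so that
# the choices match a hidden target; the energy counts mismatches.
TARGET = (2, 0, 3, 1, 2, 2, 0, 1)
N_FRAGMENTS = 4


def toy_energy(state):
    return sum(a != b for a, b in zip(state, TARGET))


def toy_propose(state, rng):
    # Replace the fragment at one randomly chosen position.
    i = rng.randrange(len(state))
    new = list(state)
    new[i] = rng.randrange(N_FRAGMENTS)
    return tuple(new)


start = (0,) * len(TARGET)
best_state, best_energy = simulated_annealing(toy_energy, start, toy_propose)
print(best_state, best_energy)
```

In a real fragment-assembly setting the state would be a chain conformation and the move would swap in a fragment's backbone geometry; the acceptance rule is the part that carries over.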
Abstract:
Motivation: Modelling the 3D structures of proteins can often be enhanced if more than one fold template is used during the modelling process. However, in many cases, this may also result in poorer model quality for a given target or alignment method. There is a need for modelling protocols that can both consistently and significantly improve 3D models and provide an indication of when models might not benefit from the use of multiple target-template alignments. Here, we investigate the use of both global and local model quality prediction scores produced by ModFOLDclust2, to improve the selection of target-template alignments for the construction of multiple-template models. Additionally, we evaluate clustering the resulting population of multi- and single-template models for the improvement of our IntFOLD-TS tertiary structure prediction method. Results: We find that using accurate local model quality scores to guide alignment selection is the most consistent way to significantly improve models for each of the sequence to structure alignment methods tested. In addition, using accurate global model quality for re-ranking alignments, prior to selection, further improves the majority of multi-template modelling methods tested. Furthermore, subsequent clustering of the resulting population of multiple-template models significantly improves the quality of selected models compared with the previous version of our tertiary structure prediction method, IntFOLD-TS.
Abstract:
The estimation of prediction quality is important because without quality measures, it is difficult to determine the usefulness of a prediction. Currently, methods for ligand binding site residue predictions are assessed in the function prediction category of the biennial Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiment, utilizing the Matthews Correlation Coefficient (MCC) and Binding-site Distance Test (BDT) metrics. However, the assessment of ligand binding site predictions using such metrics requires the availability of solved structures with bound ligands. Thus, we have developed a ligand binding site quality assessment tool, FunFOLDQA, which utilizes protein feature analysis to predict ligand binding site quality prior to the experimental solution of the protein structures and their ligand interactions. The FunFOLDQA feature scores were combined using: simple linear combinations, multiple linear regression and a neural network. The neural network produced significantly better results for correlations to both the MCC and BDT scores, according to Kendall’s τ, Spearman’s ρ and Pearson’s r correlation coefficients, when tested on both the CASP8 and CASP9 datasets. The neural network also produced the largest Area Under the Curve score (AUC) when Receiver Operator Characteristic (ROC) analysis was undertaken for the CASP8 dataset. Furthermore, the FunFOLDQA algorithm incorporating the neural network, is shown to add value to FunFOLD, when both methods are employed in combination. This results in a statistically significant improvement over all of the best server methods, the FunFOLD method (6.43%), and one of the top manual groups (FN293) tested on the CASP8 dataset. The FunFOLDQA method was also found to be competitive with the top server methods when tested on the CASP9 dataset. 
To the best of our knowledge, FunFOLDQA is the first attempt to develop a method that can be used to assess ligand binding site prediction quality, in the absence of experimental data.
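Of the two assessment metrics named in this abstract, the Matthews Correlation Coefficient has a standard closed form based on the counts of true and false positives and negatives. The sketch below applies it to binding-site residue predictions; the function name `binding_site_mcc`, the residue numbering and the example sets are hypothetical, and the BDT score is not covered because its definition is not given here.

```python
import math


def binding_site_mcc(predicted, observed, all_residues):
    """Matthews Correlation Coefficient for a set of predicted site residues."""
    predicted, observed = set(predicted), set(observed)
    tp = fp = fn = tn = 0
    for residue in all_residues:
        in_pred, in_obs = residue in predicted, residue in observed
        if in_pred and in_obs:
            tp += 1
        elif in_pred:
            fp += 1
        elif in_obs:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical 20-residue protein with four observed binding residues.
residues = range(1, 21)
observed_site = {3, 4, 7, 12}
predicted_site = {3, 4, 8, 12}
example_mcc = binding_site_mcc(predicted_site, observed_site, residues)
print(example_mcc)
```

As the abstract notes, a score of this kind can only be computed once the observed binding residues are known, which is the gap a quality-prediction method is meant to fill.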
Abstract:
Protein structure prediction methods aim to predict the structures of proteins from their amino acid sequences, utilizing various computational algorithms. Structural genome annotation is the process of attaching biological information to every protein encoded within a genome via the production of three-dimensional protein models.
Abstract:
Model quality assessment programs (MQAPs) aim to assess the quality of modelled 3D protein structures. The provision of quality scores describing both global and local (per-residue) accuracy is extremely important, as without quality scores we are unable to determine the usefulness of a 3D model for further computational and experimental wet lab studies. Here, we briefly discuss protein tertiary structure prediction, along with the biennial Critical Assessment of Techniques for Protein Structure Prediction (CASP) competition and its key role in driving the field of protein model quality assessment. We also briefly discuss the top MQAPs from the previous CASP competitions. Additionally, we describe our downloadable and webserver-based model quality assessment methods: ModFOLD3, ModFOLDclust, ModFOLDclustQ, ModFOLDclust2, and IntFOLD-QA. We provide a practical step-by-step guide on using our downloadable and webserver-based tools and include examples of their application for improving tertiary structure prediction, ligand binding site residue prediction, and oligomer predictions.