949 results for Protein structure prediction
Abstract:
For determining functional dependencies between two proteins, both represented as 3D structures, it is an essential condition that they have one or more matching structural regions called patches. As 3D structures for proteins are large, complex and constantly evolving, it is computationally expensive and very time-consuming to identify possible locations and sizes of patches for a given protein against a large protein database. In this paper, we consider a vector space based representation for protein structures, where a patch is formed by the vectors within the region. Based on our previous work, a compact representation of the patch, named the patch signature, is applied here. A similarity measure of two patches is then derived based on their signatures. To achieve fast patch matching in large protein databases, a match-and-expand strategy is proposed. Given a query patch, a set of small k-sized matching patches, called candidate patches, is generated in the match stage. The candidate patches are further filtered by enlarging k in the expand stage. Our extensive experimental results demonstrate encouraging performance with respect to this biologically critical but previously computationally prohibitive problem.
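The match-and-expand strategy described in this abstract can be illustrated with a rough sketch. The abstract does not define the patch signature or the similarity measure, so everything below is an assumption made for illustration: a patch is taken to be the k points nearest to a chosen center, the stand-in signature is the sorted list of distances to the patch centroid, and two signatures match when they agree within a tolerance `tol`. The names `signature`, `similar`, `patch` and `match_and_expand` are hypothetical and are not taken from the paper.

```python
import math

def signature(points):
    """Stand-in signature (assumption): sorted distances to the patch centroid."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return sorted(math.dist(p, centroid) for p in points)

def similar(sig_a, sig_b, tol):
    """Stand-in similarity test (assumption): entrywise agreement within tol."""
    return len(sig_a) == len(sig_b) and all(
        abs(a - b) <= tol for a, b in zip(sig_a, sig_b)
    )

def patch(points, center, k):
    """The k points of a structure nearest to points[center], itself included."""
    return sorted(points, key=lambda p: math.dist(p, points[center]))[:k]

def match_and_expand(query, database, k=3, step=1, tol=0.1):
    """query: list of 3D points, with query[0] used as the patch center.
    database: dict mapping a structure name to its list of 3D points.
    Returns (name, center_index) pairs that survive every expansion."""
    # Match stage: cheap comparison on small k-sized patches.
    q_sig = signature(patch(query, 0, k))
    candidates = [
        (name, i)
        for name, pts in database.items()
        if len(pts) >= k
        for i in range(len(pts))
        if similar(signature(patch(pts, i, k)), q_sig, tol)
    ]
    # Expand stage: enlarge k and drop candidates that stop matching.
    while k < len(query) and candidates:
        k = min(k + step, len(query))
        q_sig = signature(patch(query, 0, k))
        candidates = [
            (name, i)
            for name, i in candidates
            if len(database[name]) >= k
            and similar(signature(patch(database[name], i, k)), q_sig, tol)
        ]
    return candidates

# Toy data: "hit" contains a translated copy of the query, "miss" does not.
query = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
database = {
    "hit": [(10, 0, 0), (11, 0, 0), (10, 1, 0), (10, 0, 1), (50, 50, 50)],
    "miss": [(0, 0, 0), (2, 0, 0), (4, 0, 0), (6, 0, 0)],
}
print(match_and_expand(query, database))
```

The point of the two stages is that most database patches are rejected while k is still small, so the more expensive comparisons on larger patches run only on the surviving candidates.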
Abstract:
After decades of slow progress, the pace of research on membrane protein structures is beginning to quicken thanks to various improvements in technology, including protein engineering and microfocus X-ray diffraction. Here we review these developments and, where possible, highlight generic new approaches to solving membrane protein structures based on recent technological advances. Rational approaches to overcoming the bottlenecks in the field are urgently required as membrane proteins, which typically comprise ~30% of the proteomes of organisms, are dramatically under-represented in the structural database of the Protein Data Bank.
Abstract:
Computing the similarity between two protein structures is a crucial task in molecular biology and has been extensively investigated. Many protein structure comparison methods can be modeled as maximum weighted clique problems in specific k-partite graphs, referred to here as alignment graphs. In this paper we present both a new integer programming formulation for solving such clique problems and a dedicated branch and bound algorithm for solving the maximum cardinality clique problem. Both approaches have been integrated in VAST, a software tool for aligning protein 3D structures that is widely used at the National Center for Biotechnology Information and whose original clique solver uses the well-known Bron and Kerbosch algorithm (BK). Our computational results on real protein alignment instances show that our branch and bound algorithm is up to 116 times faster than BK.
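For readers unfamiliar with the baseline named in this abstract, the following is a minimal sketch of the Bron and Kerbosch algorithm with pivoting, used here to find a maximum cardinality clique by enumerating all maximal cliques. It is not the VAST solver, nor the branch and bound algorithm of the paper. The graph is a toy example, and the function names `bron_kerbosch` and `maximum_clique` are chosen for illustration only.

```python
def bron_kerbosch(adj):
    """Yield every maximal clique of an undirected graph.

    adj maps each vertex to the set of its neighbours.
    """
    def expand(r, p, x):
        # r: current clique, p: vertices that can extend it,
        # x: vertices already explored (used to avoid duplicates).
        if not p and not x:
            yield set(r)
            return
        # Pivot on the vertex with the most neighbours in p, so that
        # only vertices outside its neighbourhood need to be branched on.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())

def maximum_clique(adj):
    """Largest clique found among all maximal cliques (empty set if no vertices)."""
    return max(bron_kerbosch(adj), key=len, default=set())

# Toy graph: a triangle 1-2-3 with a tail 3-4-5.
graph = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4},
    4: {3, 5},
    5: {4},
}
print(maximum_clique(graph))  # {1, 2, 3}
```

Enumerating every maximal clique can take exponential time in the worst case, which is one reason a dedicated branch and bound solver for the maximum cardinality problem can be much faster in practice.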
Molecular protein function prediction using sequence similarity-based and similarity-free approaches
Abstract:
Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.
Abstract:
Background: Protein structural alignment is one of the most fundamental and crucial areas of research in the domain of computational structural biology. Comparison of a protein structure with known structures helps to classify it as new or as belonging to a known group of proteins. This, in turn, is useful for determining the function of the protein and its evolutionary relationship with other protein molecules, and for grasping the principles underlying protein architecture and folding. Results: A large number of protein structure alignment methods are available. Each protein structure alignment tool has its own strengths and weaknesses that need to be highlighted. We compared and presented results of the six most popular and publicly available servers for protein structure comparison. These web-based servers were compared with respect to functionality (features provided by these servers) and accuracy (how well the structural comparison is performed). CATH was used as a reference. The results showed that overall CE was the top performer. DALI and PhyreStorm showed similar results, whereas PDBeFold showed the lowest performance. In the case of few secondary structural elements, CE, DALI and PhyreStorm gave a 100% success rate. Conclusion: Overall, none of the structural alignment servers showed a 100% success rate. Studies of overall performance, the effect of mainly alpha and the effect of mainly beta showed consistent performance. CE, DALI, FatCat and PhyreStorm showed a success rate of more than 90%.
Abstract:
Modeling methods to derive 3D-structure of proteins have been recently developed. Protein homology-modeling, also known as comparative protein modeling, is nowadays the most accurate protein modeling method. This technique can produce useful models for about an order of magnitude more protein sequences than there have been structures determined by experiment in the same amount of time. All current protein homology-modeling methods consist of four sequential steps: fold assignment and template selection, template-target alignment, model building, and model evaluation. In this paper we discuss in some detail the protein-homology paradigm, its predictive power and its limitations.
Abstract:
The resurgence of the enteric pathogen Vibrio cholerae, the causative organism of epidemic cholera, remains a major health problem in many developing countries like India. Cholera is endemic in the southern Indian state of Kerala. The outbreaks of cholera follow a seasonal pattern in regions of endemicity. Marine aquaculture settings and mangrove environments of Kerala serve as reservoirs for V. cholerae. The non-O1/non-O139 environmental isolates of V. cholerae with an incomplete ‘virulence cassette’ must be treated with caution, as they constitute a major reservoir of diverse virulence genes in the marine environment and play a crucial role in pathogenicity and horizontal gene transfer. The genes coding for cholera toxin are borne on, and can be infectiously transmitted by, CTXΦ, a filamentous lysogenic vibriophage. Temperate phages can provide crucial virulence and fitness factors affecting cell metabolism, bacterial adhesion, colonization, immunity, antibiotic resistance and serum resistance. The present study was an attempt to screen marine environments such as aquafarms and mangroves in the coastal areas of Alappuzha and Cochin, Kerala, for the presence of lysogenic V. cholerae, and to study their pathogenicity and gene transfer potential. Phenotypic and molecular methods were used for identification of isolates as V. cholerae. The thirty-one isolates which were Gram negative, oxidase positive, fermentative, with or without gas production on MOF media, and which showed yellow coloured colonies on TCBS (Thiosulfate Citrate Bile salt Sucrose) agar were segregated as vibrios. Twenty-two environmental V. cholerae strains of both O1 and non-O1/non-O139 serogroups showed the presence of lysogenic phages on induction with mitomycin C. They produced characteristic turbid plaques in a double agar overlay assay using the indicator strain V. cholerae El Tor MAK 757.
PCR based molecular typing with primers targeting specific conserved sequences in the bacterial genome demonstrated genetic diversity among these lysogen containing non-O1 V. cholerae. Polymerase chain reaction was also employed as a rapid screening method to verify the presence of 9 virulence genes, namely ctxA, ctxB, ace, hlyA, toxR, zot, tcpA, ninT and nanH, using gene specific primers. The presence of tcpA gene in ALPVC3 was alarming, as it indicates the possibility of an epidemic by accepting the cholera. Differential induction studies used ΦALPVC3, ΦALPVC11, ΦALPVC12 and ΦEKM14, underlining the possibility of prophage induction in natural ecosystems due to abiotic factors like antibiotics, pollutants, temperature and UV. The efficiency of induction of prophages varied considerably in response to the different induction agents. The growth curve of the lysogenic V. cholerae used in the study varied drastically in the presence of strong prophage inducers like antibiotics and UV. Bacterial cell lysis was directly proportional to the increase in phage number due to induction. Morphological characterization of vibriophages by Transmission Electron Microscopy revealed hexagonal heads for all four phages. Vibriophage ΦALPVC3 exhibited isometric and contractile tails characteristic of family Myoviridae, while phages ΦALPVC11 and ΦALPVC12 demonstrated the typical hexagonal head and non-contractile tail of family Siphoviridae. ΦEKM14, the podophage, was distinguished by a short non-contractile tail and an icosahedral head. This work demonstrated that environmental parameters can influence the viability and cell adsorption rates of V. cholerae phages. Adsorption studies showed 100% adsorption of ΦALPVC3, ΦALPVC11, ΦALPVC12 and ΦEKM14 after 25, 30, 40 and 35 minutes respectively. Exposure to high temperatures ranging from 50ºC to 100ºC drastically reduced phage viability.
The optimum concentration of NaCl required for survival of vibriophages except ΦEKM14 was 0.5 M and that for ΦEKM14 was 1 M NaCl. Survival of phage particles was maximum at pH 7-8. V. cholerae is assumed to have existed long before its human host, and so the pathogenic clones may have evolved from aquatic forms which later colonized the human intestine by progressive acquisition of genes. This is supported by the fact that the vast majority of V. cholerae strains are still part of the natural aquatic environment. CTXΦ has played a critical role in the evolution of the pathogenicity of V. cholerae as it can transmit the ctxAB gene. The unusual transformation of V. cholerae strains associated with epidemics and the emergence of V. cholerae O139 demonstrate the evolutionary success of the organism in attaining greater fitness. Genetic changes in pathogenic V. cholerae constitute a natural process for developing immunity within an endemically infected population. The alternative hosts and lysogenic environmental V. cholerae strains may potentially act as cofactors in promoting cholera phage ‘‘blooms’’ within aquatic environments, thereby influencing transmission of phage sensitive, pathogenic V. cholerae strains by aquatic vehicles. Differential induction of the phages is a clear indication of the impact of environmental pollution and global changes on phage induction. The development of molecular biology techniques offered an accessible gateway for investigating the molecular events leading to genetic diversity in the marine environment. Using nucleic acids as targets, fingerprinting methods like ERIC PCR and BOX PCR revealed that the marine environment harbours a potentially pathogenic group of bacteria with genetic diversity. The distribution of virulence associated genes in the environmental isolates of V. cholerae provides tangible material for further investigation.
Nucleotide and protein sequence analysis, along with protein structure prediction, aids in better understanding of the variation in alleles of the same gene in different ecological niches and its impact on the protein structure for attaining greater fitness of pathogens. The evidence of the co-evolution of virulence genes in toxigenic V. cholerae O1 from different lineages of environmental non-O1 strains is alarming. Transduction studies would indicate that the phenomenon of acquisition of these virulence genes by lateral gene transfer, although rare, is not quite uncommon amongst non-O1/non-O139 V. cholerae and that it has a key role in diversification. All these considerations justify the need for an integrated approach towards the development of an effective surveillance system to monitor the evolution of V. cholerae strains with epidemic potential. Results presented in this study, if considered together with the mechanism proposed above, would strongly suggest that the bacteriophage also intervenes as a variable in shaping the cholera bacterium, which cannot be ignored and which hints at imminent future epidemics.
Abstract:
Human butyrylcholinesterase (BChE; EC 3.1.1.8) is a polymorphic enzyme synthesized in the liver and in adipose tissue, widely distributed in the body and responsible for hydrolyzing some choline esters such as procaine, aliphatic esters such as acetylsalicylic acid, drugs such as methylprednisolone, mivacurium and succinylcholine, and drugs of use and/or abuse such as heroin and cocaine. It is encoded by the BCHE gene (OMIM 177400); more than 100 variants have been identified, some not fully studied, in addition to the most frequent form, called usual or wild-type. Different polymorphisms of the BCHE gene have been linked to the synthesis of enzymes with varying levels of catalytic activity. The molecular bases of some of these genetic variants have been reported, among them the Atypical (A), fluoride-resistant type 1 and 2 (F-1 and F-2), silent (S), Kalow (K), James (J) and Hammersmith (H) variants. In this study, the validated instrument Lifetime Severity Index for Cocaine Use Disorder (LSI-C) was applied to a group of patients to assess the severity of “cocaine” use over the lifetime. In addition, Single Nucleotide Polymorphisms (SNPs) in the BCHE gene known to be responsible for adverse reactions in patients who use “cocaine” were determined by gene sequencing, and the effect of the SNPs on the function and structure of the protein was predicted using bioinformatics tools. The LSI-C instrument yielded results in four dimensions: lifetime use, recent use, psychological dependence and attempts to quit.
Molecular analysis revealed two non-synonymous coding SNPs (cSNPs) in 27.3% of the sample, c.293A>G (p.Asp98Gly) and c.1699G>A (p.Ala567Thr), located in exons 2 and 4, which correspond, from a functional point of view, to the Atypical (A) variant [dbSNP: rs1799807] and the Kalow (K) variant [dbSNP: rs1803274] of the BChE enzyme, respectively. In silico prediction studies assigned a pathogenic character to the SNP p.Asp98Gly, whereas the SNP p.Ala567Thr showed neutral behaviour. Analysis of the results makes it possible to propose a relationship between polymorphisms or genetic variants responsible for low catalytic activity and/or low plasma concentration of the BChE enzyme and some of the adverse reactions that occurred in patients who use cocaine.
Abstract:
Most newly sequenced proteins are likely to adopt a similar structure to one which has already been experimentally determined. For this reason, the most successful approaches to protein structure prediction have been template-based methods. Such prediction methods attempt to identify and model the folds of unknown structures by aligning the target sequences to a set of representative template structures within a fold library. In this chapter, I discuss the development of template-based approaches to fold prediction, from the traditional techniques to the recent state-of-the-art methods. I also discuss the recent development of structural annotation databases, which contain models built by aligning the sequences from entire proteomes against known structures. Finally, I run through a practical step-by-step guide for aligning target sequences to known structures and contemplate the future direction of template-based structure prediction.
Abstract:
Statistical approaches have been applied to examine amino acid pairing preferences within parallel beta-sheets. The main chain hydrogen bonding pattern in parallel beta-sheets means that, for each residue pair, only one of the residues is involved in main chain hydrogen bonding with the strand containing the partner residue. We call this the hydrogen bonded (HB) residue and the partner residue the non-hydrogen bonded (nHB) residue, and differentiate between the favourability of a pair and that of its reverse pair, e.g. Asn(HB)-Thr(nHB) versus Thr(HB)-Asn(nHB). Significantly (p <= 0.000001) favoured pairings were rationalised using stereochemical arguments. For instance, Asn(HB)-Thr(nHB) and Arg(HB)-Thr(nHB) were favoured pairs, where the residues adopted favoured chi(1) rotamer positions that allowed side-chain interactions to occur. In contrast, Thr(HB)-Asn(nHB) and Thr(HB)-Arg(nHB) were not significantly favoured, and could only form side-chain interactions if the residues involved adopted less favourable chi(1) conformations. The favourability of hydrophobic pairs e.g. Ile(HB)-Ile(nHB), Val(HB)-Val(nHB) and Leu(HB)-Ile(nHB) was explained by the residues adopting their most preferred chi(1) and chi(2) conformations, which enabled them to form nested arrangements. Cysteine-cysteine pairs are significantly favoured, although these do not form intrasheet disulphide bridges. Interactions between positively and negatively charged residues were asymmetrically preferred: those with the negatively charged residue at the HB position were more favoured. This trend was accounted for by the presence of general electrostatic interactions, which, based on analysis of distances between charged atoms, were likely to be stronger when the negatively charged residue is the HB partner. The Arg(HB)-Asp(nHB) interaction was an exception to this trend and its favourability was rationalised by the formation of specific side-chain interactions. 
This research provides rules that could be applied to protein structure prediction, comparative modelling and protein engineering and design. The methods used to analyse the pairing preferences are automated and detailed results are available (http://www.rubic.rdg.ac.uk/betapairprefsparallel/). (c) 2005 Elsevier Ltd. All rights reserved.
Abstract:
The accurate prediction of the biochemical function of a protein is becoming increasingly important, given the unprecedented growth of both structural and sequence databanks. Consequently, computational methods are required to analyse such data in an automated manner to ensure genomes are annotated accurately. Protein structure prediction methods, for example, are capable of generating approximate structural models on a genome-wide scale. However, the detection of functionally important regions in such crude models, as well as structural genomics targets, remains an extremely important problem. The method described in the current study, MetSite, represents a fully automatic approach for the detection of metal-binding residue clusters applicable to protein models of moderate quality. The method involves using sequence profile information in combination with approximate structural data. Several neural network classifiers are shown to be able to distinguish metal sites from non-sites with a mean accuracy of 94.5%. The method was demonstrated to identify metal-binding sites correctly in LiveBench targets where no obvious metal-binding sequence motifs were detectable using InterPro. Accurate detection of metal sites was shown to be feasible for low-resolution predicted structures generated using mGenTHREADER where no side-chain information was available. High-scoring predictions were observed for a recently solved hypothetical protein from Haemophilus influenzae, indicating a putative metal-binding site.