949 results for Protein structure prediction
Abstract:
Elucidating the biological and biochemical roles of proteins, and subsequently determining their interacting partners, can be difficult and time consuming using in vitro and/or in vivo methods, and consequently the majority of newly sequenced proteins will have unknown structures and functions. However, in silico methods for predicting protein–ligand binding sites and protein biochemical functions offer an alternative practical solution. The characterisation of protein–ligand binding sites is essential for investigating new functional roles, which can impact the major biological research spheres of health, food, and energy security. In this review we discuss the role in silico methods play in 3D modelling of protein–ligand binding sites, along with their role in predicting biochemical functionality. In addition, we describe in detail some of the key alternative in silico prediction approaches that are available, as well as discussing the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) projects, and their impact on developments in the field. Furthermore, we discuss the importance of protein function prediction methods for tackling 21st century problems.
Abstract:
Protein–ligand binding site prediction methods aim to predict, from amino acid sequence, protein–ligand interactions, putative ligands, and ligand binding site residues using either sequence information, structural information, or a combination of both. In silico characterization of protein–ligand interactions has become extremely important to help determine a protein’s functionality, as in vivo-based functional elucidation is unable to keep pace with the current growth of sequence databases. Additionally, in vitro biochemical functional elucidation is time-consuming, costly, and may not be feasible for large-scale analysis, such as drug discovery. Thus, in silico prediction of protein–ligand interactions must be utilized to aid in functional elucidation. Here, we briefly discuss protein function prediction, prediction of protein–ligand interactions, the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) competitions, along with their role in shaping the field. We also discuss, in detail, our cutting-edge web-server method, FunFOLD, for the structurally informed prediction of protein–ligand interactions. Furthermore, we provide a step-by-step guide on using the FunFOLD web server and FunFOLD3 downloadable application, along with some real-world examples where the FunFOLD methods have been used to aid functional elucidation.
Abstract:
The PilZ protein was originally identified as necessary for type IV pilus (T4P) biogenesis. Since then, a large and diverse family of bacterial PilZ homology domains has been identified, some of which have been implicated in signaling pathways that control important processes, including motility, virulence and biofilm formation. Furthermore, many PilZ homology domains, though not PilZ itself, have been shown to bind the important bacterial second messenger bis(3′→5′)-cyclic diGMP (c-diGMP). The crystal structures of the PilZ orthologs from Xanthomonas axonopodis pv. citri (PilZ(XAC1133), this work) and from Xanthomonas campestris pv. campestris (XC1028) present significant structural differences to other PilZ homologs that explain their failure to bind c-diGMP. NMR analysis of PilZ(XAC1133) shows that these structural differences are maintained in solution. In spite of their emerging importance in bacterial signaling, the means by which PilZ proteins regulate specific processes is not clear. In this study, we show that PilZ(XAC1133) binds to PilB, an ATPase required for T4P polymerization, and to the EAL domain of FimX(XAC2398), which regulates T4P biogenesis and localization in other bacterial species. These interactions were confirmed in NMR, two-hybrid and far-Western blot assays and are the first interactions observed between any PilZ domain and a target protein. While we were unable to detect phosphodiesterase activity for FimX(XAC2398) in vitro, we show that it binds c-diGMP both in the presence and in the absence of PilZ(XAC1133). Site-directed mutagenesis studies for conserved and exposed residues suggest that PilZ(XAC1133) interactions with FimX(XAC2398) and PilB(XAC3239) are mediated through a hydrophobic surface and an unstructured C-terminal extension conserved only in PilZ orthologs. The FimX-PilZ-PilB interactions involve a full set of "degenerate" GGDEF, EAL and PilZ domains and provide the first evidence of the means by which PilZ orthologs and FimX interact directly with the T4P machinery. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
In this work, genetic algorithm concepts, along with a rotamer library for protein side chains, are used to optimize the tertiary structure of the hydrophobic core of cytochrome b(562). The search starts from the known PDB structure of the backbone, which is kept fixed, while the side chains of the hydrophobic core are allowed to adopt the conformations present in the rotamer library. The atoms of the side chains forming the core interact via van der Waals energy. Besides the prediction of the native core structure, a set of different amino acid sequences for this core is also suggested. These new cores are compared with the native one in terms of their volumes, van der Waals energy values and the numbers of contacts made by the side chains forming the cores. This paper shows that genetic algorithms are efficient for designing new sequences for the protein core. (C) 2007 Elsevier B.V. All rights reserved.
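The search described in this abstract can be pictured with a minimal sketch. It is not the authors' implementation: the pairwise energy table is filled with random numbers as a stand-in for van der Waals terms computed from atom coordinates, and the names `make_energy_table`, `total_energy` and `genetic_search`, as well as the population size, mutation rate and elitist selection scheme, are assumptions made for illustration.

```python
import random

def make_energy_table(n_sites, n_rotamers, rng):
    """Hypothetical pairwise energies between rotamer choices at two core sites.
    A real run would compute van der Waals terms from atom coordinates."""
    table = {}
    for i in range(n_sites):
        for j in range(i + 1, n_sites):
            table[(i, j)] = [[rng.uniform(-1.0, 1.0) for _ in range(n_rotamers)]
                             for _ in range(n_rotamers)]
    return table

def total_energy(conf, table):
    """Sum the pairwise energies for one rotamer assignment (one index per site)."""
    return sum(e[conf[i]][conf[j]] for (i, j), e in table.items())

def genetic_search(table, n_sites, n_rotamers, rng,
                   pop_size=40, generations=60, mutation_rate=0.1):
    """Elitist genetic algorithm over rotamer indices; the backbone is implicit and fixed."""
    population = [[rng.randrange(n_rotamers) for _ in range(n_sites)]
                  for _ in range(pop_size)]
    history = []  # best energy seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda c: total_energy(c, table))
        history.append(total_energy(population[0], table))
        parents = population[:pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_sites)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Point mutation: occasionally swap in a random rotamer from the library.
            child = [rng.randrange(n_rotamers) if rng.random() < mutation_rate else r
                     for r in child]
            children.append(child)
        population = parents + children
    best = min(population, key=lambda c: total_energy(c, table))
    return best, history

if __name__ == "__main__":
    rng = random.Random(0)
    table = make_energy_table(n_sites=8, n_rotamers=5, rng=rng)
    best, history = genetic_search(table, 8, 5, rng)
    print(best, total_energy(best, table))
```

Because the better half of each generation is carried over unchanged, the best energy can never get worse from one generation to the next. Designing new sequences, as the abstract also describes, would extend each gene from a rotamer index to an (amino acid, rotamer) pair.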
Abstract:
The goal of this thesis work is to develop a computational method based on machine learning techniques for predicting disulfide-bonding states of cysteine residues in proteins, which is a sub-problem of the larger and still unsolved problem of protein structure prediction. Improvement in the prediction of disulfide-bonding states of cysteine residues will help to place constraints on the three-dimensional (3D) space of the respective protein structure, and thus will eventually help in the prediction of the 3D structure of proteins. The results of this work will have direct implications for site-directed mutational studies of proteins, protein engineering and the problem of protein folding. We have used a combination of an Artificial Neural Network (ANN) and a Hidden Markov Model (HMM), the so-called Hidden Neural Network (HNN), as the machine learning technique to develop our prediction method. By using different global and local features of proteins (specifically profiles, parity of cysteine residues, average cysteine conservation, correlated mutation, sub-cellular localization, and signal peptide) as inputs, and considering Eukaryotes and Prokaryotes separately, we have reached an accuracy of 94% on a per-cysteine basis for both the Eukaryotic and Prokaryotic datasets, and an accuracy of 90% and 93% on a per-protein basis for the Eukaryotic and Prokaryotic datasets, respectively. These accuracies are the best reported so far by any existing prediction method; our method thus outperforms all previously developed approaches and is therefore more reliable. The most interesting part of this thesis work is the difference in prediction performance between Eukaryotes and Prokaryotes at the basic level of input coding, when ‘profile’ information was given as input to our prediction method. One of the reasons for this, we find, is the difference in the amino acid composition of the local environment of bonded and free cysteine residues in Eukaryotes and Prokaryotes. Eukaryotic bonded cysteine examples have a ‘symmetric-cysteine-rich’ environment, whereas Prokaryotic bonded examples lack it.
Abstract:
Structure and folding of membrane proteins are important issues in molecular and cell biology. In this work new approaches are developed to characterize the structure of folded, unfolded and partially folded membrane proteins. These approaches combine site-directed spin labeling and pulse EPR techniques. The major plant light harvesting complex LHCIIb was used as a model system. Measurements of longitudinal and transverse relaxation times of electron spins and of hyperfine couplings to neighboring nuclei by electron spin echo envelope modulation (ESEEM) provide complementary information about the local environment of a single spin label. By double electron-electron resonance (DEER), distances in the nanometer range between two spin labels can be determined. The results are analyzed in terms of relative water accessibilities of different sites in LHCIIb and its geometry. They reveal conformational changes as a function of micelle composition. This arsenal of methods is used to study protein folding during LHCIIb self-assembly, and a spatially and temporally resolved folding model is proposed. The approaches developed here are potentially applicable for studying structure and folding of any protein or other self-assembling structure if site-directed spin labeling is feasible and the time scale of folding is accessible to freeze-quench techniques.
Abstract:
We have quantitated the degree of structural preservation in cryo-sections of a vitrified biological specimen. Previous studies have used sections of periodic specimens to assess the resolution present, but preservation before sectioning was not assessed and so the damage due particularly to cutting was not clear. In this study large single crystals of lysozyme were vitrified, and from these X-ray diffraction patterns extending to better than 2.1 Å were obtained. The crystals were high-pressure frozen in 30% dextran and cryo-sectioned using a diamond knife. In the best case, preservation to a resolution of 7.9 Å was shown by electron diffraction, the first observation of sub-nanometre structural preservation in a vitreous section.
Abstract:
We present evidence that the size of an active site side chain may modulate the degree of hydrogen tunneling in an enzyme-catalyzed reaction. Primary and secondary kH/kT and kD/kT kinetic isotope effects have been measured for the oxidation of benzyl alcohol catalyzed by horse liver alcohol dehydrogenase at 25°C. As reported in earlier studies, the relationship between secondary kH/kT and kD/kT isotope effects provides a sensitive probe for deviations from classical behavior. In the present work, catalytic efficiency and the extent of hydrogen tunneling have been correlated for the alcohol dehydrogenase-catalyzed hydride transfer among a group of site-directed mutants at position 203. Val-203 interacts with the opposite face of the cofactor NAD+ from the alcohol substrate. The reduction in size of this residue is correlated with diminished tunneling and a two orders of magnitude decrease in catalytic efficiency. Comparison of the x-ray crystal structures of a ternary complex of a high-tunneling (Phe-93 → Trp) and a low-tunneling (Val-203 → Ala) mutant provides a structural basis for the observed effects, demonstrating an increase in the hydrogen transfer distance for the low-tunneling mutant. The Val-203 → Ala ternary complex crystal structure also shows a hyperclosed interdomain geometry relative to the wild-type and the Phe-93 → Trp mutant ternary complex structures. This demonstrates a flexibility in interdomain movement that could potentially narrow the distance between the donor and acceptor carbons in the native enzyme and may enhance the role of tunneling in the hydride transfer reaction.
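The probe mentioned in this abstract is often expressed as an exponent x relating the two isotope effects, kH/kT = (kD/kT)^x. The sketch below computes that exponent from a pair of measured ratios. The reference value of about 3.26 for the semiclassical limit is an assumption taken from the general kinetic isotope effect literature, not from this abstract, and the function names are hypothetical.

```python
import math

# Assumption: commonly quoted semiclassical value of the exponent; not given in the abstract.
SEMICLASSICAL_EXPONENT = 3.26

def swain_schaad_exponent(kh_kt, kd_kt):
    """Return x such that kH/kT == (kD/kT) ** x."""
    if kh_kt <= 0 or kd_kt <= 0 or kd_kt == 1.0:
        raise ValueError("isotope effects must be positive and kD/kT must differ from 1")
    return math.log(kh_kt) / math.log(kd_kt)

def exceeds_semiclassical(kh_kt, kd_kt, tolerance=0.0):
    """True when the exponent is larger than the assumed semiclassical reference."""
    return swain_schaad_exponent(kh_kt, kd_kt) > SEMICLASSICAL_EXPONENT + tolerance

if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    print(swain_schaad_exponent(1.33, 1.07))
    print(exceeds_semiclassical(1.33, 1.07))
```

In practice the comparison would also need to carry the experimental uncertainty of both ratios, which the `tolerance` argument only crudely stands in for.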
Abstract:
Site-directed mutagenesis and combinatorial libraries are powerful tools for providing information about the relationship between protein sequence and structure. Here we report two extensions that expand the utility of combinatorial mutagenesis for the quantitative assessment of hypotheses about the determinants of protein structure. First, we show that resin-splitting technology, which allows the construction of arbitrarily complex libraries of degenerate oligonucleotides, can be used to construct more complex protein libraries for hypothesis testing than can be constructed from oligonucleotides limited to degenerate codons. Second, using eglin c as a model protein, we show that regression analysis of activity scores from library data can be used to assess the relative contributions to the specific activity of the amino acids that were varied in the library. The regression parameters derived from the analysis of a 455-member sample from a library wherein four solvent-exposed sites in an α-helix can contain any of nine different amino acids are highly correlated (P < 0.0001, R² = 0.97) to the relative helix propensities for those amino acids, as estimated by a variety of biophysical and computational techniques.
Abstract:
The function of a protein generally is determined by its three-dimensional (3D) structure. Thus, it would be useful to know the 3D structure of the thousands of protein sequences that are emerging from the many genome projects. To this end, fold assignment, comparative protein structure modeling, and model evaluation were automated completely. As an illustration, the method was applied to the proteins in the Saccharomyces cerevisiae (baker’s yeast) genome. It resulted in all-atom 3D models for substantial segments of 1,071 (17%) of the yeast proteins, only 40 of which have had their 3D structure determined experimentally. Of the 1,071 modeled yeast proteins, 236 were related clearly to a protein of known structure for the first time; 41 of these previously have not been characterized at all.
Abstract:
The database reported here is derived using the Combinatorial Extension (CE) algorithm which compares pairs of protein polypeptide chains and provides a list of structurally similar proteins along with their structure alignments. Using CE, structure–structure alignments can provide insights into biological function. When a protein of known function is shown to be structurally similar to a protein of unknown function, a relationship might be inferred; a relationship not necessarily detectable from sequence comparison alone. Establishing structure–structure relationships in this way is of great importance as we enter an era of structural genomics where there is a likelihood of an increasing number of structures with unknown functions being determined. Thus the CE database is an example of a useful tool in the annotation of protein structures of unknown function. Comparisons can be performed on the complete PDB or on a structurally representative subset of proteins. The source protein(s) can be from the PDB (updated monthly) or uploaded by the user. CE provides sequence alignments resulting from structural alignments and Cartesian coordinates for the aligned structures, which may be analyzed using the supplied Compare3D Java applet, or downloaded for further local analysis. Searches can be run from the CE web site, http://cl.sdsc.edu/ce.html, or the database and software downloaded from the site for local use.
Abstract:
The RESID Database is a comprehensive collection of annotations and structures for protein post-translational modifications including N-terminal, C-terminal and peptide chain cross-link modifications. The RESID Database includes systematic and frequently observed alternate names, Chemical Abstracts Service registry numbers, atomic formulas and weights, enzyme activities, taxonomic range, keywords, literature citations with database cross-references, structural diagrams and molecular models. The NRL-3D Sequence–Structure Database is derived from the three-dimensional structure of proteins deposited with the Research Collaboratory for Structural Bioinformatics Protein Data Bank. The NRL-3D Database includes standardized and frequently observed alternate names, sources, keywords, literature citations, experimental conditions and searchable sequences from model coordinates. These databases are freely accessible through the National Cancer Institute–Frederick Advanced Biomedical Computing Center at these web sites: http://www.ncifcrf.gov/RESID, http://www.ncifcrf.gov/NRL-3D; or at these National Biomedical Research Foundation Protein Information Resource web sites: http://pir.georgetown.edu/pirwww/dbinfo/resid.html, http://pir.georgetown.edu/pirwww/dbinfo/nrl3d.html
Abstract:
Local protein structure prediction efforts have consistently failed to exceed approximately 70% accuracy. We characterize the degeneracy of the mapping from local sequence to local structure responsible for this failure by investigating the extent to which similar sequence segments found in different proteins adopt similar three-dimensional structures. Sequence segments 3-15 residues in length from 154 different protein families are partitioned into neighborhoods containing segments with similar sequences using cluster analysis. The consistency of the sequence-to-structure mapping is assessed by comparing the local structures adopted by sequence segments in the same neighborhood in proteins of known structure. In the 154 families, 45% and 28% of the positions occur in neighborhoods in which one and two local structures predominate, respectively. The sequence patterns that characterize the neighborhoods in the first class probably include virtually all of the short sequence motifs in proteins that consistently occur in a particular local structure. These patterns, many of which occur in transitions between secondary structural elements, are an interesting combination of previously studied and novel motifs. The identification of sequence patterns that consistently occur in one or a small number of local structures in proteins should contribute to the prediction of protein structure from sequence.
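The neighborhood analysis in this abstract can be illustrated with a small sketch. It assumes a greedy "leader" clustering on Hamming distance in place of the cluster analysis used in the study, per-residue secondary-structure strings as the description of local structure, and hypothetical function names throughout.

```python
from collections import Counter

def segments(seq, ss, k):
    """All length-k windows of a sequence, paired with their local-structure strings."""
    return [(seq[i:i + k], ss[i:i + k]) for i in range(len(seq) - k + 1)]

def hamming(a, b):
    """Number of mismatched positions between two equal-length segments."""
    return sum(x != y for x, y in zip(a, b))

def leader_cluster(items, max_mismatch):
    """Greedy stand-in for cluster analysis: a segment joins the first neighborhood
    whose leader is within max_mismatch mismatches, else it founds a new one."""
    clusters = []  # list of (leader_sequence, members)
    for seq_seg, ss_seg in items:
        for leader, members in clusters:
            if hamming(leader, seq_seg) <= max_mismatch:
                members.append((seq_seg, ss_seg))
                break
        else:
            clusters.append((seq_seg, [(seq_seg, ss_seg)]))
    return clusters

def predominance(members):
    """Fraction of a neighborhood adopting its most common local structure."""
    counts = Counter(ss for _, ss in members)
    return counts.most_common(1)[0][1] / len(members)

if __name__ == "__main__":
    # Toy data: (sequence, secondary structure) pairs, not taken from the study.
    proteins = [("ACDEF", "HHHHH"), ("ACDEG", "HHHHE")]
    items = [s for seq, ss in proteins for s in segments(seq, ss, 3)]
    for leader, members in leader_cluster(items, max_mismatch=1):
        print(leader, len(members), predominance(members))
```

A neighborhood with predominance near 1 corresponds to the class in which one local structure predominates. Lower values indicate the degeneracy of the sequence-to-structure mapping that the abstract describes.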
Abstract:
A class of potent nonpeptidic inhibitors of human immunodeficiency virus protease has been designed by using the three-dimensional structure of the enzyme as a guide. By employing iterative protein cocrystal structure analysis, design, and synthesis the binding affinity of the lead compound was incrementally improved by over four orders of magnitude. An inversion in inhibitor binding mode was observed crystallographically, providing information critical for subsequent design and highlighting the utility of structural feedback in inhibitor optimization. These inhibitors are selective for the viral protease enzyme, possess good antiviral activity, and are orally available in three species.
Abstract:
Bacterial chaperonin, GroEL, together with its co-chaperonin, GroES, facilitates the folding of a variety of polypeptides. Experiments suggest that GroEL stimulates protein folding by multiple cycles of binding and release. Misfolded proteins first bind to an exposed hydrophobic surface on GroEL. GroES then encapsulates the substrate and triggers its release into the central cavity of the GroEL/ES complex for folding. In this work, we investigate the possibility of facilitating protein folding in molecular dynamics simulations by mimicking the effects of GroEL/ES, namely repeated binding and release together with spatial confinement. During the binding stage, the (metastable) partially folded proteins are allowed to attach spontaneously to a hydrophobic surface within the simulation box. This destabilizes the structures, which are then transferred into a spatially confined cavity for folding. The approach has been tested by attempting to refine protein structural models generated using the ROSETTA procedure for ab initio structure prediction. Dramatic improvements in the deviation of protein models from the corresponding experimental structures were observed. The results suggest that the primary effects of the GroEL/ES system can be mimicked in a simple coarse-grained manner and be used to facilitate protein folding in molecular dynamics simulations. Furthermore, the results support the assumption that the spatial confinement in GroEL/ES assists the folding of encapsulated proteins.