971 results for Protein Structures
Abstract:
World-wide structural genomics initiatives are rapidly accumulating structures for which limited functional information is available. Additionally, state-of-the-art structural prediction programs are now capable of generating at least low-resolution structural models of target proteins. Accurate detection and classification of functional sites within both solved and modelled protein structures therefore represents an important challenge. We present a fully automatic site detection method, FuncSite, that uses neural network classifiers to predict the location and type of functionally important sites in protein structures. The method is designed primarily to require only backbone residue positions without the need for specific side-chain atoms to be present. In order to highlight effective site detection in low-resolution structural models, FuncSite was used to screen model proteins generated using mGenTHREADER on a set of newly released structures. We found effective metal site detection even for moderate-quality protein models, illustrating the robustness of the method.
Abstract:
Protein structure prediction methods aim to predict the structures of proteins from their amino acid sequences, utilizing various computational algorithms. Structural genome annotation is the process of attaching biological information to every protein encoded within a genome via the production of three-dimensional protein models.
Abstract:
Model quality assessment programs (MQAPs) aim to assess the quality of modelled 3D protein structures. The provision of quality scores describing both global and local (per-residue) accuracy is extremely important, as without quality scores we are unable to determine the usefulness of a 3D model for further computational and experimental wet lab studies. Here, we briefly discuss protein tertiary structure prediction, along with the biennial Critical Assessment of Techniques for Protein Structure Prediction (CASP) competition and its key role in driving the field of protein model quality assessment methods. We also briefly discuss the top MQAPs from the previous CASP competitions. Additionally, we describe our downloadable and webserver-based model quality assessment methods: ModFOLD3, ModFOLDclust, ModFOLDclustQ, ModFOLDclust2, and IntFOLD-QA. We provide a practical step-by-step guide on using our downloadable and webserver-based tools and include examples of their application for improving tertiary structure prediction, ligand binding site residue prediction, and oligomer predictions.
Abstract:
IntFOLD is an independent web server that integrates our leading methods for structure and function prediction. The server provides a simple unified interface that aims to make complex protein modelling data more accessible to life scientists. The server web interface is designed to be intuitive and integrates a complex set of quantitative data, so that 3D modelling results can be viewed on a single page and interpreted by non-expert modellers at a glance. The only required input to the server is an amino acid sequence for the target protein. Here we describe major performance and user interface updates to the server, which comprises an integrated pipeline of methods for: tertiary structure prediction, global and local 3D model quality assessment, disorder prediction, structural domain prediction, function prediction and modelling of protein-ligand interactions. The server has been independently validated during numerous CASP (Critical Assessment of Techniques for Protein Structure Prediction) experiments, as well as being continuously evaluated by the CAMEO (Continuous Automated Model Evaluation) project. The IntFOLD server is available at: http://www.reading.ac.uk/bioinf/IntFOLD/
Abstract:
The calculation of projection structures (PSs) from Protein Data Bank (PDB)-coordinate files of membrane proteins is not well-established. Reports on such attempts exist but are rare. In addition, the different procedures are barely described and thus difficult if not impossible to reproduce. Here we present a simple, fast and well-documented method for the calculation and visualization of PSs from PDB-coordinate files of membrane proteins: the projection structure visualization (PSV)-method. The PSV-method was successfully validated using the PS of aquaporin-1 (AQP1) from 2D crystals and cryo-transmission electron microscopy, and the PDB-coordinate file of AQP1 determined from 3D crystals and X-ray crystallography. Besides AQP1, which is a relatively rigid protein, we also studied a flexible membrane transport protein, i.e. the L-arginine/agmatine antiporter AdiC. Comparison of PSs calculated from the existing PDB-coordinate files of substrate-free and L-arginine-bound AdiC indicated that conformational changes are detected in projection. Importantly, structural differences were found between the PSV-method calculated PSs of the detergent-solubilized AdiC proteins and the PS from cryo-TEM of membrane-embedded AdiC. These differences are particularly exciting since they may reflect a different conformation of AdiC induced by the lateral pressure in the lipid bilayer.
Abstract:
A hierarchy of residue density assessments and packing properties in protein structures are contrasted, including a regular density, a variety of charge densities, a hydrophobic density, a polar density, and an aromatic density. These densities are investigated by alternative distance measures and also at the interface of multiunit structures. Amino acids are divided into nine structural categories according to three secondary structure states and three solvent accessibility levels. To take account of amino acid abundance differences across protein structures, we normalize the observed density by the expected density defining a density index. Solvent accessibility levels exert the predominant influence in determinations of the regular residue density. Explicitly, the regular density values vary approximately linearly with respect to solvent accessibility levels, the linearity parameters depending on the amino acid. The charge index reveals pronounced inequalities between lysine and arginine in their interactions with acidic residues. The aromatic density calculations in all structural categories parallel the regular density calculations, indicating that the aromatic residues are distributed as a random sample of all residues. Moreover, aromatic residues are found to be over-represented in the neighborhood of all amino acids. This result might be attributed to nucleation sites and protein stability being substantially associated with aromatic residues.
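The normalization described in this abstract (observed density divided by expected density) can be illustrated with a minimal sketch. The abstract does not say how the expected density is computed, so the sketch assumes, purely for illustration, that the expected neighbor count for a residue type is proportional to its overall abundance. The function name `density_index` and the input dictionaries are hypothetical.

```python
def density_index(neighbor_counts, composition):
    """Return observed/expected neighbor ratios per amino acid type.

    neighbor_counts: observed number of neighbors of each type (hypothetical input).
    composition: overall residue counts of each type in the data set.
    ASSUMPTION: expected count = total neighbors * background fraction of the type.
    """
    total_neighbors = sum(neighbor_counts.values())
    total_residues = sum(composition.values())
    index = {}
    for aa, n_all in composition.items():
        expected = total_neighbors * n_all / total_residues
        if expected > 0:
            index[aa] = neighbor_counts.get(aa, 0) / expected
    return index


# Toy data: F is rare overall but frequent among neighbors.
example = density_index({"F": 30, "K": 10}, {"F": 100, "K": 300})
```

Under this assumption an index above 1 indicates that a residue type appears among neighbors more often than its abundance alone would suggest, and an index below 1 indicates the opposite.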
Abstract:
The residue environment in protein structures is studied with respect to the density of carbon (C), oxygen (O), and nitrogen (N) atoms within a certain distance (say 5 Å) of each residue. Two types of environments are evaluated: one based on side-chain atom contacts (abbreviated S-S) and the other based on all atom (side-chain + backbone) contacts (abbreviated A-A). Different atom counts are observed about nine-residue structural categories defined by three solvent accessibility levels and three secondary structure states. Among the structural categories, the S-S atom count ratios generally vary more than the A-A atom count ratios because of the fact that the backbone (O) and (N) atoms contribute equal counts. Secondary structure affects the (C) density for the A-A contacts whereas secondary structure has little influence on the (C) density for the S-S contacts. For S-S contacts, a greater density of (O) over (N) atom neighbors stands out in the environment of most amino acid types. By contrast, for A-A contacts, independent of the solvent accessibility levels, the ratio (O)/(N) is ≈1 in helical states, consistent with the geometry of α-helical residues whose side-chains tilt oppositely to the amino to carboxy α-helical axis. The highest ratio of neighbor (O)/(N) is achieved under solvent exposed conditions. This (O) vs. (N) prevalence is advantageous at the protein surface that generally exhibits an acid excess that helps to enhance protein solubility in the cell and to avoid nonspecific interactions with phosphate groups of DNA, RNA, and other plasma constituents.
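The atom-count environment described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not say from which point of the residue the distance is measured, so the sketch counts a C, N, or O atom as a neighbor if it lies within the radius of any atom of the residue. The tuple layout `(residue_id, element, xyz)` and the function name are hypothetical.

```python
import math
from collections import Counter


def neighbor_element_counts(atoms, residue_id, radius=5.0):
    """Count C, N and O atoms of other residues near the given residue.

    atoms: iterable of (residue_id, element, (x, y, z)) tuples (hypothetical layout).
    ASSUMPTION: an atom is a neighbor if it is within `radius` angstroms of
    any atom belonging to `residue_id`.
    """
    atoms = list(atoms)
    own = [xyz for rid, _, xyz in atoms if rid == residue_id]
    counts = Counter()
    for rid, element, xyz in atoms:
        if rid == residue_id or element not in ("C", "N", "O"):
            continue
        if any(math.dist(xyz, p) <= radius for p in own):
            counts[element] += 1
    return counts
```

An (O)/(N) ratio of the kind discussed in the abstract would then be `counts["O"] / counts["N"]` for residues where the nitrogen count is non-zero. Restricting `atoms` to side-chain atoms would correspond to the S-S variant, and passing all atoms to the A-A variant.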
Abstract:
The objectives of this and the following paper are to identify commonalities and disparities of the extended environment of mononuclear metal sites centering on Cu, Fe, Mn, and Zn. The extended environment of a metal site within a protein embodies at least three layers: the metal core, the ligand group, and the second shell, which is defined here to consist of all residues distant less than 3.5 Å from some ligand of the metal core. The ligands and second-shell residues can be characterized in terms of polarity, hydrophobicity, secondary structures, solvent accessibility, hydrogen-bonding interactions, and membership in statistically significant residue clusters of different kinds. Findings include the following: (i) Both histidine ligands of type I copper ions exclusively attach the Nδ1 nitrogen of the histidine imidazole ring to the metal, whereas histidine ligands for all mononuclear iron ions and nearly all type II copper ions are ligated via the Nɛ2 nitrogen. By contrast, multinuclear copper centers are coordinated predominantly by histidine Nɛ2, whereas diiron histidine contacts are predominantly Nδ1. Explanations in terms of steric differences between Nδ1 and Nɛ2 are considered. (ii) Except for blue copper (type I), the second-shell composition favors polar residues. (iii) For blue copper, the second shell generally contains multiple methionine residues, which are elements of a statistically significant histidine–cysteine–methionine cluster. Almost half of the second shell of blue copper consists of solvent-accessible residues, putatively facilitating electron transfer. (iv) Mononuclear copper atoms are never found with acidic carboxylate ligands, whereas single Mn2+ ion ligands are predominantly acidic and the second shell tends to be mostly buried. (v) The extended environment of mononuclear Fe sites often is associated with histidine–tyrosine or histidine–acidic clusters.
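The second-shell definition given in this abstract (all residues less than 3.5 Å from some ligand of the metal core) can be sketched in a few lines. The sketch assumes the distance is taken between any atom of a candidate residue and any atom of a ligand residue, which the abstract does not state explicitly. The function name, residue labels, and input layout are hypothetical.

```python
import math


def second_shell(atoms, ligand_ids, cutoff=3.5):
    """Return ids of non-ligand residues with an atom closer than `cutoff` to a ligand.

    atoms: iterable of (residue_id, (x, y, z)) tuples (hypothetical layout).
    ligand_ids: set of residue ids that ligate the metal.
    ASSUMPTION: distances are measured atom to atom.
    """
    atoms = list(atoms)
    ligand_atoms = [xyz for rid, xyz in atoms if rid in ligand_ids]
    shell = set()
    for rid, xyz in atoms:
        if rid in ligand_ids:
            continue
        if any(math.dist(xyz, lig) < cutoff for lig in ligand_atoms):
            shell.add(rid)
    return shell


# Toy coordinates: one ligand, one nearby residue, one distant residue.
toy = [("H46", (0.0, 0.0, 0.0)), ("D124", (3.0, 0.0, 0.0)), ("L50", (10.0, 0.0, 0.0))]
shell = second_shell(toy, {"H46"})
```

For real structures a spatial index would be faster than this all-pairs loop, but the brute-force form keeps the definition visible.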
Abstract:
Our study of the extended metal environment, particularly of the second shell, focuses in this paper on zinc sites. Key findings include: (i) The second shell of mononuclear zinc centers is generally more polar than hydrophobic and prominently features charged residues engaged in an abundance of hydrogen bonding with histidine ligands. Histidine–acidic or histidine–tyrosine clusters commonly overlap the environment of zinc ions. (ii) Histidine tautomeric metal bonding patterns in ligating zinc ions are mixed. For example, carboxypeptidase A, thermolysin, and sonic hedgehog possess the same ligand group (two histidines, one unibidentate acidic ligand, and a bound water), but their histidine tautomeric geometries markedly differ such that the carboxypeptidase A makes only Nδ1 contacts, thermolysin makes only Nɛ2 contacts, and sonic hedgehog uses one of each. Thus the presence of a similar ligand cohort does not necessarily imply the same topology or function at the active site. (iii) Two close histidine ligands HXmH, m ≤ 5, rarely both coordinate a single metal ion in the Nδ1 tautomeric conformation, presumably to avoid steric conflicts. Mononuclear zinc sites can be classified into six types depending on the ligand composition and geometry. Implications of the results are discussed in terms of divergent and convergent evolution.
Abstract:
PALI (release 1.2) contains three-dimensional (3-D) structure-dependent sequence alignments as well as structure-based phylogenetic trees of homologous protein domains in various families. The data set of homologous protein structures has been derived by consulting the SCOP database (release 1.50) and the data set comprises 604 families of homologous proteins involving 2739 protein domain structures with each family made up of at least two members. Each member in a family has been structurally aligned with every other member in the same family (pairwise alignment) and all the members in the family are also aligned using simultaneous superposition (multiple alignment). The structural alignments are performed largely automatically, with manual interventions especially in the cases of distantly related proteins, using the program STAMP (version 4.2). Every family is also associated with two dendrograms, calculated using PHYLIP (version 3.5), one based on a structural dissimilarity metric defined for every pairwise alignment and the other based on similarity of topologically equivalent residues. These dendrograms enable easy comparison of sequence and structure-based relationships among the members in a family. Structure-based alignments with the details of structural and sequence similarities, superposed coordinate sets and dendrograms can be accessed conveniently using a web interface. The database can be queried for protein pairs with sequence or structural similarities falling within a specified range. Thus PALI forms a useful resource to help in analysing the relationship between sequence and structure variation at a given level of sequence similarity. PALI also contains over 653 ‘orphans’ (single member families). Using the web interface involving PSI_BLAST and PHYLIP it is possible to associate the sequence of a new protein with one of the families in PALI and generate a phylogenetic tree combining the query sequence and proteins of known 3-D structure. 
The database with the web interfaced search and dendrogram generation tools can be accessed at http://pauling.mbu.iisc.ernet.in/~pali.
Abstract:
It is generally accepted that globular proteins fold with a hydrophobic core and a hydrophilic exterior. Might the spatial distribution of amino acid hydrophobicity exhibit common features? The hydrophobic profile detailing this distribution from the protein interior to exterior has been examined for 30 relatively diverse structures obtained from the Protein Data Bank, for 3 proteins of the 30S ribosomal subunit, and for a simple set of 14 decoys. A second-order hydrophobic moment has provided a simple measure of the spatial variation. Shapes of the calculated spatial profiles of all native structures have been found to be comparable. Consequently, profile shapes as well as particular profile features should assist in validating predicted protein structures and in discriminating between different protein-folding pathways. The spatial profiles of the 14 decoys are clearly distinguished from the profiles of their native structures.
Abstract:
We present a method (ENERGI) for extracting energy-like quantities from a database of protein structures. In this paper, we use the method to generate pairwise additive amino acid "energy" scores. These scores are obtained by iteration until they correctly discriminate a set of known protein folds from decoy conformations. The method succeeds in lattice model tests and in the gapless threading problem as defined by Maiorov and Crippen [Maiorov, V. N. & Crippen, G. M. (1992) J. Mol. Biol. 227, 876-888]. A more challenging test of threading a larger set of test proteins derived from the representative set of Hobohm and Sander [Hobohm, U. & Sander, C. (1994) Protein Sci. 3, 522-524] is used as a "workbench" for exploring how the ENERGI scores depend on their parameter sets.
Abstract:
Structurally neighboring residues are categorized according to their separation in the primary sequence as proximal (1-4 positions apart) and otherwise distal, which in turn is divided into near (5-20 positions), far (21-50 positions), very far (>50 positions), and interchain (from different chains of the same structure). These categories describe the linear distance histogram (LDH) for three-dimensional neighboring residue types. Among the main results are the following: (i) Nearest-neighbor hydrophobic residues tend to be increasingly distally separated in the linear sequence, thus most often connecting distinct secondary structure units. (ii) The LDHs of oppositely charged nearest-neighbors emphasize proximal positions with a subsidiary maximum for very far positions. (iii) Cysteine-cysteine structural interactions rarely involve proximal positions. (iv) The greatest numbers of interchain specific nearest-neighbors in protein structures are composed of oppositely charged residues. (v) The largest fraction of side-chain neighboring residues from beta-strands involves near positions, emphasizing associations between consecutive strands. (vi) Exposed residue pairs are predominantly located in proximal linear positions, while buried residue pairs principally correspond to far or very far distal positions. The results are principally invariant to protein sizes, amino acid usages, linear distance normalizations, and over- and underrepresentations among nearest-neighbor types. Interpretations and hypotheses concerning the LDHs, particularly those of hydrophobic and charged pairings, are discussed with respect to protein stability and functionality. The pronounced occurrence of oppositely charged interchain contacts is consistent with many observations on protein complexes where multichain stabilization is facilitated by electrostatic interactions.
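The sequence-separation categories defined in this abstract map directly onto a small classification function. The following is a minimal sketch using the category boundaries stated in the abstract. The function name and the chain-id/position arguments are hypothetical, and residue positions are assumed to be plain integer indices along each chain.

```python
def ldh_category(chain_a, pos_a, chain_b, pos_b):
    """Classify a pair of structurally neighboring residues by sequence separation.

    Boundaries follow the abstract: proximal 1-4, near 5-20, far 21-50,
    very far >50, interchain for residues on different chains.
    ASSUMPTION: positions are integer indices along the chain.
    """
    if chain_a != chain_b:
        return "interchain"
    separation = abs(pos_a - pos_b)
    if separation == 0:
        raise ValueError("a residue cannot be paired with itself")
    if separation <= 4:
        return "proximal"
    if separation <= 20:
        return "near"
    if separation <= 50:
        return "far"
    return "very far"
```

Tallying the returned labels over all neighboring pairs of a given residue-type combination would give one linear distance histogram of the kind the abstract describes.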