977 results for Residue Contacts


Relevance:

100.00%

Publisher:

Abstract:

Protein folding can be described in terms of the development of specific contacts between residues as a highly disordered polypeptide chain converts into the native state. Here we describe an NMR-based strategy designed to detect such contacts by observation of nuclear Overhauser effects (NOEs). Experiments with α-lactalbumin reveal the existence of extensive NOEs between aromatic and aliphatic protons in the archetypal molten globule formed by this protein at low pH. Analysis of their time development provides direct evidence for near-native compactness of this state. Through a rapid refolding procedure, the NOE intensity can be transferred efficiently into the resolved and assigned spectrum of the native state. This demonstrates the viability of using this approach to map out time-averaged interactions between residues in a partially folded protein.

Relevance:

70.00%

Publisher:

Abstract:

Identification of residue-residue contacts from primary sequence can be used to guide protein structure prediction. Using Escherichia coli CcdB as the test case, we describe an experimental method termed saturation-suppressor mutagenesis to acquire residue contact information. In this methodology, for each of five inactive CcdB mutants, exhaustive screens for suppressors were performed. Proximal suppressors were accurately discriminated from distal suppressors based on their phenotypes when present as single mutants. Experimentally identified putative proximal pairs formed spatial constraints to recover >98% of native-like models of CcdB from a decoy dataset. Suppressor methodology was also applied to the integral membrane protein diacylglycerol kinase A, where the structures determined by X-ray crystallography and NMR were significantly different. Suppressor as well as sequence co-variation data clearly point to the X-ray structure being the functional one adopted in vivo. The methodology is applicable to any macromolecular system for which a convenient phenotypic assay exists.

Relevance:

70.00%

Publisher:

Abstract:

In this study, we carried out a comparative analysis of two classical methodologies for identifying residue contacts in proteins: the traditional cutoff-dependent (CD) approach and cutoff-free Delaunay tessellation (DT). In addition, two alternative coarse-grained representations of residues were tested: the alpha carbon (CA) and the side-chain geometric center (GC). A database was built comprising three top classes: all alpha, all beta, and alpha/beta. We found that a cutoff value of about 7.0 Å emerges as an important distance parameter. Up to 7.0 Å, CD and DT properties are unified, which implies that at this distance all contacts are complete and legitimate (not occluded). We also show that DT has an intrinsic missing-edges problem when mapping the first layer of neighbors. In proteins, it may produce systematic errors affecting mainly the contact network in beta chains with CA. The almost-Delaunay (AD) approach has been proposed to solve this DT problem. We found that even AD may not be an advantageous solution. As a consequence, in the strict range up to 7.0 Å, the CD approach proved to be a simpler, more complete, and more reliable technique than DT or AD. Finally, we show that coarse-grained residue representations may introduce bias in the analysis of neighbors at cutoffs up to 6.8 Å, with CA favoring alpha proteins and GC favoring beta proteins. This provides an additional argument pointing to the value of 7.0 Å as an important lower-bound cutoff to be used in contact analysis of proteins.
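
As an illustration of the cutoff-dependent (CD) approach described in this abstract, the following minimal sketch lists residue pairs whose representative points (CA or GC) lie within a distance cutoff. The function name `contact_pairs`, the `min_separation` parameter, and the toy coordinates are assumptions made for illustration; they are not taken from the study.

```python
import math

def contact_pairs(coords, cutoff=7.0, min_separation=1):
    """Return (i, j) index pairs whose points lie within `cutoff` angstroms.

    coords: one (x, y, z) tuple per residue, e.g. CA atoms or side-chain
    geometric centers. min_separation skips pairs closer than that many
    positions along the chain (hypothetical convenience parameter).
    """
    pairs = []
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                pairs.append((i, j))
    return pairs

# Hypothetical CA coordinates (angstroms) for a four-residue toy chain.
toy_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 6.0, 0.0)]
print(contact_pairs(toy_ca))  # [(0, 1), (1, 2), (1, 3)]
```

The all-pairs loop is quadratic in chain length, which is adequate for a single protein; a spatial grid or k-d tree would be the usual choice for larger systems.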

Relevance:

60.00%

Publisher:

Abstract:

The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as Protein Blocks (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence-to-structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of the hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotations at the genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa.

Relevance:

60.00%

Publisher:

Abstract:

The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it became urgent to develop new methods that are able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein folding recognition and de novo design. The prediction of these contacts requires the study of the protein inter-residue distances related to the specific type of amino acid pair that are encoded in the so-called contact map. An interesting new way of analyzing those structures emerged when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied the extent to which the characteristic path length and clustering coefficient of the protein contact network are values that reveal characteristic features of protein contact maps.
Provided that residue contacts are known for a protein sequence, the major features of its 3D structure could be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments, to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for the coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards the understanding of the complex processes involved in biological networks. The rapid growth in the number of protein sequences and structures available poses new fundamental problems that still deserve an interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, currently, approximately only 20% of annotated proteins in the Homo sapiens genome have been experimentally characterized. 
A commonly adopted procedure for annotating protein sequences relies on the "inheritance through homology" based on the notion that similar sequences share similar functions and structures. This procedure consists in the assignment of sequences to a specific group of functionally related sequences which had been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but that do not necessarily share the same function, and to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated transfer through inheritance procedure of the molecular functions and of the structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicitly addresses the problem of multi-domain protein annotation and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of structure templates on the length of protein sequences within clusters ensures that multi-domain proteins, when present, can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences considering information available in the present databases of molecular functions and structures.
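
The two network measures named in the first part of this abstract, characteristic path length and clustering coefficient, can be computed from a contact list as in the sketch below. This is a generic textbook formulation, not the thesis's own code; the function names and the toy contact list are assumptions, and disconnected pairs are simply excluded from the path-length average.

```python
from collections import deque
from itertools import combinations

def build_graph(n_residues, contacts):
    """Undirected adjacency sets from a list of (i, j) residue contacts."""
    adj = {i: set() for i in range(n_residues)}
    for i, j in contacts:
        adj[i].add(j)
        adj[j].add(i)
    return adj

def characteristic_path_length(adj):
    """Mean shortest-path length over all connected node pairs (BFS)."""
    total, count = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1  # reachable nodes other than the source
    return total / count if count else 0.0

def clustering_coefficient(adj):
    """Mean fraction of linked neighbor pairs; nodes with <2 neighbors score 0."""
    values = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            values.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        values.append(2 * links / (k * (k - 1)))
    return sum(values) / len(values) if values else 0.0

# Hypothetical four-residue contact network.
toy = build_graph(4, [(0, 1), (1, 2), (1, 3), (2, 3)])
print(characteristic_path_length(toy), clustering_coefficient(toy))
```

A network is usually called small-world when its path length is close to that of a random graph of the same size while its clustering coefficient is much higher.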

Relevance:

60.00%

Publisher:

Abstract:

Methods of structural and statistical analysis of the relation between the sequence and secondary and three-dimensional structures are developed. About 5000 secondary structures of immunoglobulin molecules from the Kabat data base were predicted. Two statistical analyses of amino acids reveal 47 universal positions in strands and loops. Eight universally conservative positions out of the 47 are singled out because they contain the same amino acid in > 90% of all chains. The remaining 39 positions, which we term universally alternative positions, were divided into five groups: hydrophobic, charged and polar, aromatic, hydrophilic, and Gly-Ala, corresponding to the residues that occupied them in almost all chains. The analysis of residue-residue contacts shows that the 47 universal positions can be distinguished by the number and types of contacts. The calculations of contact maps in the 29 antibody structures revealed that residues in 24 of these 47 positions have contacts only with residues of antiparallel beta-strands in the same beta-sheet and residues in the remaining 23 positions always have far-away contacts with residues from other beta-sheets as well. In addition, residues in 6 of the 47 universal positions are also involved in interactions with residues of the other variable or constant domains.

Relevance:

60.00%

Publisher:

Abstract:

We describe a new method for using neural networks to predict residue contact pairs in a protein. The main inputs to the neural network are a set of 25 measures of correlated mutation between all pairs of residues in two windows of size 5 centered on the residues of interest. While the individual pair-wise correlations are a relatively weak predictor of contact, by training the network on windows of correlation the accuracy of prediction is significantly improved. The neural network is trained on a set of 100 proteins and then tested on a disjoint set of 1033 proteins of known structure. An average predictive accuracy of 21.7% is obtained taking the best L/2 predictions for each protein, where L is the sequence length. Taking the best L/10 predictions gives an average accuracy of 30.7%. The predictor is also tested on a set of 59 proteins from the CASP5 experiment. The accuracy is found to be relatively consistent across different sequence lengths, but to vary widely according to the secondary structure. Predictive accuracy is also found to improve by using multiple sequence alignments containing many sequences to calculate the correlations. (C) 2004 Wiley-Liss, Inc.
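
The evaluation quoted in this abstract, accuracy over the best L/2 or L/10 predictions, can be sketched as follows. The function name, the dictionary-based input format, and the toy scores are assumptions for illustration; the abstract does not specify how ties or very short sequences were handled.

```python
def top_fraction_accuracy(scores, true_contacts, seq_len, divisor=2):
    """Fraction of the best seq_len // divisor predictions that are true contacts.

    scores: dict mapping (i, j) residue pairs to a predicted contact score.
    true_contacts: set of (i, j) pairs observed in the known structure.
    """
    n_top = max(1, seq_len // divisor)  # assumption: keep at least one pair
    ranked = sorted(scores, key=scores.get, reverse=True)[:n_top]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)

# Hypothetical predictions for a 10-residue protein.
toy_scores = {(0, 5): 0.9, (1, 7): 0.8, (2, 9): 0.7,
              (0, 6): 0.6, (3, 8): 0.5, (1, 6): 0.4}
toy_true = {(0, 5), (2, 9), (3, 8)}
print(top_fraction_accuracy(toy_scores, toy_true, seq_len=10, divisor=2))   # 0.6
print(top_fraction_accuracy(toy_scores, toy_true, seq_len=10, divisor=10))  # 1.0
```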

Relevance:

30.00%

Publisher:

Abstract:

Background: The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. Conclusion: The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
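
To make the quantities in this abstract concrete, the sketch below computes a per-residue contact order from a contact list, together with the two reported evaluation metrics, the Pearson correlation coefficient and RMSE. The hard (binary) contact definition and the normalization by sequence length are assumptions; the abstract does not state the exact RWCO formula used, and the function names are hypothetical.

```python
import math

def rwco(contacts, seq_len):
    """Per-residue sum of sequence separations |i - j| over its contacts.

    Assumption: binary contacts and division by seq_len for normalization.
    """
    values = [0.0] * seq_len
    for i, j in contacts:
        separation = abs(i - j)
        values[i] += separation
        values[j] += separation
    return [v / seq_len for v in values]

def pearson_cc(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))

# Hypothetical contacts for a four-residue toy chain.
print(rwco([(0, 3), (1, 3), (0, 2)], seq_len=4))  # [1.25, 0.5, 0.5, 1.25]
```

In the setting described above, `pearson_cc` and `rmse` would be applied to the predicted and observed RWCO profiles of each test protein.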

Relevance:

30.00%

Publisher:

Abstract:

W5.43(194), a conserved tryptophan residue among G-protein coupled receptors (GPCRs) and cannabinoid receptors (CB), was examined in the present report for its significance in CB2 receptor ligand binding and adenylyl cyclase (AC) activity. Computer modeling postulates that this site in CB2 may be involved in the affinity of WIN55212-2 and SR144528 through aromatic contacts. In the present study, we report that a CB2 receptor mutant, W5.43(194)Y, which had a tyrosine (Y) substitution for tryptophan (W), retained the binding affinity for CB agonist CP55940, but reduced binding affinity for CB2 agonist WIN55212-2 and inverse agonist SR144528 by 8-fold and 5-fold, respectively; the CB2 W5.43(194)F and W5.43(194)A mutations significantly affected the binding activities of CP55940, WIN55212-2 and SR144528. Furthermore, we found that agonist-mediated inhibition of the forskolin-induced cAMP production was dramatically diminished in the CB2 mutant W5.43(194)Y, whereas W5.43(194)F and W5.43(194)A mutants resulted in complete elimination of downstream signaling, suggesting that W5.43(194) was essential for the full activation of CB2. These results indicate that both aromatic interaction and hydrogen bonding are involved in ligand binding for the residue W5.43(194), and the mutations of this tryptophan site may affect the conformation of the ligand binding pocket and therefore control the active conformation of the wild type CB2 receptor. W5.43(194)Y/F/A mutations also displayed noticeable enhancement of constitutive activation, probably attributable to the receptor conformational changes resulting from the mutations.

Relevance:

30.00%

Publisher:

Abstract:

A coarse-grained model for protein-folding dynamics is introduced based on a discretized representation of torsional modes. The model, based on the Ramachandran map of the local torsional potential surface and the class (hydrophobic/polar/neutral) of each residue, recognizes patterns of both torsional conformations and hydrophobic-polar contacts, with tolerance for imperfect patterns. It incorporates empirical rates for formation of secondary and tertiary structure. The method yields a topological representation of the evolving local torsional configuration of the folding protein, modulo the basins of the Ramachandran map. The folding process is modeled as a sequence of transitions from one contact pattern to another, as the torsional patterns evolve. We test the model by applying it to the folding process of bovine pancreatic trypsin inhibitor, obtaining a kinetic description of the transitions between the contact patterns visited by the protein along the dominant folding pathway. The kinetics and detailed balance make it possible to invert the result to obtain a coarse topographic description of the potential energy surface along the dominant folding pathway, in effect to go backward or forward between a topological representation of the chain conformation and a topographical description of the potential energy surface governing the folding process. As a result, the strong structure-seeking character of bovine pancreatic trypsin inhibitor and the principal features of its folding pathway are reproduced in a reasonably quantitative way.

Relevance:

30.00%

Publisher:

Abstract:

The aim of this work was to construct short analogues of the repetitive water-binding domain of the Pseudomonas syringae ice nucleation protein, InaZ. Structural analysis of these analogues might provide data pertaining to the protein-water contacts that underlie ice nucleation. An artificial gene coding for a 48-mer repeat sequence from InaZ was synthesized from four oligodeoxyribonucleotides and ligated into the expression vector, pGEX2T. The recombinant vector was cloned in Escherichia coli and a glutathione S-transferase fusion protein obtained. This fusion protein displayed a low level of ice-nucleating activity when tested by a droplet freezing assay. The fusion protein could be cleaved with thrombin, providing a means for future recovery of the 48-mer peptide in amounts suitable for structural analysis by nuclear magnetic resonance spectroscopy.

Relevance:

20.00%

Publisher:

Abstract:

A simple mimetic of a heparan sulfate disaccharide sequence that binds to the growth factors FGF-1 and FGF-2 was synthesized by coupling a 2-azido-2-deoxy-D-glucosyl trichloroacetimidate donor with a 1,6-anhydro-2-azido-2-deoxy-β-D-glucose acceptor. Both the donor and acceptor were obtained from a common intermediate readily obtained from D-glucal. Molecular docking calculations showed that the predicted locations of the disaccharide sulfo groups in the binding site of FGF-1 and FGF-2 are similar to the positions observed for co-crystallized heparin-derived oligosaccharides obtained from published crystal structures.

Relevance:

20.00%

Publisher:

Abstract:

Boards of directors are thought to provide access to a wealth of knowledge and resources for the companies they serve, and are considered important to corporate governance. Under the Resource Based View (RBV) of the firm (Wernerfelt, 1984) boards are viewed as a strategic resource available to firms. As a consequence there has been a significant research effort aimed at establishing a link between board attributes and company performance. In this thesis I explore and extend the study of interlocking directorships (Mizruchi, 1996; Scott, 1991a) by examining the links between directors’ opportunity networks and firm performance. Specifically, I use resource dependence theory (Pfeffer & Salancik, 1978) and social capital theory (Burt, 1980b; Coleman, 1988) as the basis for a new measure of a board’s opportunity network. I contend that both directors’ formal company ties and their social ties determine a director’s opportunity network through which they are able to access and mobilise resources for their firms. This approach is based on recent studies that suggest the measurement of interlocks at the director level, rather than at the firm level, may be a more reliable indicator of this phenomenon. This research uses publicly available data drawn from Australia’s top-105 listed companies and their directors in 1999. I employ Social Network Analysis (SNA) (Scott, 1991b) using the UCINET software to analyse the individual director’s formal and social networks. SNA is used to measure the number of ties a director has to other directors in the top-105 company director network at both one and two degrees of separation, that is, direct ties and indirect (or ‘friend of a friend’) ties. These individual measures of director connectedness are aggregated to produce a board-level network metric for comparison with measures of a firm’s performance using multiple regression analysis. Performance is measured with accounting-based and market-based measures.
Findings indicate that better-connected boards are associated with higher market-based company performance (measured by Tobin’s q). However, weaker and mostly unreliable associations were found for the accounting-based performance measure, ROA. Furthermore, formal (or corporate) network ties are a stronger predictor of market performance than total network ties (comprising social and corporate ties). Similarly, strong ties (connectedness at degree-1) are better predictors of performance than weak ties (connectedness at degree-2). My research makes four contributions to the literature on director interlocks. First, it extends a new way of measuring a board’s opportunity network based on the director rather than the company as the unit of interlock. Second, it establishes evidence of a relationship between market-based measures of firm performance and the connectedness of that firm’s board. Third, it establishes that directors’ formal corporate ties matter more to market-based firm performance than their social ties. Fourth, it establishes that directors’ strong direct ties are more important to market-based performance than weak ties. The thesis concludes with implications for research and practice, including a more speculative interpretation of these results. In particular, I raise the possibility of reverse causality, that is, that networked directors seek to join high-performing companies. Thus, the relationship may be a result of symbolic action by companies seeking to increase the legitimacy of their firms rather than a reflection of the social capital available to the companies. This is an important consideration worthy of future investigation.
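
The connectedness measure described in this abstract, ties at one and two degrees of separation, can be sketched as follows for formal (shared-board) ties only. The thesis used the UCINET software; this plain-Python version, the function names, the toy board data, and the choice to aggregate by summing members' direct ties are all assumptions for illustration.

```python
def ties_by_degree(boards):
    """Map each director to (direct ties, indirect 'friend of a friend' ties).

    boards: dict mapping a company name to the set of its directors.
    Two directors share a direct tie when they sit on the same board.
    """
    neighbors = {}
    for members in boards.values():
        for director in members:
            neighbors.setdefault(director, set()).update(members - {director})
    result = {}
    for director, direct in neighbors.items():
        two_step = set()
        for other in direct:
            two_step |= neighbors[other]
        # Keep only directors reachable in exactly two steps.
        indirect = two_step - direct - {director}
        result[director] = (len(direct), len(indirect))
    return result

def board_connectedness(boards, company, ties):
    """Hypothetical board-level metric: sum of members' direct ties."""
    return sum(ties[d][0] for d in boards[company])

# Hypothetical data: three companies linked by two shared directors.
toy_boards = {"A": {"ann", "bob"}, "B": {"bob", "cy"}, "C": {"cy", "dee"}}
toy_ties = ties_by_degree(toy_boards)
print(toy_ties["bob"])                              # (2, 1)
print(board_connectedness(toy_boards, "B", toy_ties))  # 4
```

Social ties would need a second edge list merged into `neighbors` before counting; they are omitted here because the abstract does not say how they were coded.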