961 results for Protein structural classes
Abstract:
Dissertation presented to obtain the degree of Doctor in Biochemistry, Structural Biochemistry, from the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia
DPS-Like Peroxide Resistance Protein: Structural and Functional Studies on a Versatile Nanocontainer
Abstract:
Oxidative stress is a constant threat to almost all organisms. It damages a number of biomolecules and leads to the disruption of many crucial cellular functions. It is caused by reactive oxygen species (ROS), such as hydrogen peroxide (H
Abstract:
The reliable assessment of the quality of protein structural models is fundamental to the progress of structural bioinformatics. The ModFOLD server provides access to two accurate techniques for the global and local prediction of the quality of 3D models of proteins. The first, ModFOLD, is a fast Model Quality Assessment Program (MQAP) used for the global assessment of either single or multiple models. The second, ModFOLDclust, is a more intensive method that clusters multiple models and provides per-residue local quality assessment.
Abstract:
Classifying proteins into structural classes, which involves inferring patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to obtain experimentally in the laboratory, so the problem has drawn the attention of many researchers in Bioinformatics. Given the large gap between the number of known protein sequences and the number of experimentally determined three-dimensional structures, the demand for automated techniques for the structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential. In this work, the following ML techniques are used to recognize protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods were chosen because they represent different learning paradigms and have been widely used in the Bioinformatics literature. To improve the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are used. Moreover, since the protein database used in this work has imbalanced classes, class-balancing techniques (Random Undersampling, Tomek Links, CNN, NCL and OSS) are used to mitigate this problem. To evaluate the ML methods, a cross-validation procedure is applied, in which classifier accuracy is measured by the mean classification error rate on independent test sets. These means are compared pairwise with a hypothesis test to determine whether the difference between them is statistically significant.
With respect to the individual classifiers, Support Vector Machine achieved the best accuracy. The multi-classification systems (homogeneous and heterogeneous) generally performed as well as or better than the individual classifiers, especially Boosting with Decision Tree and StackingC with Linear Regression as the meta-classifier. The Voting method, despite its simplicity, proved adequate for the problem addressed in this work. The class-balancing techniques, on the other hand, did not significantly improve the global classification error. Nevertheless, they did improve the classification error for the minority class. In this context, the NCL technique has been shown to be more appropriate
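As an illustration of one of the class-balancing techniques named in this abstract, the following is a minimal sketch of Tomek-link undersampling. It is not the authors' implementation: the function names, the Euclidean distance, and the toy 2D points are all assumptions made for the example. A Tomek link is a pair of points from different classes that are each other's nearest neighbor; the sketch drops the majority-class member of every such pair.

```python
import math


def nearest_neighbor(points, i):
    """Return the index of the point closest to points[i], excluding i itself."""
    best, best_dist = None, math.inf
    for j, p in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], p)  # Euclidean distance (an assumption of this sketch)
        if d < best_dist:
            best, best_dist = j, d
    return best


def tomek_links(points, labels):
    """Return pairs (i, j), i < j, that are mutual nearest neighbors with different labels."""
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


def remove_majority_in_links(points, labels, majority):
    """Drop the majority-class member of every Tomek link."""
    drop = set()
    for i, j in tomek_links(points, labels):
        for k in (i, j):
            if labels[k] == majority:
                drop.add(k)
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Hypothetical toy data: class "a" is the majority, class "b" the minority.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (3.0, 3.0), (3.2, 3.0), (6.0, 6.0)]
toy_labels = ["a", "a", "a", "a", "b", "b"]

links = tomek_links(toy_points, toy_labels)  # [(3, 4)]
kept_points, kept_labels = remove_majority_in_links(toy_points, toy_labels, "a")
```

In the toy data only the points at (3.0, 3.0) and (3.2, 3.0) form a Tomek link, so the majority-class point (3.0, 3.0) is removed and the minority class is left untouched.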
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Evidence is growing to support a functional role for the prion protein (PrP) in copper metabolism. Copper ions appear to bind to the protein in a highly conserved octapeptide repeat region (sequence PHGGGWGQ) near the N terminus. To delineate the site and mode of binding of Cu(II) to the PrP, the copper-binding properties of peptides of varying lengths corresponding to 2-, 3-, and 4-octarepeat sequences have been probed by using various spectroscopic techniques. A two-octarepeat peptide binds a single Cu(II) ion with Kd ≈ 6 μM whereas a four-octarepeat peptide cooperatively binds four Cu(II) ions. Circular dichroism spectra indicate a distinctive structuring of the octarepeat region on Cu(II) binding. Visible absorption, visible circular dichroism, and electron spin resonance spectra suggest that the coordination sphere of the copper is identical for 2, 3, or 4 octarepeats, consisting of a square-planar geometry with three nitrogen ligands and one oxygen ligand. Consistent with the pH dependence of Cu(II) binding, proton NMR spectroscopy indicates that the histidine residues in each octarepeat are coordinated to the Cu(II) ion. Our working model for the structure of the complex shows the histidine residues in successive octarepeats bridged between two copper ions, with both the Nε2 and Nδ1 imidazole nitrogen of each histidine residue coordinated and the remaining coordination sites occupied by a backbone amide nitrogen and a water molecule. This arrangement accounts for the cooperative nature of complex formation and for the apparent evolutionary requirement for four octarepeats in the PrP.
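To give a feel for what Kd ≈ 6 μM means for the two-octarepeat peptide, the sketch below computes the fractional occupancy of a single independent binding site as [Cu] / (Kd + [Cu]). The Hill-type function is included only to contrast cooperative with non-cooperative binding; the Hill coefficient and half-saturation value passed to it are assumptions for illustration and are not reported in the abstract.

```python
def fraction_bound(free_cu_um, kd_um=6.0):
    """Fractional occupancy of one independent site at a given free Cu(II) level.

    Both values are in micromolar; kd_um defaults to the ~6 uM quoted above.
    """
    return free_cu_um / (kd_um + free_cu_um)


def hill_fraction(free_cu_um, k_half_um, hill_n):
    """Hill-type occupancy; hill_n > 1 models positive cooperativity.

    k_half_um and hill_n are hypothetical parameters, not values from the study.
    """
    x = free_cu_um ** hill_n
    return x / (k_half_um ** hill_n + x)


# At a free Cu(II) concentration equal to Kd, a single site is half occupied.
half = fraction_bound(6.0)

# With an assumed Hill coefficient of 4, occupancy is lower below the midpoint
# and higher above it than for the independent-site curve.
below = hill_fraction(3.0, 6.0, 4.0)
above = hill_fraction(12.0, 6.0, 4.0)
```

The steeper transition of the Hill curve around its midpoint is the qualitative signature of the cooperative binding described for the four-octarepeat peptide.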
Abstract:
Foldons, which are kinetically competent, quasi-independently folding units of a protein, may be defined using energy landscape analysis. Foldons can be identified by maxima in a scan of the ratio of a contiguous segment's energetic stability gap to the energy variance of that segment's molten globule states, reflecting the requirement of minimal frustration. The predicted foldons are compared with the exons and structural modules for 16 of the 30 proteins studied. Statistical analysis indicates a strong correlation between the energetically determined foldons and Go's geometrically defined structural modules, but there are marked sequence-dependent effects. There is only a weak correlation of foldons to exons. For gammaII-crystallin, myoglobin, barnase, alpha-lactalbumin, and cytochrome c the foldons and some noncontiguous clusters of foldons compare well with intermediates observed in experiment.
Abstract:
Membrane protein structural biology is critically dependent upon the supply of high-quality protein. Over the last few years, the value of crystallising biochemically characterised, recombinant targets that incorporate stabilising mutations has been established. Nonetheless, obtaining sufficient yields of many recombinant membrane proteins is still a major challenge. Solutions are now emerging based on an improved understanding of recombinant host cells; as a 'cell factory' each cell is tasked with managing limited resources to simultaneously balance its own growth demands with those imposed by an expression plasmid. This review examines emerging insights into the role of translation and protein folding in defining high-yielding recombinant membrane protein production in a range of host cells.
Abstract:
Membrane proteins account for a third of the eukaryotic proteome, but are greatly under-represented in the Protein Data Bank. Unfortunately, recent technological advances in X-ray crystallography and EM cannot account for the poor solubility and stability of membrane protein samples. A limitation of conventional detergent-based methods is that detergent molecules destabilize membrane proteins, leading to their aggregation. The use of orthologues, mutants and fusion tags has helped improve protein stability, but at the expense of not working with the sequence of interest. Novel detergents such as glucose neopentyl glycol (GNG), maltose neopentyl glycol (MNG) and calixarene-based detergents can improve protein stability without compromising their solubilizing properties. Styrene maleic acid lipid particles (SMALPs) focus on retaining the native lipid bilayer of a membrane protein during purification and biophysical analysis. Overcoming bottlenecks in the membrane protein structural biology pipeline, primarily by maintaining protein stability, will facilitate the elucidation of many more membrane protein structures in the near future.
Abstract:
In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.
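The "proteins as documents, contacts as terms" idea can be sketched as follows. This is an illustrative toy, not the authors' pipeline: the binary contact matrix is invented, all function names are hypothetical, and a plain power iteration on the Gram matrix stands in for a library SVD routine so that the example needs only the standard library. Each protein is projected onto the top k singular directions, and proteins are compared by cosine similarity in that reduced space.

```python
import math


def gram_matrix(term_doc):
    """Return A^T A for a terms-by-documents matrix A (documents are columns)."""
    n_docs = len(term_doc[0])
    return [
        [sum(row[i] * row[j] for row in term_doc) for j in range(n_docs)]
        for i in range(n_docs)
    ]


def top_eigenpairs(sym, k, iters=500):
    """Top-k eigenpairs of a symmetric PSD matrix via power iteration with deflation."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(n)]  # fixed, asymmetric start vector
        for _ in range(iters):
            w = [sum(a * b for a, b in zip(row, v)) for row in work]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        wv = [sum(a * b for a, b in zip(row, v)) for row in work]
        lam = sum(x * y for x, y in zip(v, wv))
        pairs.append((lam, v))
        # Deflate: remove the component just found before searching for the next one.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs


def lsi_document_vectors(term_doc, k):
    """Coordinates of each document in the rank-k latent space (sigma_i * v_i[j])."""
    pairs = top_eigenpairs(gram_matrix(term_doc), k)
    n_docs = len(term_doc[0])
    return [
        [math.sqrt(max(lam, 0.0)) * v[j] for lam, v in pairs]
        for j in range(n_docs)
    ]


def cosine(u, v):
    """Cosine similarity, returning 0.0 when either vector has zero length."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


# Hypothetical toy matrix: rows are contacts (terms), columns are proteins (documents).
contacts = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
]

doc_vectors = lsi_document_vectors(contacts, k=2)
sim_same_fold = cosine(doc_vectors[0], doc_vectors[1])
sim_other_fold = cosine(doc_vectors[0], doc_vectors[2])
```

In this toy matrix the first two proteins share most of their contacts, so their similarity in the latent space is higher than that between the first and third proteins. A retrieval system of the kind described would rank database chains by such a similarity to the query.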
Abstract:
The Schistosoma mansoni fatty acid binding protein (FABP), Sm14, is a vaccine candidate against S. mansoni and F. hepatica. Previously, we demonstrated the importance of a correct fold to achieve protection in immunized animals after cercariae challenge [10] (C.R.R. Ramos, R.C.R. Figueredo, T.A. Pertinhez, M.M. Vilar, A.L.T.O. Nascimento, M. Tendler, I. Raw, A. Spisni, P.L. Ho, Gene structure and M20T polymorphism of the Schistosoma mansoni Sm14 fatty acid-binding protein: structural, functional and immunoprotection analysis. J. Biol. Chem. 278 (2003) 12745-12751). Here we show that the reduction of vaccine efficacy over time is due to protein dimerization and subsequent aggregation. We produced the mutants Sm14-M20(C62S) and Sm14-M20(C62V) that, as expected, did not dimerize in SDS-PAGE. Molecular dynamics calculations and unfolding experiments highlighted a higher structural stability of these mutants with respect to the wild-type. In addition, we found that the mutated proteins, after thermal denaturation, refolded to their active native molecular architecture, as shown by the recovery of the fatty acid binding ability. Sm14-M20(C62V) turned out to be the more stable form over time, providing the basis to determine the first 3D solution structure of a Sm14 protein in its apo-form. Overall, Sm14-M20(C62V) possesses an improved structural stability over time, an essential feature to preserve its immunization capability and, in experimentally immunized animals, it exhibits a protective effect against S. mansoni cercariae infections comparable to the one obtained with the wild-type protein. These facts make this protein a good lead molecule for large-scale production and for developing an effective Sm14-based anti-helminth vaccine. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
Background: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features actually contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function initially harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weight to the n-grams that appear in fewer classes and vice versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences.
Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is very generic and can thus be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion: The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets, have a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes with a potential application to detecting proteome-specific motifs of different organisms.
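The harvesting and normalization steps described in this abstract can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the published scoring function: the BLOSUM62-based merging of similar n-grams is omitted entirely, the dampening factor is replaced by an assumed IDF-like weight log(1 + classes / classes containing the n-gram), and the class names, sequences, and threshold are invented toy values.

```python
import math
from collections import Counter


def harvest_ngrams(sequence, n_min=4, n_max=8):
    """Yield every contiguous n-gram of length n_min..n_max from a sequence."""
    for n in range(n_min, n_max + 1):
        for i in range(len(sequence) - n + 1):
            yield sequence[i:i + n]


def class_ngram_counts(sequences_by_class, n_min=4, n_max=8):
    """Count n-gram occurrences separately for each class."""
    return {
        label: Counter(
            gram for seq in seqs for gram in harvest_ngrams(seq, n_min, n_max)
        )
        for label, seqs in sequences_by_class.items()
    }


def discriminative_ngrams(sequences_by_class, threshold, n_min=4, n_max=8):
    """Return, per class, the n-grams whose dampened frequency reaches the threshold."""
    counts = class_ngram_counts(sequences_by_class, n_min, n_max)
    n_classes = len(counts)
    result = {}
    for label, counter in counts.items():
        n_seqs = len(sequences_by_class[label])
        scored = {}
        for gram, count in counter.items():
            classes_with = sum(1 for other in counts.values() if gram in other)
            # Assumed IDF-like dampening: n-grams seen in fewer classes weigh more.
            damp = math.log(1 + n_classes / classes_with)
            # Dividing by the class size offsets unbalanced class sizes.
            score = (count / n_seqs) * damp
            if score >= threshold:
                scored[gram] = score
        result[label] = scored
    return result


# Hypothetical toy dataset with two classes of short peptide sequences.
toy_sequences = {
    "nuclear": ["MKKRKAAA", "GGKKRKTT"],
    "other": ["MAAAGGTT", "GGTTAAAG"],
}

motifs = discriminative_ngrams(toy_sequences, threshold=1.0, n_min=4, n_max=4)
```

With these toy sequences, only n-grams that occur in every sequence of one class and in no sequence of the other pass the threshold. A fuller version would first merge n-grams whose aligned residues have positive BLOSUM62 scores, as the abstract describes.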