990 results for PROTEIN FAMILIES
Abstract:
Background: Developing sensitive sequence search procedures that detect distant relationships between proteins at the superfamily/fold level remains a major challenge. The intermediate sequence search approach is the most frequently employed way of identifying remote homologues effectively. In this study, we examined serine proteases of the prolyl oligopeptidase, rhomboid and subtilisin protein families, using plant serine proteases from two genomes (A. thaliana and O. sativa) as queries, together with 13 other families of unrelated folds, to identify distant homologues that could not be obtained using PSI-BLAST. Methodology/Principal Findings: We propose starting with multiple queries of classical serine protease members to identify remote homologues in families, using a rigorous approach such as Cascade PSI-BLAST. We found that classical sequence-based approaches, such as PSI-BLAST, showed very low sequence coverage in identifying plant serine proteases. The algorithm was applied to an enriched sequence database of homologous domains, and we obtained an overall average coverage of 88% at the family level and 77% at the superfamily or fold level, along with a specificity of ~100% and a Matthews correlation coefficient of 0.91. A similar approach was also implemented on 13 other protein families representing every structural class in the SCOP database. Further investigation with statistical tests, such as jackknifing, helped us to better understand the influence of neighbouring protein families. Conclusions/Significance: Our study suggests that employing multiple queries from a family in Cascade PSI-BLAST searches is useful for predicting distant relationships effectively, even at the superfamily level. We have proposed a generalized strategy to cover all the distant members of a particular family using multiple query sequences. 
Our findings reveal that prior selection of query sequences and the presence of neighbouring families can be important for covering the search space effectively in minimal computational time. This study also provides an understanding of the 'bridging' role of related families.
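The coverage, specificity and Matthews correlation coefficient quoted in this abstract can be related to raw hit counts with a minimal sketch such as the one below. It assumes that "coverage" means the fraction of true family members recovered (sensitivity); the abstract does not define it, and the function names and counts are purely illustrative.

```python
import math


def coverage(tp, fn):
    """Fraction of true homologues that the search recovered (assumed definition)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-homologues that the search correctly rejected."""
    return tn / (tn + fp)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Conventional fallback when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

With hypothetical counts, `coverage(88, 12)` corresponds to 88% coverage, and `matthews_cc` ranges from -1 (every call wrong) to +1 (every call right).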
Abstract:
Dense core granules (DCGs) in Tetrahymena thermophila contain two protein classes. Proteins in the first class, called granule lattice (Grl), coassemble to form a crystalline lattice within the granule lumen. Lattice expansion acts as a propulsive mechanism during DCG release, and Grl proteins are essential for efficient exocytosis. The second protein class, defined by a C-terminal β/γ-crystallin domain, is poorly understood. Here, we have analyzed the function and sorting of Grt1p (granule tip), which was previously identified as an abundant protein in this family. Cells lacking all copies of GRT1, together with the closely related GRT2, accumulate wild-type levels of docked DCGs. Unlike cells disrupted in any of the major GRL genes, ΔGRT1 ΔGRT2 cells show no defect in secretion, indicating that neither exocytic fusion nor core expansion depends on GRT1. These results suggest that Grl protein sorting to DCGs is independent of Grt proteins. Consistent with this, the granule core lattice in ΔGRT1 ΔGRT2 cells appears identical to that in wild-type cells by electron microscopy, and the only biochemical component visibly absent is Grt1p itself. Moreover, gel filtration showed that Grl and Grt proteins in cell homogenates exist in nonoverlapping complexes, and affinity-isolated Grt1p complexes do not contain Grl proteins. These data demonstrate that two major classes of proteins in Tetrahymena DCGs are likely to be independently transported during DCG biosynthesis and play distinct roles in granule function. The role of Grt1p may primarily be postexocytic; consistent with this idea, DCG contents from ΔGRT1 ΔGRT2 cells appear less adhesive than those from the wild type.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1 000 000 hits from 462 500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk.
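Of the signature types listed in this abstract, regular expressions are the simplest to illustrate. The sketch below converts a PROSITE-style pattern string into a Python regular expression and scans a sequence with it. The converter handles only a common subset of the syntax (`x`, repeat counts, `[...]` alternatives, `{...}` exclusions, `<` and `>` anchors); the function name and example pattern are illustrative and are not taken from InterPro itself.

```python
import re

# One pattern element: optional '<' anchor, a residue spec, optional repeat, optional '>' anchor.
_ELEMENT = re.compile(
    r"(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (subset of the syntax) into a Python regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: %r" % element)
        start, core, low, high, end = match.groups()
        if core == "x":
            core = "."                      # x means any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # {ABC} means any residue except A, B, C
        repeat = ""
        if low:
            repeat = "{%s}" % low if high is None else "{%s,%s}" % (low, high)
        parts.append(("^" if start else "") + core + repeat + ("$" if end else ""))
    return "".join(parts)


def find_signature(pattern, sequence):
    """Return (start, end) spans of every non-overlapping match in the sequence."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.span() for m in regex.finditer(sequence)]
```

Profiles, fingerprints and hidden Markov models need scoring machinery that a plain regular expression cannot express, so they are out of scope for this sketch.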
Abstract:
Thyrotropin is the primary hormone that, via one heptahelical receptor, regulates thyroid cell functions such as secretion, specific gene expression, and growth. In human thyroid, thyrotropin receptor activation leads to stimulation of the adenylyl cyclase and phospholipase C cascades. However, the G proteins involved in thyrotropin receptor action have been only partially defined. In membranes of human thyroid gland, we immunologically identified α subunits of the G proteins Gs short, Gs long, Gi1, Gi2, Gi3, Go (Go2 and another form of Go, presumably Go1), Gq, G11, G12, and G13. Activation of the thyrotropin (TSH) receptor by bovine TSH led to increased incorporation of the photoreactive GTP analogue [α-32P]GTP azidoanilide into immunoprecipitated α subunits of all G proteins detected in thyroid membranes. This effect was receptor-dependent and not due to direct G protein stimulation because it was mimicked by TSH receptor-stimulating antibodies of patients suffering from Graves' disease and was abolished by a receptor-blocking antiserum from a patient with autoimmune hypothyroidism. The TSH-induced activation of individual G proteins occurred with EC50 values of 5-50 milliunits/ml, indicating that the activated TSH receptor coupled with similar potency to different G proteins. When human thyroid slices were pretreated with pertussis toxin, the TSH receptor-mediated accumulation of cAMP increased by approximately 35% with TSH at 1 milliunit/ml, indicating that the TSH receptor coupled to Gs and Gi. Taken together, these findings show that, at least in human thyroid membranes, in which the protein is expressed at its physiological levels, the TSH receptor resembles a naturally occurring example of a general G protein-activating receptor.
Abstract:
Background: Disulphide bridges are well known to play key roles in the stability, folding and functions of proteins. Introduction or deletion of disulphides by site-directed mutagenesis has produced varying effects on stability and folding, depending on the protein and the location of the disulphide in the 3-D structure. Given the lack of complete understanding, it is worthwhile to learn from an analysis of the extent of conservation of disulphides in homologous proteins. We have also addressed the question of which structural interactions in one homologue replace a disulphide present in another homologue. Results: Using a dataset involving 34,752 pairwise comparisons of homologous protein domains corresponding to 300 protein domain families of known 3-D structures, we provide a comprehensive analysis of the extent of conservation of disulphide bridges and their structural features. We report that only 54% of all the disulphide bonds compared between the homologous pairs are conserved, although a small fraction of the non-conserved disulphides do involve cytoplasmic proteins. Also, only about one fourth of the distinct disulphides are conserved in all the members of protein families. We note that while conservation of disulphides is common in many families, disulphide bond mutations are quite prevalent. Interestingly, we note that there is no clear relationship between the sequence identity of two homologous proteins and disulphide bond conservation. Our analysis of structural features at the sites where cysteines forming a disulphide in one homologue are replaced by non-Cys residues shows that the elimination of a disulphide in a homologue need not always result in stabilizing interactions between equivalent residues. Conclusion: We observe that in homologous proteins, disulphide bonds are conserved only to a modest extent. Very interestingly, we note that the extent of conservation of disulphides in homologous proteins is unrelated to the overall sequence identity between homologues. 
The non-conserved disulphides are often associated with variable structural features that were recruited for differentiation or specialisation of protein function.
Abstract:
Structural similarity among proteins is reflected in the distribution of hydropathicity along the amino acids in the protein sequence. Similarities in the hydropathy distributions are obvious for homologous proteins within a protein family. They also were observed for proteins with related structures, even when sequence similarities were undetectable. Here we present a novel method that employs the hydropathy distribution in proteins for identification of (sub)families in a set of (homologous) proteins. We represent proteins as points in a generalized hydropathy space, represented by vectors of specifically defined features. The features are derived from hydropathy of the individual amino acids. Projection of this space onto principal axes reveals groups of proteins with related hydropathy distributions. The groups identified correspond well to families of structurally and functionally related proteins. We found that this method accurately identifies protein families in a set of proteins, or subfamilies in a set of homologous proteins. Our results show that protein families can be identified by the analysis of hydropathy distribution, without the need for sequence alignment. (C) 2005 Wiley-Liss, Inc.
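The idea of placing proteins in a hydropathy feature space and projecting onto a principal axis can be sketched as follows. The abstract does not specify its "specifically defined features", so this example assumes a simple stand-in: the mean Kyte-Doolittle hydropathy of a few equal-length sequence segments. The first principal axis is found by power iteration to keep the code dependency-free; all function names are illustrative.

```python
import math

# Kyte-Doolittle hydropathy scale (an assumed choice; the abstract names no scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_features(sequence, n_segments=3):
    """Mean hydropathy of n equal-length segments (needs len(sequence) >= n_segments)."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    features = []
    for k in range(n_segments):
        start = k * len(values) // n_segments
        end = (k + 1) * len(values) // n_segments
        chunk = values[start:end]
        features.append(sum(chunk) / len(chunk))
    return features


def first_principal_axis(vectors, iterations=200):
    """Return (axis, scores): the leading principal axis and each vector's projection."""
    n, d = len(vectors), len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - means[j] for j in range(d)] for v in vectors]
    axis = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        # One power-iteration step: multiply the axis by X^T X, then renormalise.
        scores = [sum(row[j] * axis[j] for j in range(d)) for row in centered]
        updated = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in updated))
        if norm == 0:
            break
        axis = [x / norm for x in updated]
    scores = [sum(row[j] * axis[j] for j in range(d)) for row in centered]
    return axis, scores
```

Sequences whose scores cluster together on the leading axis would be candidate members of one (sub)family. A real analysis would use richer features and more than one axis.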
Abstract:
Autoimmune diseases are a major health problem. Usually autoimmune disorders are multifactorial and their pathogenesis involves a combination of predisposing variations in the genome and other factors such as environmental triggers. APECED (autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy) is a rare, recessively inherited, autoimmune disease caused by mutations in a single gene. Patients with APECED suffer from several organ-specific autoimmune disorders, often affecting the endocrine glands. The defective gene, AIRE, codes for a transcriptional regulator. The AIRE (autoimmune regulator) protein controls the expression of hundreds of genes, representing a substantial subset of tissue-specific antigens which are presented to developing T cells in the thymus, and has proven to be a key molecule in the establishment of immunological tolerance. However, the molecular mechanisms by which AIRE mediates its functions are still largely obscure. The aim of this thesis has been to elucidate the functions of AIRE by studying its molecular interactions in different cultured cell models. A potential molecular mechanism for the exceptional, dominant inheritance of APECED in one family, carrying a glycine 228 to tryptophan (G228W) mutation, was described in this thesis. It was shown that the AIRE polypeptide with the G228W mutation has a dominant negative effect by binding the wild-type AIRE and inhibiting its transactivation capacity in vitro. The data also emphasize the importance of homomultimerization of AIRE in vivo. Furthermore, two novel protein families interacting with AIRE were identified. The importin alpha molecules regulate the nuclear import of AIRE by binding to the nuclear localization signal of AIRE, delineated as a classical monopartite signal sequence. The interaction of AIRE with PIAS E3 SUMO ligases indicates a link to the sumoylation pathway, which plays an important role in the regulation of nuclear architecture. 
It was shown that AIRE is not a target for SUMO modification but enhances the localization of SUMO1 and PIAS1 proteins to nuclear bodies. Additional support for the suggestion that AIRE preferentially up-regulates genes with a tissue-specific expression pattern and down-regulates housekeeping genes was obtained from transactivation studies performed with two models: the human insulin and cystatin B promoters. Furthermore, AIRE and PIAS activate the insulin promoter concurrently in a transactivation assay, indicating that their interaction is biologically relevant. Identification of novel interaction partners for AIRE provides us with information about the molecular pathways involved in the establishment of immunological tolerance and deepens our understanding of the role played by AIRE not only in APECED but possibly also in several other autoimmune diseases.
Abstract:
Filamentous fungi of the subphylum Pezizomycotina are well known as protein and secondary metabolite producers. Various industries take advantage of these capabilities. However, the molecular biology of yeasts, i.e. Saccharomycotina, and especially that of Saccharomyces cerevisiae, the baker's yeast, is much better known. In an effort to explain fungal phenotypes through their genotypes, we have compared the protein-coding gene contents of Pezizomycotina and Saccharomycotina. Only protein families related to biomass degradation and secondary metabolism seem to have expanded recently in Pezizomycotina. Of the protein families clearly diverged between Pezizomycotina and Saccharomycotina, those related to mitochondrial functions emerge as the most prominent. However, the primary metabolism as described in S. cerevisiae is largely conserved in all fungi. Apart from the known secondary metabolism, Pezizomycotina have pathways that could link secondary metabolism to primary metabolism and a wealth of undescribed enzymes. Previous studies of individual Pezizomycotina genomes have shown that, regardless of the difference in production efficiency and diversity of secreted proteins, the content of the known secretion machinery genes in Pezizomycotina and Saccharomycotina appears very similar. Genome-wide analysis of gene products is therefore needed to better understand the efficient secretion of Pezizomycotina. We have developed methods applicable to transcriptome analysis of non-sequenced organisms. TRAC (Transcriptional profiling with the aid of affinity capture) has been previously developed at VTT for fast, focused transcription analysis. We introduce a version of TRAC that allows more powerful signal amplification and multiplexing. We also present computational optimisations of transcriptome analysis of non-sequenced organisms and of TRAC analysis in general. Trichoderma reesei is one of the most commonly used Pezizomycotina in the protein production industry. 
In order to understand its secretion system better and find clues for improvement of its industrial performance, we have analysed its transcriptomic response to protein secretion stress conditions. In comparison to S. cerevisiae, the response of T. reesei appears different, but still impacts on the same cellular functions. We also discovered in T. reesei interesting similarities to mammalian protein secretion stress response. Together these findings highlight targets for more detailed studies.
Abstract:
We explore the use of information on co-occurrence of domains in multi-domain proteins in predicting protein-protein interactions. The basic premise of our work is the assumption that domains co-occurring in a polypeptide chain undergo either structural or functional interactions among themselves. In this study we use a template dataset of domains in multi-domain proteins and predict protein-protein interactions in a target organism. We note that the maximum number of correct predictions of interacting protein domain families (158) is made in S. cerevisiae when the dataset of closely related organisms is used as the template, followed by the more diverse dataset of bacterial proteins (48) and a dataset of randomly chosen proteins (23). We conclude that use of multi-domain information from organisms closely related to the target can aid prediction of interacting protein families.
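The premise of this abstract, that domain families co-occurring in one polypeptide chain of a template organism may interact when they sit on separate proteins in a target organism, can be sketched as below. The data layout (dictionaries mapping protein names to lists of domain family names) and the function names are assumptions for illustration; the abstract does not describe the authors' actual data structures or scoring.

```python
from itertools import combinations


def cooccurring_domain_pairs(template_proteins):
    """Collect unordered pairs of distinct domain families found in the same template protein."""
    pairs = set()
    for domains in template_proteins.values():
        for first, second in combinations(sorted(set(domains)), 2):
            pairs.add((first, second))
    return pairs


def predict_interactions(target_proteins, domain_pairs):
    """Predict target protein pairs that carry the two halves of a co-occurring domain pair."""
    predictions = set()
    names = sorted(target_proteins)
    for name_a, name_b in combinations(names, 2):
        domains_a = target_proteins[name_a]
        domains_b = target_proteins[name_b]
        if any(
            tuple(sorted((da, db))) in domain_pairs
            for da in domains_a
            for db in domains_b
        ):
            predictions.add((name_a, name_b))
    return predictions
```

For example, if a hypothetical template protein contains both an SH3 and a kinase domain, two target proteins carrying SH3 and kinase domains separately would be predicted to interact. Counting how many such predictions are confirmed by known interactions would give figures comparable in kind to the 158, 48 and 23 reported above.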
Abstract:
Protein functional annotation relies on the identification of accurate relationships, sequence divergence being a key factor. This is especially evident when distant protein relationships are demonstrated only with three-dimensional structures. To address this challenge, we describe a computational approach to purposefully bridge gaps between related protein families through directed design of protein-like "linker" sequences. For this, we represented SCOP domain families, integrated with sequence homologues, as multiple profiles and performed HMM-HMM alignments between related domain families. Where convincing alignments were achieved, we applied a roulette wheel-based method to design 3,611,010 protein-like sequences corresponding to 374 SCOP folds. To analyze their ability to link proteins in homology searches, we used 3024 queries to search two databases, one containing only natural sequences and another additionally containing designed sequences. Our results showed that augmented database searches gave up to 30% improvement in fold coverage for over 74% of the folds, with 52 folds achieving all theoretically possible connections. Although sequences could not be designed between some families, the availability of designed sequences between other families within the fold established the sequence continuum to demonstrate 373 difficult relationships. Ultimately, as a practical and realistic extension, we demonstrate that such protein-like sequences can be "plugged into" routine and generic sequence database searches to empower not only remote homology detection but also fold recognition. Our findings, which are well supported statistically, show that complementary searches in both databases will increase the effectiveness of sequence-based searches in recognizing all homologues sharing a common fold. © 2013 Elsevier Ltd. All rights reserved.
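Roulette wheel selection, the sampling scheme named in this abstract, picks each item with probability proportional to its weight. The sketch below applies it column by column to a toy profile to generate protein-like sequences. The profile format (one dictionary of residue weights per alignment position) and the function names are assumptions for illustration; the abstract does not describe how the authors derived their weights from the HMM-HMM alignments.

```python
import random


def roulette_pick(weights, rng):
    """Pick one key with probability proportional to its (positive) weight."""
    positive = [(residue, w) for residue, w in weights.items() if w > 0]
    if not positive:
        raise ValueError("at least one weight must be positive")
    total = sum(w for _, w in positive)
    spin = rng.random() * total          # where the wheel stops, in [0, total)
    cumulative = 0.0
    for residue, w in positive:
        cumulative += w
        if spin < cumulative:
            return residue
    return positive[-1][0]               # guard against floating-point rounding


def design_sequence(profile, rng):
    """Sample one residue per profile column to build a protein-like sequence."""
    return "".join(roulette_pick(column, rng) for column in profile)


def design_many(profile, count, seed=0):
    """Generate `count` sequences reproducibly from one seeded random generator."""
    rng = random.Random(seed)
    return [design_sequence(profile, rng) for _ in range(count)]
```

Calling `design_many` on a hypothetical two-column profile such as `[{"A": 0.7, "G": 0.3}, {"L": 0.5, "I": 0.5}]` yields two-residue strings in which "A" appears at the first position about 70% of the time.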
Abstract:
NrichD
Abstract:
Protein tyrosine phosphatases (PTPs) comprise two superfamilies: the phosphatase I superfamily, containing a single low-molecular-weight PTP (lmwPTP) family, and the phosphatase II superfamily, including both the higher-molecular-weight PTP (hmwPTP) and the dual-specificity phosphatase (DSP) families. The phosphatase I and II superfamilies are often considered to be the result of convergent evolution. The PTP sequence and structure analyses indicate that lmwPTPs, hmwPTPs, and DSPs share similar structures, functions, and a common signature motif, although they have low sequence identities and a different order of active sites in sequence, or a circular permutation. The results of this work suggest that lmwPTPs and hmwPTPs/DSPs are remotely related in evolution. The earliest ancestral gene of PTPs could be from a short fragment containing about 90~120 nucleotides or 30~40 residues; however, a probable full PTP ancestral gene contained one transcript unit with two lmwPTP genes. All three PTP families may have resulted from a common ancestral gene by a series of duplications, fusions, and circular permutations. The circular permutation in PTPs is caused by a reading frame difference, which is similar to that in DNA methyltransferases. Nevertheless, the evolutionary mechanism of circular permutation in PTP genes seems to be more complicated than that in DNA methyltransferase genes. Both mechanisms in PTPs and DNA methyltransferases can be used to explain how some protein families and superfamilies came to be formed by circular permutations during molecular evolution.