990 results for PROTEIN FAMILIES
Abstract:
Sequence motifs occurring in a particular order in proteins or DNA have been proved to be of biological interest. In this paper, a new method to locate the occurrences of up to five user-defined motifs in a specified order in large proteins and in nucleotide sequence databases is proposed. It has been designed using the concept of quantifiers in regular expressions and linked lists for data storage. The application of this method includes the extraction of relevant consensus regions from biological sequences. This might be useful in clustering of protein families as well as to study the correlation between positions of motifs and their functional sites in DNA sequences.
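As a rough illustration of the idea in the abstract above, ordered motifs can be located by joining them with lazy regular-expression quantifiers. This is a minimal sketch and not the authors' implementation: the function name `find_ordered_motifs`, the `max_gap` parameter, and the example sequence are assumptions introduced here, and a plain Python list stands in for the linked lists mentioned in the abstract.

```python
import re

def find_ordered_motifs(sequence, motifs, max_gap=None):
    """Find places where the motifs occur in the given order.

    Each motif is treated as a regular-expression fragment. Returns one
    list per hit, holding a (start, end, text) tuple for every motif.
    """
    if not 1 <= len(motifs) <= 5:
        raise ValueError("expected between one and five motifs")
    # A lazy quantifier spans the gap between consecutive motifs.
    # max_gap is a hypothetical option that bounds the gap length.
    gap = ".*?" if max_gap is None else ".{0,%d}?" % max_gap
    # Named groups keep the bookkeeping correct even if a motif
    # contains its own parentheses.
    parts = ["(?P<m%d>%s)" % (i, m) for i, m in enumerate(motifs)]
    pattern = re.compile(gap.join(parts))
    hits = []
    for match in pattern.finditer(sequence):
        hits.append([
            (match.start("m%d" % i), match.end("m%d" % i), match.group("m%d" % i))
            for i in range(len(motifs))
        ])
    return hits

# Illustrative sequence only; it is not taken from any database.
example = "MKTGAGKSTAAADEADLLLHRGDS"
hits = find_ordered_motifs(example, ["G.GKS", "DEAD", "RGD"])
```

Positions are zero-based with an exclusive end, following Python's slicing convention. Supplying the same motifs in a different order yields no hits for this sequence, which is the behaviour an order-sensitive search should have.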
Abstract:
Protein structure comparison is essential for understanding various aspects of protein structure, function and evolution. It can be used to explore the structural diversity and evolutionary patterns of protein families. In view of the above, a new algorithm is proposed which performs faster protein structure comparison using the peptide backbone torsional angles. It is fast, robust, computationally less expensive and efficient in finding structural similarities between two different protein structures and is also capable of identifying structural repeats within the same protein molecule.
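The abstract above does not give the algorithm itself, so the following is only a sketch of one simple way to compare structures by backbone torsion angles: it slides a fixed-length window over two lists of (phi, psi) pairs and reports window pairs whose angular root-mean-square difference is small. The function names, the window length, and the cutoff are assumptions chosen for illustration.

```python
import math

def angular_difference(a, b):
    """Smallest difference between two angles in degrees (handles wraparound)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def torsion_rmsd(frag_a, frag_b):
    """Root-mean-square angular difference over equal-length (phi, psi) lists."""
    if len(frag_a) != len(frag_b) or not frag_a:
        raise ValueError("fragments must be non-empty and of equal length")
    total = 0.0
    for (phi_a, psi_a), (phi_b, psi_b) in zip(frag_a, frag_b):
        total += angular_difference(phi_a, phi_b) ** 2
        total += angular_difference(psi_a, psi_b) ** 2
    return math.sqrt(total / (2 * len(frag_a)))

def similar_fragments(angles_a, angles_b, window=4, cutoff=20.0):
    """Return (i, j, score) for every pair of windows scoring at or below cutoff.

    window and cutoff are hypothetical defaults, not values from the paper.
    """
    hits = []
    for i in range(len(angles_a) - window + 1):
        for j in range(len(angles_b) - window + 1):
            score = torsion_rmsd(angles_a[i:i + window], angles_b[j:j + window])
            if score <= cutoff:
                hits.append((i, j, score))
    return hits

# Idealised helix-like and strand-like angles, used only as toy input.
helix = [(-60.0, -45.0)] * 4
strand = [(-120.0, 130.0)] * 4
protein_a = helix + strand
protein_b = strand + helix
matches = similar_fragments(protein_a, protein_b)
```

Because torsion angles are internal coordinates, a comparison of this kind needs no superposition of the two structures. Passing the same angle list as both arguments and ignoring pairs with `i == j` would give a crude search for repeats within one molecule.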
Abstract:
The availability of the genome sequence of Mycobacterium tuberculosis H37Rv has encouraged determination of large numbers of protein structures and detailed definition of the biological information encoded therein; yet, the functions of many proteins in M. tuberculosis remain unknown. The emergence of multidrug-resistant strains makes it a priority to exploit recent advances in homology recognition and structure prediction to re-analyse its gene products. Here we report the structural and functional characterization of gene products encoded in the M. tuberculosis genome, with the help of sensitive profile-based remote homology search and fold recognition algorithms, resulting in an enhanced annotation of the proteome in which 95% of the M. tuberculosis proteins were identified wholly or partly with information on structure or function. New information includes association of 244 proteins with 205 domain families and a separate set of new associations of folds to 64 proteins. Extending structural information across uncharacterized protein families represented in the M. tuberculosis proteome, by determining superfamily relationships between families of known and unknown structures, has contributed to an enhancement in the knowledge of structural content. In retrospect, such superfamily relationships have facilitated recognition of probable structure and/or function for several uncharacterized protein families, eventually aiding recognition of probable functions for homologous proteins corresponding to such families. There are 183 gene products unique to mycobacteria for which no function could be identified; of these, 18 were determined to be M. tuberculosis specific. Such pathogen-specific proteins are speculated to harbour virulence factors required for pathogenesis. A re-annotated proteome of M. tuberculosis, with greater completeness of annotated proteins and domain assigned regions, provides a valuable basis for experimental endeavours designed to obtain a better understanding of pathogenesis and to accelerate the process of drug target discovery. (C) 2014 Elsevier Ltd. All rights reserved.
Abstract:
Background: In the post-genomic era where sequences are being determined at a rapid rate, we are highly reliant on computational methods for their tentative biochemical characterization. The Pfam database currently contains 3,786 families corresponding to "Domains of Unknown Function" (DUF) or "Uncharacterized Protein Family" (UPF), of which 3,087 families have no reported three-dimensional structure, constituting almost one-fourth of the known protein families in search for both structure and function. Results: We applied a 'computational structural genomics' approach using five state-of-the-art remote similarity detection methods to detect the relationship between uncharacterized DUFs and domain families of known structures. The association with a structural domain family could serve as a starting point in elucidating the function of a DUF. Amongst these five methods, searches in SCOP-NrichD database have been applied for the first time. Predictions were classified into high, medium and low-confidence based on the consensus of results from various approaches and also annotated with enzyme and Gene Ontology terms. 614 uncharacterized DUFs could be associated with a known structural domain, of which high confidence predictions, involving at least four methods, were made for 54 families. These structure-function relationships for the 614 DUF families can be accessed on-line at http://proline.biochem.iisc.ernet.in/RHD_DUFS/. For potential enzymes in this set, we assessed their compatibility with the associated fold and performed detailed structural and functional annotation by examining alignments and extent of conservation of functional residues. Detailed discussion is provided for interesting assignments for DUF3050, DUF1636, DUF1572, DUF2092 and DUF659. Conclusions: This study provides insights into the structure and potential function for nearly 20% of the DUFs. Use of different computational approaches enables us to reliably recognize distant relationships, especially when they converge to a common assignment because the methods are often complementary. We observe that while pointers to the structural domain can offer the right clues to the function of a protein, recognition of its precise functional role is still 'non-trivial' with many DUF domains conserving only some of the critical residues. It is not clear whether these are functional vestiges or instances involving alternate substrates and interacting partners. Reviewers: This article was reviewed by Drs Eugene Koonin, Frank Eisenhaber and Srikrishna Subramanian.
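The consensus step described in the abstract above can be sketched as a simple vote count across methods. Only the high-confidence rule (agreement of at least four methods) comes from the abstract; the medium and low thresholds, the function name `classify_prediction`, and the method and fold labels below are assumptions made for illustration.

```python
from collections import Counter

def classify_prediction(assignments, high_min=4, medium_min=2):
    """Classify a DUF's structural assignment by agreement between methods.

    assignments maps a method name to the structural family it proposes,
    or to None when that method found no hit. high_min=4 follows the
    abstract; medium_min=2 is a hypothetical threshold.
    Returns (family, number_of_supporting_methods, confidence_label).
    """
    votes = Counter(f for f in assignments.values() if f is not None)
    if not votes:
        return None, 0, "none"
    family, support = votes.most_common(1)[0]
    if support >= high_min:
        label = "high"
    elif support >= medium_min:
        label = "medium"
    else:
        label = "low"
    return family, support, label

# Placeholder method and family names; the abstract does not list them.
example = {
    "method_1": "family_x",
    "method_2": "family_x",
    "method_3": "family_x",
    "method_4": "family_x",
    "method_5": "family_y",
}
result = classify_prediction(example)
```

Requiring several independent methods to converge on one family reflects the abstract's point that the methods are often complementary, so agreement between them is stronger evidence than any single hit.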
Abstract:
Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study with 39 protein domains from the lipocalin superfamily suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster based on their mode of ligand binding, though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families where members show poor sequence similarity but high structural similarity.
Abstract:
Proteolytic enzymes have evolved several mechanisms to cleave peptide bonds. These distinct types have been systematically categorized in the MEROPS database. While a BLAST search on these proteases identifies homologous proteins, sequence alignment methods often fail to identify relationships arising from convergent evolution, exon shuffling, and modular reuse of catalytic units. We have previously established a computational method to detect functions in proteins based on the spatial and electrostatic properties of the catalytic residues (CLASP). CLASP identified a promiscuous serine protease scaffold in alkaline phosphatases (AP) and a scaffold recognizing a beta-lactam (imipenem) in a cold-active Vibrio AP. Subsequently, we defined a methodology to quantify promiscuous activities in a wide range of proteins. Here, we assemble a module which encapsulates the multifarious motifs used by protease families listed in the MEROPS database. Since APs and proteases are an integral component of outer membrane vesicles (OMV), we sought to query other OMV proteins, like phospholipase C (PLC), using this search module. Our analysis indicated that phosphoinositide-specific PLC from Bacillus cereus is a serine protease. This was validated by protease assays, mass spectrometry and by inhibition of the native phospholipase activity of PI-PLC by the well-known serine protease inhibitor AEBSF (IC50 = 0.018 mM). Edman degradation analysis linked the specificity of the protease activity to a proline in the amino terminal, suggesting that the PI-PLC is a prolyl peptidase. Thus, we propose a computational method of extending protein families based on the spatial and electrostatic congruence of active site residues.
Abstract:
Karwath, A., King, R. Homology induction: the use of machine learning to improve sequence similarity searches. BMC Bioinformatics. 23rd April 2002. 3:11. Additional file: describes the title organism species declaration in one string [http://www.biomedcentral.com/content/supplementary/1471-2105-3-11-S1.doc]. Sponsorship: Andreas Karwath and Ross D. King were supported by the EPSRC grant GR/L62849.
Abstract:
Faculty of Biology: Institute of Molecular Biology and Biotechnology
Abstract:
Currently, the sole strategy for managing food hypersensitivity involves strict avoidance of the trigger. Several alternate strategies for the treatment of food allergies are currently under study. Also being explored is the process of eliminating allergenic proteins from crop plants. Legumes are a rich source of protein and are an essential component of the human diet. Unfortunately, legumes, including soybean and peanut, are also common sources of food allergens. Four protein families and superfamilies account for the majority of legume allergens, which include storage proteins of seeds (cupins and prolamins), profilins, and the larger group of pathogenesis-related proteins. Two strategies have been used to produce hypoallergenic legume crops: (1) germplasm lines are screened for the absence or reduced content of specific allergenic proteins and (2) genetic transformation is used to silence native genes encoding allergenic proteins. Both approaches have been successful in producing cultivars of soybeans and peanuts with reduced allergenic proteins. However, it is unknown whether the cultivars are actually hypoallergenic to those with sensitivity. This review describes efforts to produce hypoallergenic cultivars of soybean and peanut and discusses the challenges that need to be overcome before such products could be available in the marketplace.
Abstract:
Complex animals use a wide variety of adaptor proteins to produce specialized sites of interaction between actin and membranes. Plants do not have these protein families, yet actin-membrane interactions within plant cells are critical for the positioning of subcellular compartments, for coordinating intercellular communication, and for membrane deformation [1]. Novel factors are therefore likely to provide interfaces at actin-membrane contacts in plants, but their identity has remained obscure. Here we identify the plant-specific Networked (NET) superfamily of actin-binding proteins, members of which localize to the actin cytoskeleton and specify different membrane compartments. The founding member of the NET superfamily, NET1A, is anchored at the plasma membrane and predominates at cell junctions, the plasmodesmata. NET1A binds directly to actin filaments via a novel actin-binding domain that defines a superfamily of thirteen Arabidopsis proteins divided into four distinct phylogenetic clades. Members of other clades identify interactions at the tonoplast, nuclear membrane, and pollen tube plasma membrane, emphasizing the role of this superfamily in mediating actin-membrane interactions.
Abstract:
The rationale for identifying drug targets within helminth neuromuscular signalling systems is based on the premise that adequate nerve and muscle function is essential for many of the key behavioural determinants of helminth parasitism, including sensory perception/host location, invasion, locomotion/orientation, attachment, feeding and reproduction. This premise is validated by the tendency of current anthelmintics to act on classical neurotransmitter-gated ion channels present on helminth nerve and/or muscle, yielding therapeutic endpoints associated with paralysis and/or death. Supplementary to classical neurotransmitters, helminth nervous systems are peptide-rich and encompass associated biosynthetic and signal transduction components - putative drug targets that remain to be exploited by anthelmintic chemotherapy. At this time, no neuropeptide system-targeting lead compounds have been reported, and given that our basic knowledge of neuropeptide biology in parasitic helminths remains inadequate, the short-term prospects for such drugs remain poor. Here, we review current knowledge of neuropeptide signalling in Nematoda and Platyhelminthes, and highlight a suite of 19 protein families that yield deleterious phenotypes in helminth reverse genetics screens. We suggest that orthologues of some of these peptidergic signalling components represent appealing therapeutic targets in parasitic helminths.
Abstract:
The BAR (Bin/amphiphysin/Rvs) domain is the most conserved feature in amphiphysins from yeast to human and is also found in endophilins and nadrins. We solved the structure of the Drosophila amphiphysin BAR domain. It is a crescent-shaped dimer that binds preferentially to highly curved negatively charged membranes. With its N-terminal amphipathic helix and BAR domain (N-BAR), amphiphysin can drive membrane curvature in vitro and in vivo. The structure is similar to that of arfaptin2, which we find also binds and tubulates membranes. From this, we predict that BAR domains are in many protein families, including sorting nexins, centaurins, and oligophrenins. The universal and minimal BAR domain is a dimerization, membrane-binding, and curvature-sensing module.
Abstract:
Dissertation submitted for the degree of Master in Molecular Genetics and Biomedicine
Abstract:
Endogenous oxidative stress is a likely cause of cardiac myocyte death in vivo. We examined the early (0-2 h) changes in the proteome of isolated cardiac myocytes from neonatal rats exposed to H2O2 (0.1 mM), focussing on proteins with apparent molecular masses of between 20 and 30 kDa. Proteins were separated by two-dimensional gel electrophoresis (2DGE), located by silver-staining and identified by mass spectrometry. Incorporation of [35S]methionine or 32Pi was also studied. For selected proteins, transcript abundance was examined by reverse transcriptase-polymerase chain reaction. Of the 38 protein spots in the region, 23 were identified. Two families showed changes in 2DGE migration or abundance with H2O2 treatment: the peroxiredoxins and two small heat shock protein (Hsp) family members: heat shock 27 kDa protein 1 (Hsp25) and alphaB-crystallin. Peroxiredoxins shifted to lower pI values and this was probably attributable to 'over-oxidation' of active site Cys-residues. Hsp25 also shifted to lower pI values but this was attributable to phosphorylation. alphaB-crystallin migration was unchanged but its abundance decreased. Transcripts encoding peroxiredoxins 2 and 5 increased significantly. In addition, 10 further proteins were identified. For two (glutathione S-transferase pi, translationally-controlled tumour protein), we could not find any previous references indicating their occurrence in cardiac myocytes. We conclude that exposure of cardiac myocytes to oxidative stress causes post-translational modification in two protein families involved in cytoprotection. These changes may be potentially useful diagnostically. In the short term, oxidative stress causes few detectable changes in global protein abundance as assessed by silver-staining.
Abstract:
A joint transcriptomic and proteomic approach employing two-dimensional electrophoresis, liquid chromatography and mass spectrometry was carried out to identify peptides and proteins expressed by the venom gland of the snake Bothrops insularis, an endemic species of Queimada Grande Island, Brazil. Four protein families were mainly represented in processed spots, namely metalloproteinase, serine proteinase, phospholipase A(2) and lectin. Other represented families were growth factors, the developmental protein G10, a disintegrin and putative novel bradykinin-potentiating peptides. The enzymes were present in several isoforms. Most of the experimental data agreed with predicted values for isoelectric point and M(r) of proteins found in the transcriptome of the venom gland. The results also support the existence of posttranslational modifications and of proteolytic processing of precursor molecules which could lead to diverse multifunctional proteins. This study provides a preliminary reference map for proteins and peptides present in Bothrops insularis whole venom establishing the basis for comparative studies of other venom proteomes which could help the search for new drugs and the improvement of venom therapeutics. Altogether, our data point to the influence of transcriptional and post-translational events on the final venom composition and stress the need for a multivariate approach to snake venomics studies. (c) 2009 Elsevier B.V. All rights reserved.