192 results for Proteins -- Biotechnology
Abstract:
High conservation of glycyl residues in homologous proteins is fairly frequent. It is commonly understood that glycine tends to be highly conserved either because of its unique Ramachandran angles or to avoid the steric clash that would arise with a larger side chain. Using a database of aligned 3D structures of homologous proteins, we identified conserved Gly in 288 alignment positions from 85 families. Ninety-six of these alignment positions correspond to conserved Gly residues with (phi, psi) values allowed for non-glycyl residues. Reasons for this observation were investigated by in-silico mutation of these glycyl residues to Ala. In 94% of these cases, we found a short contact between the C-beta atom of the introduced Ala and atoms that are often distant in the primary structure. This suggests a lack of space even for a short side chain, thereby explaining the high conservation of glycyl residues even when they adopt (phi, psi) values allowed for Ala. In 189 alignment positions, the conserved glycyl residues adopt (phi, psi) values which are disallowed for Ala. In-silico mutation of these Gly residues to Ala almost always results in steric hindrance involving the C-beta atom of Ala, as one would expect from comparing the Ramachandran maps for Ala and Gly. Rare occurrences of disallowed glycyl conformations, even in ultrahigh-resolution protein structures, are accompanied by short contacts in the crystal structures, and such disallowed conformations are not conserved in the homologues. These observations raise doubts about the accuracy of such glycyl conformations in proteins.
Abstract:
Taxol® (generic name paclitaxel) represents one of the most clinically valuable natural products known to mankind in the recent past. More than two decades have elapsed since the notable discovery of the first Taxol®-producing endophytic fungus, which was followed by a plethora of reports on other endophytes possessing similar biosynthetic potential. However, industrial-scale Taxol® production using fungal endophytes, although seemingly promising, has not seen the light of day. In this opinion article, we review the current state of knowledge on Taxol® biosynthesis, focusing on the chemical ecology of its producers, and ask whether it is actually possible to produce Taxol® using endophyte biotechnology. The key problems that have prevented the exploitation of potent endophytic fungi by industrial bioprocesses for sustained production of Taxol® are discussed.
Abstract:
Using a dataset of 1164 crystal structures of largely non-homologous proteins defined at a resolution of 1.5 angstrom or better, we have investigated the (phi,psi) preferences of the 20 residue types by considering the residues which occur in loops. The propensities of residue types to occur in loops with (phi,psi) values in the alpha(R) region of the Ramachandran map correlate poorly (correlation coefficient 0.48) with the Chou-Fasman propensities of the residue types to occur in alpha-helical segments. However, the correlation coefficient between the propensities of residues in loops to adopt beta conformations and those in beta-sheets is much higher (0.95). These observations suggest that alpha-helix formation is strongly influenced by the local amino acid sequence, while the intrinsic preference of residue types for the beta-sheet plays a major role in beta-sheet formation. The main chain polar groups of residues in loops, which can affect the (phi,psi) values, can be involved in intra-molecular hydrogen bonding. Therefore, we investigated further by considering the subset of residues in loops with a low number (0 to 2) of intra-molecular hydrogen bonds per residue involving main chain polar atoms. For this subset, the correlation coefficients between propensities for alpha-helix and alpha(R) region and between beta-sheet and beta-region are 0.26 and 0.64, respectively. This reiterates the higher intrinsic tendency of beta-region favouring residues to adopt beta-sheet than of alpha(R) region favouring residues to adopt alpha-helical structure.
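The comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the common definition of propensity (a residue type's share of a region divided by its share of the whole dataset) and compares two scales with a Pearson correlation coefficient. The function names and the toy counts are hypothetical and are not data from the study.

```python
from math import sqrt

def propensities(region_counts, total_counts):
    """Propensity per residue type: (share of the region) / (share of the dataset).

    Assumed definition for illustration; residue types absent from the
    region get a propensity of 0.
    """
    region_total = sum(region_counts.values())
    overall_total = sum(total_counts.values())
    return {
        res: (region_counts.get(res, 0) / region_total)
        / (total_counts[res] / overall_total)
        for res in total_counts
    }

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Toy counts (made up): residues observed overall and in one Ramachandran region.
total_counts = {"ALA": 100, "GLY": 100, "VAL": 100, "PRO": 100}
region_counts = {"ALA": 40, "GLY": 20, "VAL": 10, "PRO": 30}
loop_scale = propensities(region_counts, total_counts)
```

Given a second scale for the same residue types (for example, published helix propensities), `pearson` can be applied to the two lists of values taken in the same residue order.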
Abstract:
With the preponderance of multidomain proteins in eukaryotic genomes, it is essential to recognize the constituent domains and their functions. Often function involves communication across the domain interfaces, and knowledge of the interacting sites is essential to our understanding of the structure-function relationship. Using evolutionary information extracted from homologous domains in at least two diverse domain architectures (single and multidomain), we predict the interface residues corresponding to domains from two-domain proteins. We also use information from the three-dimensional structures of individual domains of two-domain proteins to train a naive Bayes classifier to predict the interfacial residues. Our predictions are highly accurate (approximately 85%) and specific (approximately 95%) to the domain-domain interfaces. This method is specific to multidomain proteins which contain domains in more than one protein architectural context. Using predicted residues to constrain domain-domain interaction, rigid-body docking was able to provide us with accurate full-length protein structures with correct orientation of domains. We believe that these results can be of considerable interest toward rational protein and interaction design, apart from providing us with valuable information on the nature of interactions. Proteins 2014; 82:1219-1234. (c) 2013 Wiley Periodicals, Inc.
Abstract:
The HORMA domain (for Hop1p, Rev7p and MAD2) was discovered in three chromatin-associated proteins in the budding yeast Saccharomyces cerevisiae. This domain has also been found in proteins with similar functions in organisms including plants, animals and nematodes. The HORMA domain-containing proteins are thought to function as adaptors for meiotic checkpoint protein signaling and in the regulation of meiotic recombination. Surprisingly, new work has disclosed completely unanticipated and diverse functions for the HORMA domain-containing proteins. A. M. Villeneuve and colleagues (Schvarzstein et al., 2013) show that meiosis-specific HORMA domain-containing proteins play a vital role in preventing centriole disengagement during Caenorhabditis elegans spermatocyte meiosis. Another recent study reveals that the S. cerevisiae Atg13 HORMA domain acts as a phosphorylation-dependent conformational switch in the cellular autophagic process. (C) 2014 Elsevier B.V. All rights reserved.
Abstract:
Thiolases are enzymes involved in lipid metabolism. Thiolases remove the acetyl-CoA moiety from 3-ketoacyl-CoAs in the degradative reaction. They can also catalyze the reverse Claisen condensation reaction, which is the first step of biosynthetic processes such as the biosynthesis of sterols and ketone bodies. In humans, six distinct thiolases have been identified. These thiolases differ from one another with respect to sequence, oligomeric state, substrate specificity and subcellular localization. Four sequence fingerprints, identifying catalytic loops of thiolases, have been described. In this study, genome searches of two mycobacterial species (Mycobacterium tuberculosis and Mycobacterium smegmatis) were carried out, using the six human thiolase sequences as queries. Eight and thirteen different thiolase sequences were identified in M. tuberculosis and M. smegmatis, respectively. In addition, thiolase-like proteins (one encoded in the Mtb and two in the Msm genome) were found. The purpose of this study is to classify these mostly uncharacterized thiolases and thiolase-like proteins. Several other sequences obtained by searches of genome databases of bacteria, mammals and the parasitic protist family of the Trypanosomatidae were included in the analysis. Thiolase-like proteins were also found in the trypanosomatid genomes, but not in those of mammals. In order to study the phylogenetic relationships at a high confidence level, additional thiolase sequences were included such that a total of 130 thiolase and thiolase-like protein sequences were used for the multiple sequence alignment. The resulting phylogenetic tree identifies 12 classes of sequences, each possessing a characteristic set of sequence fingerprints for the catalytic loops. From this analysis it is now possible to assign the mycobacterial thiolases to corresponding homologues in other kingdoms of life. 
The results of this bioinformatics analysis also show interesting differences between the distributions of M. tuberculosis and M. smegmatis thiolases over the 12 different classes. (C) 2014 Elsevier Ltd. All rights reserved.
Abstract:
Protein aggregation, linked to many diseases, is initiated when monomers access rogue conformations that are poised to form amyloid fibrils. We show, using simulations of the src SH3 domain, that mechanical force enhances the population of the aggregation-prone (N*) states, which are rarely populated under force-free native conditions but are encoded in the spectrum of native fluctuations. The folding phase diagrams of SH3 as a function of denaturant concentration ([C]), mechanical force (f), and temperature exhibit an apparent two-state behavior, without revealing the presence of the elusive N* states. Interestingly, the phase boundaries separating the folded and unfolded states at all [C] and f fall on a master curve, which can be quantitatively described using an analogy to superconductors in a magnetic field. The free energy profiles as a function of the molecular extension (R), which are accessible in pulling experiments, reveal the presence of a native-like N* with a disordered solvent-exposed amino-terminal beta-strand. The structure of the N* state is identical to that found in Fyn SH3 by NMR dispersion experiments. We show that the timescale for fibril formation can be estimated from the population of the N* state, determined by the free energy gap separating the native structure and the N* state, a finding that can be used to assess the fibril-forming tendencies of proteins. The structures of the N* state are used to show that oligomer formation and the likely route to fibrils occur by a domain-swap mechanism in the SH3 domain. (C) 2014 Elsevier Ltd. All rights reserved.
Abstract:
In this study, we combine available high resolution structural information on eukaryotic ribosomes with low resolution cryo-EM data on the Hepatitis C viral RNA (IRES)-human ribosome complex. Aided further by the prediction of RNA-protein interactions and restrained docking studies, we gain insights into their interaction at the residue level. We identified the components involved at the major and minor contact regions, and propose that there are energetically favorable local interactions between 40S ribosomal proteins and IRES domains. Domain II of the IRES interacts with ribosomal proteins S5 and S25, while the pseudoknot and the downstream domain IV region bind to ribosomal proteins S26, S28 and S5. We also provide support from UV cross-linking studies to validate our proposed interaction between S5 and IRES domains II and IV. We found that domain IIIe makes contact with the ribosomal protein S3a (S1e). Our model also suggests that the ribosomal protein S27 interacts with domain IIIc, while S7 has a weak contact with a single-base RNA bulge between junction IIIabc and IIId. The interacting residues are highly conserved among mammalian homologs, while the IRES RNA bases involved in contact do not show strict conservation. IRES RNA binding sites for S25 and S3a show the best conservation among related viral IRESs. The new contacts identified between ribosomal proteins and RNA are consistent with previous independent studies on RNA-binding properties of ribosomal proteins reported in the literature, though information at the residue level is not available in previous studies.
Abstract:
Matrix metalloproteinase expression is used as a biomarker for various cancers and associated malignancies. Since these proteinases can cleave many intracellular proteins, overexpression tends to be toxic; hence, they are a challenge to purify. To overcome these limitations, we designed a protocol in which the full-length pro-MMP2 enzyme was overexpressed in E. coli as inclusion bodies and purified using 6xHis affinity chromatography under denaturing conditions. In one step, the enzyme was purified and refolded directly on the affinity matrix under redox conditions to obtain a bioactive protein. The pro-MMP2 protein was characterized by mass spectrometry, CD spectroscopy, zymography and activity analysis using a simple in-house developed 'form invariant' assay, which reports the total MMP2 activity independent of its various forms. The methodology gave higher yields of bioactive protein than other strategies reported to date, and we anticipate that, using the protocol, other toxic proteins can also be overexpressed and purified from E. coli and subsequently refolded into active form using a one-step renaturation protocol.
Abstract:
Background: The function of a protein can be deciphered with higher accuracy from its structure than from its amino acid sequence. Due to the huge gap between the available protein sequence and structural space, tools that can generate functionally homogeneous clusters using only the sequence information hold great importance. For this, traditional alignment-based tools work well in most cases and clustering is performed on the basis of sequence similarity. But in the case of multi-domain proteins, the alignment quality might be poor due to varied lengths of the proteins, domain shuffling or circular permutations. Multi-domain proteins are ubiquitous in nature; hence alignment-free tools, which overcome the shortcomings of alignment-based protein comparison methods, are required. Further, existing tools classify proteins using only domain-level information and hence miss out on the information encoded in the tethered regions or accessory domains. Our method, on the other hand, takes into account the full-length sequence of a protein, consolidating the complete sequence information to understand a given protein better. Results: Our web server, CLAP (Classification of Proteins), is one such alignment-free software for automatic classification of protein sequences. It utilizes a pattern-matching algorithm that assigns local matching scores (LMS) to residues that are part of the matched patterns between two sequences being compared. CLAP works on full-length sequences and does not require prior domain definitions. Pilot studies undertaken previously on protein kinases and immunoglobulins have shown that CLAP yields clusters which have high functional and domain architectural similarity. Moreover, parsing at a statistically determined cut-off resulted in clusters that corroborated the sub-family level classification of that particular domain family. 
Conclusions: CLAP is a useful protein-clustering tool, independent of domain assignment, domain order, sequence length and domain diversity. Our method can be used for any set of protein sequences, yielding functionally relevant clusters with high domain architectural homogeneity. The CLAP web server is freely available for academic use at http://nslab.mbu.iisc.ernet.in/clap/.
Abstract:
Alpha-helices are amongst the most common secondary structural elements seen in membrane proteins and are packed in the form of helix bundles. These alpha-helices encounter varying external environments (hydrophobic, hydrophilic) that may influence the sequence preferences at their N and C-termini. The role of the external environment in stabilization of the helix termini in membrane proteins is still unknown. Here we analyze alpha-helices in a high-resolution dataset of integral alpha-helical membrane proteins and establish that their sequence and conformational preferences differ from those in globular proteins. We specifically examine these preferences at the N and C-termini in helices initiating/terminating inside the membrane core as well as in linkers connecting these transmembrane helices. We find that the sequence preferences and structural motifs at capping (Ncap and Ccap) and near-helical (N' and C') positions are influenced by a combination of features including the membrane environment and the innate helix initiation and termination property of residues forming structural motifs. We also find that a large number of helix termini which do not form any particular capping motif are stabilized by formation of hydrogen bonds and hydrophobic interactions contributed from the neighboring helices in the membrane protein. We further validate the sequence preferences obtained from our analysis with data from an ultradeep sequencing study that identifies evolutionarily conserved amino acids in the rat neurotensin receptor. The results from our analysis provide insights for the secondary structure prediction, modeling and design of membrane proteins. Proteins 2014; 82:3420-3436. (c) 2014 Wiley Periodicals, Inc.
Abstract:
Cis-peptide embedded segments are rare in proteins but often highlight their important role in molecular function when they do occur. The high evolutionary conservation of these segments illustrates this observation almost universally, although no attempt has been made to systematically use this information for the purpose of function annotation. In the present study, we demonstrate how geometric clustering and level-specific Gene Ontology molecular-function terms (also known as annotations) can be used in a statistically significant manner to identify cis-embedded segments in a protein linked to its molecular function. The present study identifies novel cis-peptide fragments, which are subsequently used for fragment-based function annotation. Annotation recall benchmarks interpreted using the receiver-operator characteristic plot returned an area-under-curve >0.9, corroborating the utility of the annotation method. In addition, we identified cis-peptide fragments occurring in conjunction with functionally important trans-peptide fragments, providing additional insights into molecular function. We further illustrate the applicability of our method in function annotation where homology-based annotation transfer is not possible. The findings of the present study add to the repertoire of function annotation approaches and also facilitate engineering, design and allied studies around the cis-peptide neighborhood of proteins.
Abstract:
Streptococcus pneumoniae causes pneumonia, septicemia and meningitis. S. pneumoniae is responsible for significant mortality both in children and in the elderly. In recent years, whole genome sequencing of various S. pneumoniae strains has increased manifold, and there is an urgent need to provide organism-specific annotations to the scientific community. This prompted us to develop the Streptococcus pneumoniae Genome Database (SPGDB) to integrate and analyze the completely sequenced and available S. pneumoniae genome sequences. Further, links to several tools are provided to compare the pool of gene and protein sequences, and protein structures, across different strains of S. pneumoniae. SPGDB aids in the analysis of phenotypic variations as well as in performing extensive genomics and evolutionary studies with reference to S. pneumoniae. (C) 2014 Elsevier Inc. All rights reserved.
Abstract:
The cyclic AMP receptor protein (CRP) family of transcription factors consists of global regulators of bacterial gene expression. Here, we identify two paralogous CRPs in the genome of Mycobacterium smegmatis that have 78% identical sequences and characterize them biochemically and functionally. The two proteins (MSMEG_0539 and MSMEG_6189) show differences in cAMP binding affinity, trypsin sensitivity, and binding to a CRP site that we have identified upstream of the msmeg_3781 gene. MSMEG_6189 binds to the CRP site readily in the absence of cAMP, while MSMEG_0539 binds in the presence of cAMP, albeit weakly. msmeg_6189 appears to be an essential gene, while the Δmsmeg_0539 strain was readily obtained. Using promoter-reporter constructs, we show that msmeg_3781 is regulated by CRP binding, and its transcription is repressed by MSMEG_6189. Our results are the first to characterize two paralogous and functional CRPs in a single bacterial genome. This gene duplication event has subsequently led to the evolution of two proteins whose biochemical differences translate to differential gene regulation, thus catering to the specific needs of the organism.
Abstract:
In many organisms, 'Universal Stress Proteins' (USPs) are induced in response to a variety of environmental stresses. Here we report the structures of two USPs, YnaF and YdaA from Salmonella typhimurium, determined at 1.8 angstrom and 2.4 angstrom resolutions, respectively. YnaF consists of a single USP domain and forms a tetrameric organization stabilized by interactions mediated through chloride ions. YdaA is a larger protein consisting of two tandem USP domains. Two protomers of YdaA associate to form a structure similar to the YnaF tetramer. YdaA showed ATPase activity, and an ATP binding motif G-2X-G-9X-G(S/T/N) was found in its C-terminal domain. The residues corresponding to this motif were not conserved in YnaF, although YnaF could bind ATP. However, unlike YdaA, YnaF did not hydrolyse ATP in vitro. Disruption of interactions mediated through chloride ions by selected mutations converted YnaF into an ATPase. Residues that might be important for ATP hydrolysis could be identified by comparing the active sites of native and mutant structures. Only the C-terminal domain of YdaA appears to be involved in ATP hydrolysis. The structurally similar N-terminal domain was found to bind a zinc ion near the segment equivalent to the phosphate binding loop of the C-terminal domain. Mass spectrometric analysis showed that YdaA might bind a ligand of approximate molecular weight 800 daltons. Structural comparisons suggest that the ligand, probably related to an intermediate in lipid A biosynthesis, might bind at a site close to the zinc ion. Therefore, the N-terminal domain of YdaA binds zinc and might play a role in lipid metabolism. Thus, USPs appear to perform several distinct functions such as ATP hydrolysis, altering membrane properties and chloride sensing. (C) 2015 Elsevier Inc. All rights reserved.
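As a hedged illustration of how a sequence motif such as G-2X-G-9X-G(S/T/N) can be scanned for, the sketch below translates one reading of that notation into a regular expression: Gly, any two residues, Gly, any nine residues, Gly, then one of Ser, Thr or Asn. That reading is an assumption, the function name is hypothetical, and the toy sequence is made up; it is not the YdaA sequence.

```python
import re

# Assumed reading of G-2X-G-9X-G(S/T/N):
# G, any 2 residues, G, any 9 residues, G, then S, T or N.
ATP_MOTIF = re.compile(r"G.{2}G.{9}G[STN]")

def find_atp_motifs(sequence):
    """Return (0-based start, matched segment) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in ATP_MOTIF.finditer(sequence.upper())]

# Made-up toy sequence containing one instance of the pattern.
toy_sequence = "MKT" + "GAAG" + "AAAAAAAAA" + "GS" + "LLL"
hits = find_atp_motifs(toy_sequence)
```

Because `finditer` reports non-overlapping matches only, overlapping instances of the pattern would need a lookahead-based expression instead.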