926 results for Protein Sequence
Abstract:
Abscisic acid (ABA)-mediated gene expression is a critical component of plant responses to this important hormone, which affects plant growth, development, and responses to environmental stresses. Plant responses to ABA are mediated by a number of factors including PKABA1, an ABA-induced protein kinase involved in ABA-suppressed gene expression in cereal grains, and TaWD40, which has previously been shown to physically interact with PKABA1. A full-length 1.9 kb TaWD40 cDNA, CK210682, was sequenced as part of this project. Based on the deduced protein sequence, it is thought that TaWD40 may belong to the family of E3 ubiquitin ligases, possibly targeting PKABA1 for destruction. Construction of expression plasmids for overproduction of the TaWD40 polypeptide in E. coli is currently underway. The TaWD40 cDNA has been successfully amplified from the source plasmid and inserted into an intermediate plasmid, pCR2.1. The TaWD40 cDNA is currently being cloned from the pCR2.1 intermediate plasmid into two different expression vectors, pRSET-A and pMAL-c2x, for future protein production and purification.
Abstract:
Within about 30 years the Brazilian buffalo (Bubalus bubalis) herd will reach approximately 50 million head as a result of the great adaptive capacity of these animals to tropical climates, together with the good productive and reproductive potential which make these animals an important animal protein source for poor and developing countries. The myostatin gene (GDF8) is important in the physiology of stock animals because its product produces a direct effect on muscle development and consequently also on meat production. The myostatin sequence is known in several mammalian species and shows a high degree of amino acid sequence conservation, although the presence of non-silent and silent changes in the coding sequences and several alterations in the introns and untranslated regions have been identified. The objective of our work was to characterize the myostatin coding regions of B. bubalis (Murrah breed) and to compare them with the Bos taurus regions looking for variations in nucleotide and protein sequences. In this way, we were able to identify 12 variations at DNA level and five alterations on the presumed myostatin protein sequence as compared to non double-muscled bovine sequences.
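The distinction drawn above between silent and non-silent changes in a coding sequence can be illustrated with a minimal sketch. It assumes two coding sequences of equal length that are already aligned codon by codon and contain no gaps; the function name `classify_variations` and the toy sequences are hypothetical and do not come from the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_variations(ref_cds, alt_cds):
    """Compare two gap-free, in-frame coding sequences codon by codon.

    Returns (silent, non_silent): `silent` lists 1-based codon numbers whose
    nucleotides differ but encode the same residue; `non_silent` lists
    (codon number, reference residue, variant residue) tuples.
    """
    if len(ref_cds) != len(alt_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must have equal length, a multiple of 3")
    silent, non_silent = [], []
    for i in range(0, len(ref_cds), 3):
        ref_codon, alt_codon = ref_cds[i:i + 3], alt_cds[i:i + 3]
        if ref_codon == alt_codon:
            continue
        codon_number = i // 3 + 1
        ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
        if ref_aa == alt_aa:
            silent.append(codon_number)
        else:
            non_silent.append((codon_number, ref_aa, alt_aa))
    return silent, non_silent


# Toy example (not real myostatin sequence): codon 2 changes silently,
# codon 3 changes the encoded residue from E to D.
silent, non_silent = classify_variations("ATGCTGGAA", "ATGCTAGAC")
```

Real comparisons between species would first require a sequence alignment, which this sketch deliberately leaves out.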
Abstract:
A lectin-like protein from the seeds of Acacia farnesiana was isolated from the albumin fraction, characterized, and sequenced by tandem mass spectrometry. The albumin fraction was extracted with 0.5 M NaCl, and the lectin-like protein of A. farnesiana (AFAL) was purified by ion-exchange chromatography (Mono-Q) followed by chromatofocusing. AFAL agglutinated rabbit erythrocytes and did not agglutinate human ABO erythrocytes, whether native or treated with proteolytic enzymes. In sodium dodecyl sulfate gel electrophoresis under reducing and nonreducing conditions, AFAL separated into two bands with subunit molecular masses of 35 and 50 kDa. The homogeneity of the purified protein was confirmed by chromatofocusing, with a pI of 4.0 ± 0.5. Molecular exclusion chromatography confirmed time-dependent oligomerization in AFAL, in accordance with mass spectrometry analysis, which confers an alteration in AFAL affinity for chitin. The protein sequence was obtained by a liquid chromatography quadrupole time-of-flight experiment and showed that AFAL has 68% and 63% sequence similarity with lectins of Phaseolus vulgaris and Dolichos biflorus, respectively.
Abstract:
The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it has become urgent to develop new methods able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein folding recognition and de novo design. The prediction of these contacts requires the study of the protein inter-residue distances related to the specific type of amino acid pair that are encoded in the so-called contact map. An interesting new way of analyzing those structures emerged when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied to what extent the characteristic path length and clustering coefficient of the protein contact network reveal characteristic features of protein contact maps.
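As an illustration of the two network measures named above, the following minimal sketch builds a contact network from residue coordinates and computes its clustering coefficient and characteristic path length. The 8 Å distance cutoff, the use of one coordinate per residue (for example the C-alpha atom), the function names and the toy coordinates are all assumptions made for illustration and are not taken from the thesis.

```python
import math
from collections import deque
from itertools import combinations


def contact_network(coords, cutoff=8.0):
    """Adjacency sets: residues i and j are in contact if their distance <= cutoff."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def clustering_coefficient(adj):
    """Average local clustering; nodes with fewer than two neighbours count as 0."""
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy chain of four "residues" spaced 5 Å apart: only neighbours are in contact.
chain = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
network = contact_network(chain)
```

A small-world network in the usual sense combines a high clustering coefficient with a short characteristic path length, so both values are typically compared against those of a random graph with the same number of nodes and edges.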
Provided that residue contacts are known for a protein sequence, the major features of its 3D structure could be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments, to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for the coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards the understanding of the complex processes involved in biological networks. The rapid growth in the number of protein sequences and structures available poses new fundamental problems that still deserve an interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, currently, approximately only 20% of annotated proteins in the Homo sapiens genome have been experimentally characterized. 
A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists of assigning sequences to a specific group of functionally related sequences which have been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but do not necessarily share the same function, and to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated transfer-through-inheritance procedure for molecular functions and structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicitly addresses the problem of multi-domain protein annotation and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of structure templates on the length of protein sequences within clusters ensures that multi-domain proteins, when present, can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences, considering the information available in the present databases of molecular functions and structures.
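The idea of building clusters from pairwise hits that satisfy both an identity and a coverage constraint can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the thresholds, so the 40% identity and 90% coverage values are hypothetical, as are the definition of coverage (alignment length divided by the longer sequence), the single-linkage grouping, and the name `cluster_hits`.

```python
def cluster_hits(lengths, hits, min_identity=40.0, min_coverage=90.0):
    """Group sequences by single linkage over hits passing both thresholds.

    lengths: dict mapping sequence name -> sequence length
    hits:    iterable of (query, subject, percent_identity, alignment_length)
    Thresholds are hypothetical defaults, not values from the source.
    """
    parent = {name: name for name in lengths}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, aln_len in hits:
        # Coverage relative to the longer sequence keeps a short single-domain
        # protein from being linked to a much longer multi-domain one.
        coverage = 100.0 * aln_len / max(lengths[query], lengths[subject])
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(query)] = find(subject)

    clusters = {}
    for name in lengths:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy data: C shares a domain with A (60% identity) but is three times longer,
# so the coverage constraint keeps it in its own cluster.
lengths = {"A": 100, "B": 102, "C": 300, "D": 98}
hits = [("A", "B", 55.0, 98), ("A", "C", 60.0, 100), ("B", "D", 45.0, 95)]
clusters = cluster_hits(lengths, hits)
```

Measuring coverage against the longer sequence is one way to obtain clusters that are homogeneous in sequence length, which is the property the abstract attributes to its measure.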
Abstract:
Phosphatidylinositol transfer proteins (PI-TPs) catalyze the transfer of phosphatidylinositol and phosphatidylcholine between membranes in vitro. However, the in vivo function of these proteins is unknown. In this thesis we have used a combined biochemical and genetic approach to determine the importance of PI-TP in vivo. An oligonucleotide based on the amino-terminal sequence of the PI-TP from Saccharomyces cerevisiae was used to screen a yeast genomic library for the gene encoding PI-TP (PIT1 gene). Yeast strains transformed with the positive clones showed overproduction of transfer activities and transfer protein in the 100,000 x g supernatants. The 5′ terminus of the PIT1 gene correlates with the predicted codons for residues 3-30 of the determined protein sequence. Tetrad analysis of a heterozygous diploid (PIT1/pit1::LEU2) revealed that the PIT1 gene is essential for cell growth. Non-viable spores could be rescued by transformation of the above diploid, prior to sporulation, with a plasmid-borne copy of the wild-type gene. Sequencing of the entire PIT1 gene has revealed that it is identical to the SEC14 gene. The sec14 ts mutant, which exhibits conditional defects at the Golgi stage of protein secretion, is also temperature sensitive for PI-TP activity in vitro. These findings represent the first instance in which a physiological function has been assigned to any phospholipid transfer protein.
VERIFICATION OF DNA PREDICTED PROTEIN SEQUENCES BY ENZYME HYDROLYSIS AND MASS SPECTROMETRIC ANALYSIS
Abstract:
The focus of this thesis lies in the development of a sensitive method for the analysis of protein primary structure which can be easily used to confirm the DNA sequence of a protein's gene and determine the modifications which are made after translation. This technique involves the use of dipeptidyl aminopeptidase (DAP) and dipeptidyl carboxypeptidase (DCP) to hydrolyze the protein and the mass spectrometric analysis of the dipeptide products. Dipeptidyl carboxypeptidase was purified from human lung tissue and characterized with respect to its proteolytic activity. The results showed that the enzyme has a relatively unrestricted specificity, making it useful for the analysis of the C-terminus of proteins. Most of the dipeptide products were identified using gas chromatography/mass spectrometry (GC/MS). In order to analyze the peptides not hydrolyzed by DCP and DAP, as well as the dipeptides not identified by GC/MS, a FAB ion source was installed on a quadrupole mass spectrometer and its performance evaluated with a variety of compounds. Using these techniques, the sequences of the N-terminal and C-terminal regions and seven fragments of bacteriophage P22 tail protein have been verified. All of the dipeptides identified in these analyses were in the same DNA reading frame, thus ruling out the possibility of a single base being inserted or deleted from the DNA sequence. The verification of small sequences throughout the protein sequence also indicates that no large portions of the protein have been removed after translation.
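The reading-frame argument above can be illustrated with a minimal sketch: each observed dipeptide is looked up in the three forward translations of the DNA, and if every dipeptide is supported by the same frame, a single-base insertion or deletion between them is unlikely. The function names and the toy DNA are hypothetical, and this is not the procedure used in the thesis, which worked from enzymatic hydrolysis and mass spectrometry data.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate the forward strand starting at offset `frame` (0, 1 or 2)."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def frames_supporting(dna, dipeptides):
    """Map each dipeptide to the forward frames whose translation contains it."""
    proteins = [translate(dna, frame) for frame in range(3)]
    return {
        dp: [frame for frame, protein in enumerate(proteins) if dp in protein]
        for dp in dipeptides
    }


# Toy sequence: frame 0 reads M A K G *, frame 1 reads W L K V.
dna = "ATGGCTAAAGGTTGA"
support = frames_supporting(dna, ["MA", "KG", "LK"])
```

In this toy case "MA" and "KG" are found only in frame 0 while "LK" is found only in frame 1, which is the kind of disagreement that would point to a frameshift in a real comparison.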
Abstract:
Lodestar, a Drosophila maternal-effect gene, is essential for proper chromosome segregation during embryonic mitosis. Mutations in lodestar cause chromatin bridging in anaphase, preventing the sister chromatids from fully separating and leaving chromatin tangled at the metaphase plate. Drosophila lodestar protein was originally identified, in purified fractions of Drosophila Kc cell nuclear extracts, by its ability to suppress the generation of long RNA polymerase II transcripts. The human homolog of this protein (hLodestar) was cloned and studied in comparison to the Drosophila lodestar activities. The results of these studies show that, like the Drosophila protein, hLodestar has dsDNA-dependent ATPase and transcription termination activity in vitro. hLodestar has also been shown to release RNA polymerase I and II stalled at a cyclobutane thymine dimer. Lodestar belongs to the SNF2 family of proteins, which are members of the DExH/D helicase super-family. The SNF2 family of proteins is believed to play a critical role in altering protein-DNA interactions in a variety of cellular contexts. We have recently isolated a human cDNA (hLodestar) that shares significant homology with the Drosophila lodestar gene. The 4.6 kb clone contains an open reading frame of 1162 amino acids, and shares 55% similarity and 46% identity with the Drosophila Lodestar protein sequence. Our studies looking for hLodestar-interacting proteins revealed an association with CDC5L in the yeast two-hybrid system and in co-immunoprecipitation experiments. CDC5L has been well documented to be a component of the spliceosome. Our data from in vitro assembly and splicing reactions, together with the association of hLodestar with spliceosomes purified from HeLa nuclear extract, suggest that hLodestar is involved in splicing. Although many other members of the DExH/D helicase super-family have been linked to splicing, this is the first SNF2 family member to be implicated in the splicing reaction.
Abstract:
Insulin-like growth factor binding protein 2 (IGFBP2) is a protein known to be overexpressed in a majority of glioblastoma multiforme (GBM) tumors. While it is known that IGFBP2 is involved in promoting GBM tumor cell invasion, no mechanism has been established for how the protein is involved in signal transduction pathways leading to enhanced cell invasion. We follow up on preliminary microarray data on IGFBP2-overexpressing GBM cells and protein sequence analysis of IGFBP2 in generating the hypothesis that IGFBP2 interacts with integrin α5 in regulating cell mobility. Microarray data showing upregulation of integrin α5 by IGFBP2 is validated and evidence of protein-protein interaction between IGFBP2 and integrin α5 is found. The exact binding domain on IGFBP2 responsible for its interaction with integrin α5 is also determined, confirming our initial findings and reaffirming that the IGFBP2/integrin α5 interaction is specific. Disruption of this interaction resulted in attenuation of IGFBP2-enhanced cell mobility. Further, we found that cell mobility is only enhanced when IGFBP2 and integrin α5 are both overexpressed and able to interact with each other. We also determined fibronectin to be a critical player in the activation of the IGFBP2/integrin α5 pathway. The activation of this pathway appears to be progressive and initiates once GBM cells have sufficiently established anchorage.
Abstract:
We have identified and characterized CLARP, a caspase-like apoptosis-regulatory protein. Sequence analysis revealed that human CLARP contains two amino-terminal death effector domains fused to a carboxyl-terminal caspase-like domain. The structure and amino acid sequence of CLARP resemble those of caspase-8, caspase-10, and DCP2, a Drosophila melanogaster protein identified in this study. Unlike caspase-8, caspase-10, and DCP2, however, two important residues predicted to be involved in catalysis were lost in the caspase-like domain of CLARP. Analysis with fluorogenic substrates for caspase activity confirmed that CLARP is catalytically inactive. CLARP was found to interact with caspase-8 but not with FADD/MORT-1, an upstream death effector domain-containing protein of the Fas and tumor necrosis factor receptor 1 signaling pathway. Expression of CLARP induced apoptosis, which was blocked by the viral caspase inhibitor p35, dominant negative mutant caspase-8, and the synthetic caspase inhibitor benzyloxycarbonyl-Val-Ala-Asp-(OMe)-fluoromethylketone (zVAD-fmk). Moreover, CLARP augmented the killing ability of caspase-8 and FADD/MORT-1 in mammalian cells. The human clarp gene maps to 2q33. Thus, CLARP represents a regulator of the upstream caspase-8, which may play a role in apoptosis during tissue development and homeostasis.
Abstract:
Site-directed mutagenesis and combinatorial libraries are powerful tools for providing information about the relationship between protein sequence and structure. Here we report two extensions that expand the utility of combinatorial mutagenesis for the quantitative assessment of hypotheses about the determinants of protein structure. First, we show that resin-splitting technology, which allows the construction of arbitrarily complex libraries of degenerate oligonucleotides, can be used to construct more complex protein libraries for hypothesis testing than can be constructed from oligonucleotides limited to degenerate codons. Second, using eglin c as a model protein, we show that regression analysis of activity scores from library data can be used to assess the relative contributions to the specific activity of the amino acids that were varied in the library. The regression parameters derived from the analysis of a 455-member sample from a library wherein four solvent-exposed sites in an α-helix can contain any of nine different amino acids are highly correlated (P < 0.0001, R² = 0.97) with the relative helix propensities for those amino acids, as estimated by a variety of biophysical and computational techniques.
Abstract:
The discovery of cyanobacterial phytochrome histidine kinases, together with the evidence that phytochromes from higher plants display protein kinase activity, bind ATP analogs, and possess C-terminal domains similar to bacterial histidine kinases, has fueled the controversial hypothesis that the eukaryotic phytochrome family of photoreceptors are light-regulated enzymes. Here we demonstrate that purified recombinant phytochromes from a higher plant and a green alga exhibit serine/threonine kinase activity similar to that of phytochrome isolated from dark grown seedlings. Phosphorylation of recombinant oat phytochrome is a light- and chromophore-regulated intramolecular process. Based on comparative protein sequence alignments and biochemical cross-talk experiments with the response regulator substrate of the cyanobacterial phytochrome Cph1, we propose that eukaryotic phytochromes are histidine kinase paralogs with serine/threonine specificity whose enzymatic activity diverged from that of a prokaryotic ancestor after duplication of the transmitter module.
Abstract:
The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high quality annotation and promote database interoperability, the PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies and their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP.
Abstract:
The iProClass database is an integrated resource that provides comprehensive family relationships and structural and functional features of proteins, with rich links to various databases. It is extended from ProClass, a protein family database that integrates PIR superfamilies and PROSITE motifs. iProClass currently consists of more than 200 000 non-redundant PIR and SWISS-PROT proteins organized with more than 28 000 superfamilies, 2600 domains, 1300 motifs, 280 post-translational modification sites and links to more than 30 databases of protein families, structures, functions, genes, genomes, literature and taxonomy. Protein and family summary reports provide rich annotations, including membership information with length, taxonomy and keyword statistics, full family relationships, comprehensive enzyme and PDB cross-references and graphical feature display. The database facilitates classification-driven annotation for protein sequence databases and complete genomes, and supports structural and functional genomic research. iProClass is implemented in the Oracle 8i object-relational system and is available for sequence search and report retrieval at http://pir.georgetown.edu/iproclass/.
Abstract:
High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. Here, we briefly describe three newly developed resources that should make discovery of interesting genes in these sequence classes easier in the future, especially to biologists not having access to a powerful local bioinformatics environment. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. The three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).