999 results for Backbone-cyclized Proteins Database
Abstract:
Snake venoms are complex mixtures of biologically active proteins and peptides. Many affect haemostasis by activating or inhibiting coagulant factors or platelets, or by disrupting endothelium. Snake venom components are classified into various families, such as serine proteases, metalloproteinases, C-type lectin-like proteins, disintegrins and phospholipases. Snake venom C-type lectin-like proteins have a typical fold resembling that in classic C-type lectins such as the selectins and mannose-binding proteins. Many snake venom C-type lectin-like proteins have now been characterized as heterodimeric structures with alpha and beta subunits that often form large molecules by multimerization. They activate platelets by binding to VWF or specific receptors such as GPIb, alpha2beta1 and GPVI. Simple heterodimeric GPIb-binding molecules mainly inhibit platelet functions, whereas multimeric ones activate platelets. A series of tetrameric snake venom C-type lectin-like proteins activates platelets by binding to GPVI, while another series affects platelet function via integrin alpha2beta1. Some act by inducing VWF to bind to GPIb. Many structures of these proteins, often complexed with their ligands, have been determined. Structure-activity studies show that these proteins are quite complex despite similar backbone folding. Snake C-type lectin-like proteins often interact with more than one platelet receptor and have complex mechanisms of action.
Abstract:
Four Staphylococcus aureus-Escherichia coli shuttle vectors were constructed for gene expression and production of tagged fusion proteins. Vectors pBUS1-HC and pTSSCm have no promoter upstream of the multiple cloning site (MCS), and this allows study of genes under the control of their native promoters, and pBUS1-Pcap-HC and pTSSCm-Pcap contain the strong constitutive promoter of S. aureus type 1 capsule gene 1A (Pcap) upstream of a novel MCS harboring codons for the peptide tag Arg-Gly-Ser-hexa-His (rgs-his6). All plasmids contained the backbone derived from pBUS1, including the E. coli origin ColE1, five copies of terminator rrnB T1, and tetracycline resistance marker tet(L) for S. aureus and E. coli. The minimum pAMα1 replicon from pBUS1 was improved through either complementation with the single-strand origin oriL from pUB110 (pBUS1-HC and pBUS1-Pcap-HC) or substitution with a pT181-family replicon (pTSSCm and pTSSCm-Pcap). The new constructs displayed increased plasmid yield and segregational stability in S. aureus. Furthermore, pBUS1-Pcap-HC and pTSSCm-Pcap offer the potential to generate C-terminal RGS-His6 translational fusions of cloned genes using simple molecular manipulation. BcgI-induced DNA excision followed by religation converts the TGA stop codon of the MCS into a TGC codon and links the rgs-his6 codons to the 3' end of the target gene. The generation of the rgs-his6 codon-fusion, gene expression, and protein purification were demonstrated in both S. aureus and E. coli using the macrolide-lincosamide-streptogramin B resistance gene erm(44) inserted downstream of Pcap. The new His tag expression system represents a helpful tool for the direct analysis of target gene function in staphylococcal cells.
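The effect of converting the TGA stop codon into TGC can be illustrated with a minimal Python sketch. The gene and tag sequences below are hypothetical stand-ins chosen for illustration, not the actual pBUS1-Pcap-HC or pTSSCm-Pcap sequences, and the sketch models only the codon change, not the BcgI excision and religation themselves.

```python
# Minimal codon table covering only the codons used in this sketch.
CODONS = {
    "ATG": "M", "GCT": "A", "AAA": "K", "TGC": "C",
    "AGA": "R", "GGA": "G", "TCG": "S", "CAT": "H", "CAC": "H",
    "TGA": "*", "TAA": "*",
}

def translate(dna: str) -> str:
    """Translate in frame from position 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODONS[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical target gene (Met-Ala-Lys) and hypothetical rgs-his6 codons
# (Arg-Gly-Ser followed by six His codons and a stop codon).
GENE = "ATGGCTAAA"
TAG = "AGAGGATCGCATCACCATCACCATCAC" + "TAA"

before = translate(GENE + "TGA" + TAG)  # stop codon ends translation before the tag
after = translate(GENE + "TGC" + TAG)   # TGC (Cys) lets translation read into the tag

print(before)
print(after)
```

With the stop codon in place, translation ends at the target gene; once it reads TGC, the same frame continues through a cysteine into the RGS-His6 tag.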
Abstract:
Historically, morphological features were used as the primary means to classify organisms. However, the age of molecular genetics has allowed us to approach this field from the perspective of the organism's genetic code. Early work used highly conserved sequences, such as ribosomal RNA. The increasing number of complete genomes in the public data repositories provides the opportunity to look not only at a single gene, but at organisms' entire parts list. Here the Sequence Comparison Index (SCI) and the Organism Comparison Index (OCI), algorithms and methods to compare proteins and proteomes, are presented. The complete proteomes of 104 sequenced organisms were compared. Over 280 million full Smith-Waterman alignments were performed on sequence pairs which had a reasonable expectation of being related. From these alignments a whole proteome phylogenetic tree was constructed. This method was also used to compare the small subunit (SSU) rRNA from each organism, and a tree was constructed from these results. The SSU rRNA tree by the SCI/OCI method looks very much like accepted SSU rRNA trees from sources such as the Ribosomal Database Project, thus validating the method. The SCI/OCI proteome tree showed a number of small but significant differences when compared to the SSU rRNA tree and proteome trees constructed by other methods. Horizontal gene transfer does not appear to affect the SCI/OCI trees until the transferred genes make up a large portion of the proteome. As part of this work, the Database of Related Local Alignments (DaRLA) was created and contains over 81 million rows of sequence alignment information. DaRLA, while primarily used to build the whole proteome trees, can also be applied to shared gene content analysis, gene order analysis, and creating individual protein trees. Finally, the standard BLAST method for analyzing shared gene content was compared to the SCI method using 4 spirochetes. The SCI system performed flawlessly, finding all proteins from one organism against itself and finding all the ribosomal proteins between organisms. The BLAST system missed some proteins from its respective organism and failed to detect small ribosomal proteins between organisms.
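As an illustration of the local alignment step this abstract relies on, the following is a minimal Python sketch of Smith-Waterman scoring with a linear gap penalty. The scoring values (match +2, mismatch -1, gap -2) and the function name are assumptions chosen for illustration; the abstract does not state the substitution matrix or gap model used, and the SCI and OCI calculations themselves are not reproduced here.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in sequence b
            left = curr[j - 1] + gap  # gap in sequence a
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best

print(smith_waterman_score("ACGT", "ACGT"))
print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Only the score is computed here; recovering the aligned residues would require keeping the full matrix and tracing back from the best cell.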
Abstract:
Combinatorial libraries of synthetic and natural products are an important source of molecular information for the interrogation of biological targets. Methods for the intracellular production of libraries of small, stable molecules would be a valuable addition to existing library technologies by combining the discovery potential inherent in small molecules with the large library sizes that can be realized by intracellular methods. We have explored the use of split inteins (internal proteins) for the intracellular catalysis of peptide backbone cyclization as a method for generating proteins and small peptides that are stabilized against cellular catabolism. The DnaE split intein from Synechocystis sp. PCC6803 was used to cyclize the Escherichia coli enzyme dihydrofolate reductase and to produce the cyclic, eight-amino acid tyrosinase inhibitor pseudostellarin F in bacteria. Cyclic dihydrofolate reductase displayed improved in vitro thermostability, and pseudostellarin F production was readily apparent in vivo through its inhibition of melanin production catalyzed by recombinant Streptomyces antibioticus tyrosinase. The ability to generate and screen for backbone cyclic products in vivo is an important milestone toward the goal of generating intracellular cyclic peptide and protein libraries.
Abstract:
The parasitic bacterium Mycoplasma genitalium has a small, reduced genome with close to a basic set of genes. As a first step toward determining the families of protein domains that form the products of these genes, we have used the multiple sequence programs psi-blast and geanfammer to match the sequences of the 467 gene products of M. genitalium to the sequences of the domains that form proteins of known structure [Protein Data Bank (PDB) sequences]. PDB sequences (274) match all of 106 M. genitalium sequences and some parts of another 85; thus, 41% of its total sequences are matched in all or part. The evolutionary relationships of the PDB domains that match M. genitalium are described in the structural classification of proteins (SCOP) database. Using this information, we show that the domains in the matched M. genitalium sequences come from 114 superfamilies and that 58% of them have arisen by gene duplication. This level of duplication is more than twice that found by using pairwise sequence comparisons. The PDB domain matches also describe the domain structure of the matched sequences: just over a quarter contain one domain and the rest have combinations of two or more domains.
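The 41% figure follows directly from the counts given in the abstract. A small Python check, using only those numbers (the variable names are illustrative):

```python
total_sequences = 467  # M. genitalium gene products
fully_matched = 106    # sequences matched over their whole length
partly_matched = 85    # sequences matched in part

matched = fully_matched + partly_matched
percent_matched = round(100 * matched / total_sequences)

print(matched, percent_matched)
```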
Abstract:
Amphipols are a new class of surfactants that make it possible to handle membrane proteins in detergent-free aqueous solution as though they were soluble proteins. The strongly hydrophilic backbone of these polymers is grafted with hydrophobic chains, making them amphiphilic. Amphipols can stabilize four well-characterized integral membrane proteins in their native state in aqueous solution: (i) bacteriorhodopsin, (ii) a bacterial photosynthetic reaction center, (iii) cytochrome b6f, and (iv) matrix porin.
Abstract:
The Kabat Database was initially started in 1970 to determine the combining site of antibodies based on the available amino acid sequences. The precise delineation of complementarity determining regions (CDR) of both light and heavy chains provides the first example of how properly aligned sequences can be used to derive structural and functional information of biological macromolecules. This knowledge has subsequently been applied to the construction of artificial antibodies with prescribed specificities, and to many other studies. The Kabat database now includes nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules, and other proteins of immunological interest. While new sequences are continually added into this database, we have undertaken the task of developing more analytical methods to study the information content of this collection of aligned sequences. New examples of analysis will be illustrated on a yearly basis. The Kabat Database and its applications are freely available at http://immuno.bme.nwu.edu.
Abstract:
Ligand-Gated Ion Channels (LGIC) are polymeric transmembrane proteins involved in the fast response to numerous neurotransmitters. All these receptors are formed by homologous subunits, and the last two decades have revealed an unexpected wealth of genes coding for these subunits. The Ligand-Gated Ion Channel database (LGICdb) has been developed to handle this increasing amount of data. The database aims to provide only one entry for each gene, containing annotated nucleic acid and protein sequences. The repository is carefully structured and the entries can be retrieved by various criteria. In addition to the sequences, the LGICdb provides multiple sequence alignments, phylogenetic analyses and atomic coordinates when available. The database is accessible via the World Wide Web (http://www.pasteur.fr/recherche/banques/LGIC/LGIC.html), where it is continuously updated. Version 16 (September 2000), available for download, contained 333 entries covering 34 species.
Abstract:
Signal recognition particle (SRP) is a stable cytoplasmic ribonucleoprotein complex that serves to translocate secretory proteins across membranes during translation. The SRP Database (SRPDB) provides compilations of SRP components, ordered alphabetically and phylogenetically. Alignments emphasize phylogenetically-supported base pairs in SRP RNA and conserved residues in the proteins. Data are provided in various formats including a column arrangement for improved access and simplified computational usability. Included are motifs for identification of new sequences, SRP RNA secondary structure diagrams, 3-D models and links to high-resolution structures. This release includes 11 new SRP RNA sequences (total of 129), two protein SRP9 sequences (total of seven), two protein SRP14 sequences (total of 10), two protein SRP19 sequences (total of 16), 10 new SRP54 (ffh) sequences (total of 66), two protein SRP68 sequences (total of seven) and two protein SRP72 sequences (total of nine). Seven sequences of the SRP receptor α-subunit and its FtsY homolog (total of 51) are new. Also considered are β-subunit of SRP receptor, Flhf, Hbsu, CaM kinase II and cpSRP43. Access to SRPDB is at http://psyche.uthct.edu/dbs/SRPDB/SRPDB.html and the European mirror http://www.medkem.gu.se/dbs/SRPDB/SRPDB.html
Abstract:
The IMGT/HLA Database (www.ebi.ac.uk/imgt/hla/) specialises in sequences of polymorphic genes of the HLA system, the human major histocompatibility complex (MHC). The HLA complex is located within the 6p21.3 region on the short arm of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and these include the 21 highly polymorphic HLA genes, which influence the outcome of clinical transplantation and confer susceptibility to a wide range of non-infectious diseases. The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools and detailed descriptions of the source cells. The online IMGT/HLA submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. The latest version (release 1.7.0 July 2000) contains 1220 HLA alleles derived from over 2700 component sequences from the EMBL/GenBank/DDBJ databases. The HLA database provides a model which will be extended to provide specialist databases for polymorphic MHC genes of other species.
Abstract:
The Helix Research Institute (HRI) in Japan is releasing 4356 HUman Novel Transcripts and related information in the newly established HUNT database. The institute is a joint research project principally funded by the Japanese Ministry of International Trade and Industry, and the clones were sequenced in the governmental New Energy and Industrial Technology Development Organization (NEDO) Human cDNA Sequencing Project. The HUNT database contains an extensive amount of annotation from advanced analysis and represents an essential bioinformatics contribution towards understanding gene function. The HRI human cDNA clones were obtained from full-length enriched cDNA libraries constructed with the oligo-capping method and have resulted in novel full-length cDNA sequences. A large fraction has little similarity to any proteins of known function, and to obtain clues about possible function we have developed original analysis procedures. Any putative function deduced here can be validated or refuted by complementary analysis results. The user can also extract information from specific categories like PROSITE patterns, PFAM domains, PSORT localization, transmembrane helices and clones with GENIUS structure assignments. The HUNT database can be accessed at http://www.hri.co.jp/HUNT.
Abstract:
Aminoacyl-tRNA synthetases (AARSs) are at the center of the question of the origin of life. They constitute a family of enzymes integrating the two levels of cellular organization: nucleic acids and proteins. AARSs arose early in evolution and are believed to be a group of ancient proteins. They are responsible for attaching amino acid residues to their cognate tRNA molecules, which is the first step in protein synthesis. The role they play in a living cell is essential for the precise deciphering of the genetic code. For a long time, analysis of the evolutionary history of AARSs was not possible because too few of their amino acid sequences were available. The emerging picture of synthetases’ evolution is a result of recent achievements in genomics [Woese,C., Olsen,G.J., Ibba,M. and Söll,D. (2000) Microbiol. Mol. Biol. Rev., 64, 202–236]. In this paper we present a short introduction to the AARSs database. The updated database contains 1047 AARS primary structures from archaebacteria, eubacteria, mitochondria, chloroplasts and eukaryotic cells. It is the compilation of amino acid sequences of all AARSs known to date, which are available as separate entries via the WWW at http://biobases.ibch.poznan.pl/aars/.
Abstract:
The Conserved Key Amino Acid Positions DataBase (CKAAPs DB) provides access to an analysis of structurally similar proteins with dissimilar sequences where key residues within a common fold are identified. The derivation and significance of CKAAPs starting from pairwise structure alignments is described fully in Reddy et al. [Reddy,B.V.B., Li,W.W., Shindyalov,I.N. and Bourne,P.E. (2000) Proteins, in press]. The CKAAPs identified from this theoretical analysis are provided to experimentalists and theoreticians for potential use in protein engineering and modeling. It has been suggested that CKAAPs may be crucial features for protein folding, structural stability and function. Over 170 substructures, as defined by the Combinatorial Extension (CE) database, which are found in approximately 3000 representative polypeptide chains have been analyzed and are available in the CKAAPs DB. CKAAPs DB also provides CKAAPs of the representative set of proteins derived from the CE and FSSP databases. Thus the database contains over 5000 representative polypeptide chains, covering all known structures in the PDB. A web interface to a relational database permits fast retrieval of structure-sequence alignments, CKAAPs and associated statistics. Users may query by PDB ID, protein name, function and Enzyme Classification number. Users may also submit protein alignments of their own to obtain CKAAPs. An interface to display CKAAPs on each structure from a web browser is also being implemented. CKAAPs DB is maintained by the San Diego Supercomputer Center and accessible at the URL http://ckaaps.sdsc.edu.
Abstract:
The Biomolecular Interaction Network Database (BIND; http://binddb.org) is a database designed to store full descriptions of interactions, molecular complexes and pathways. Development of the BIND 2.0 data model has led to the incorporation of virtually all components of molecular mechanisms including interactions between any two molecules composed of proteins, nucleic acids and small molecules. Chemical reactions, photochemical activation and conformational changes can also be described. Everything from small molecule biochemistry to signal transduction is abstracted in such a way that graph theory methods may be applied for data mining. The database can be used to study networks of interactions, to map pathways across taxonomic branches and to generate information for kinetic simulations. BIND anticipates the coming large influx of interaction information from high-throughput proteomics efforts including detailed information about post-translational modifications from mass spectrometry. Version 2.0 of the BIND data model is discussed as well as implementation, content and the open nature of the BIND project. The BIND data specification is available as ASN.1 and XML DTD.
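The idea of abstracting interactions so that graph theory methods can be applied may be sketched as follows. This is a minimal Python illustration, not the BIND data model: the molecule names are hypothetical placeholders, and the representation (an undirected adjacency map searched breadth-first) is an assumption made for the example.

```python
from collections import deque

def build_graph(interactions):
    """Build an undirected adjacency map from (molecule, molecule) pairs."""
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of interactions or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph[node]):  # sorted for deterministic order
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None

# Hypothetical interaction records, for illustration only.
interactions = [
    ("ligand", "receptor"),
    ("receptor", "kinase"),
    ("kinase", "transcription_factor"),
    ("kinase", "phosphatase"),
]
graph = build_graph(interactions)
path = shortest_path(graph, "ligand", "transcription_factor")
print(path)
```

Treating each interaction as an edge is what makes questions such as "which molecules connect a ligand to a transcription factor" answerable with standard graph searches.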