15 results for Database accession number
in National Center for Biotechnology Information - NCBI
Abstract:
GlycoSuiteDB is a relational database that curates information from the scientific literature on glycoprotein-derived glycan structures, their biological sources, the references in which the glycan was described and the methods used to determine the glycan structure. To date, the database includes most published O-linked oligosaccharides from the last 50 years and most N-linked oligosaccharides that were published in the 1990s. For each structure, information is available concerning the glycan type, linkage and anomeric configuration, mass and composition. Detailed information is also provided on native and recombinant sources, including tissue and/or cell type, cell line, strain and disease state. Where known, the proteins to which the glycan structures are attached are reported, and cross-references to the SWISS-PROT/TrEMBL protein sequence databases are given if applicable. The GlycoSuiteDB annotations include literature references which are linked to PubMed, and detailed information on the methods used to determine each glycan structure is noted to help the user assess the quality of the structural assignment. GlycoSuiteDB has a user-friendly web interface which allows the researcher to query the database using monoisotopic or average mass, monosaccharide composition, glycosylation linkages (e.g. N- or O-linked), reducing terminal sugar, attached protein, taxonomy, tissue or cell type and GlycoSuiteDB accession number. Advanced queries using combinations of these parameters are also possible. GlycoSuiteDB can be accessed on the web at http://www.glycosuite.com.
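As a rough illustration of a query by monoisotopic mass and monosaccharide composition, the sketch below computes the mass of a free glycan from residue counts and filters a small list of records by a mass window. It is not GlycoSuiteDB's own code or schema. The record layout, the identifiers, the function names and the 0.05 Da tolerance are all assumptions made for the example. The residue masses are standard monoisotopic values for underivatized residues.

```python
# Standard monoisotopic residue masses in Da (monosaccharide minus water).
RESIDUE_MASS = {
    "Hex": 162.05282,     # hexose, e.g. Man, Gal, Glc
    "HexNAc": 203.07937,  # N-acetylhexosamine, e.g. GlcNAc, GalNAc
    "dHex": 146.05791,    # deoxyhexose, e.g. Fuc
    "NeuAc": 291.09542,   # N-acetylneuraminic acid
}
WATER = 18.01056  # one water is added back for the free reducing end


def monoisotopic_mass(composition):
    """Mass of a free, underivatized glycan given {residue: count}."""
    residues = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return residues + WATER


def search_by_mass(records, target, tolerance=0.05):
    """Return records whose computed mass is within +/- tolerance of target."""
    return [
        r for r in records
        if abs(monoisotopic_mass(r["composition"]) - target) <= tolerance
    ]


# Hypothetical records; the ids are placeholders, not real accession numbers.
records = [
    {"id": "example-1", "composition": {"Hex": 3, "HexNAc": 2}},
    {"id": "example-2", "composition": {"Hex": 5, "HexNAc": 2}},
]
hits = search_by_mass(records, 910.33)
```

A real search would also need to account for reduced or derivatized glycans and adduct ions, which shift the observed mass; these are deliberately left out here.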
Abstract:
PseudoBase is a database containing structural, functional and sequence data related to RNA pseudoknots. It can be reached at http://wwwbio.LeidenUniv.nl/~Batenburg/PKB.html. For each pseudoknot, thirteen items are stored, for example the relevant sequence, the stem positions of the pseudoknot, the EMBL accession number of the sequence and the support that can be given regarding the reliability of the pseudoknot. Since the last publication, information on the sizes of the stems and loops in the pseudoknots has been added. Also added are alternative entry points that list where the pseudoknots occur, sorted by stem size or loop size.
Abstract:
The Ribosomal RNA Operon Copy Number Database (rrndb) is an Internet-accessible database containing annotated information on rRNA operon copy number among prokaryotes. Gene redundancy is uncommon in prokaryotic genomes, yet the rRNA genes can vary from one to as many as 15 copies. Despite the widespread use of 16S rRNA gene sequences for identification of prokaryotes, information on the number and sequence of individual rRNA genes in a genome is not readily accessible. In an attempt to understand the evolutionary implications of rRNA operon redundancy, we have created a phylogenetically arranged report on rRNA gene copy number for a diverse collection of prokaryotic microorganisms. Each entry (organism) in the rrndb contains detailed information linked directly to external websites including the Ribosomal Database Project, GenBank, PubMed and several culture collections. Data contained in the rrndb will be valuable to researchers investigating microbial ecology and evolution using 16S rRNA gene sequences. The rrndb web site is directly accessible on the WWW at http://rrndb.cme.msu.edu.
Abstract:
The nuclear and mitochondrial genomes coevolve to optimize approximately 100 different interactions necessary for an efficient ATP-generating system. This coevolution led to a species-specific compatibility between these genomes. We introduced mitochondrial DNA (mtDNA) from different primates into mtDNA-less human cells and selected for growth of cells with a functional oxidative phosphorylation system. mtDNA from common chimpanzee, pigmy chimpanzee, and gorilla were able to restore oxidative phosphorylation in the context of a human nuclear background, whereas mtDNA from orangutan, and species representative of Old-World monkeys, New-World monkeys, and lemurs were not. Oxygen consumption, a sensitive index of respiratory function, showed that mtDNA from chimpanzee, pigmy chimpanzee, and gorilla replaced the human mtDNA and restored respiration to essentially normal levels. Mitochondrial protein synthesis was also unaltered in successful “xenomitochondrial cybrids.” The abrupt failure of mtDNA from primate species that diverged from humans as recently as 8–18 million years ago to functionally replace human mtDNA suggests the presence of one or a few mutations affecting critical nuclear–mitochondrial genome interactions between these species. These cellular systems provide a demonstration of intergenus mtDNA transfer, expand more than 20-fold the number of mtDNA polymorphisms that can be analyzed in a human nuclear background, and provide a novel model for the study of nuclear–mitochondrial interactions.
Abstract:
Sequences of the variable heavy (VH) and κ (Vκ) domains of Ig structures were divided into 21 fragments that correspond to strands, loops, or parts of these structural units of the variable domains. Amino acid sequences of fragments (termed “words”) were collected from the 1,172 human heavy and 668 human κ chains available in the Kabat database. Statistical analysis of words of 17 fragments was performed (the fragments that comprise the complementarity-determining regions are not discussed in this paper). The number of different words (those with different residues in at least one position) ranged, for various fragments, from 11 to 75 in the κ chains, and from 23 to 189 in the heavy chains. The main result of this study is that very few keywords, or main patterns of words, were necessary to describe over 90% of the sequences (no more than two keywords per fragment in the κ and no more than five per fragment in the heavy chains). No identical keywords were found for different fragments of the variable domains. Keywords of aligned fragments of the VH and Vκ domains were different in all but two instances. Thus, knowing the keywords, one can determine whether any given small part of a sequence belongs to a heavy or κ chain and predict its precise localization in the sequence. In addition, by using all of the keywords obtained through analysis of the Kabat database, it was possible to describe completely the sequences of the human VH and Vκ germ-line segments.
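The word-counting step described above can be sketched roughly as follows. This is a simplification, not the authors' procedure: it treats a "keyword" as an exact, most-frequent word, whereas the abstract describes keywords as main patterns of words. The function name, the 90% threshold parameter and the example words are hypothetical.

```python
from collections import Counter


def keywords_for_coverage(words, coverage=0.90):
    """Pick the most frequent words until they cover the given fraction.

    `words` holds one fragment "word" per sequence, all taken from the
    same fragment position. Returns (chosen_words, number_of_distinct_words).
    """
    counts = Counter(words)
    total = len(words)
    chosen, covered = [], 0
    for word, n in counts.most_common():
        if covered / total >= coverage:
            break
        chosen.append(word)
        covered += n
    return chosen, len(counts)


# Hypothetical words for one fragment across 20 sequences.
words = ["TLTIS"] * 17 + ["TLTIT"] * 2 + ["SLTIS"]
chosen, n_distinct = keywords_for_coverage(words)
```

In this toy input, three different words occur, and the two most frequent ones already account for 19 of the 20 sequences.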
Abstract:
The National Institute of Standards and Technology (NIST) has compiled and maintained a Short Tandem Repeat DNA Internet Database, commonly referred to as STRBase (http://www.cstl.nist.gov/biotech/strbase/), since 1997. This database is an information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. STRBase consolidates and organizes the abundant literature on this subject to facilitate on-going efforts in DNA typing. Observed alleles and annotated sequence for each STR locus are described along with a review of STR analysis technologies. Additionally, commercially available STR multiplex kits are described, published polymerase chain reaction (PCR) primer sequences are reported, and validation studies conducted by a number of forensic laboratories are listed. To supplement the technical information, addresses for scientists and hyperlinks to organizations working in this area are available, along with a comprehensive reference list of over 1300 publications on STRs used for DNA typing purposes.
Abstract:
The adenylate uridylate-rich elements (AREs) mediate the rapid turnover of mRNAs encoding proteins that regulate cellular growth and body response to exogenous agents such as microbes, inflammatory and environmental stimuli. However, the full repertoire of ARE-containing mRNAs is unknown. Here, we explore the distribution of AREs in human mRNA sequences. Computational derivation of a 13-bp ARE pattern was performed using multiple expectation maximization for motif elicitations (MEME) and consensus analyses. This pattern was statistically validated for the specificity towards the 3′-untranslated region and not coding region. The computationally derived ARE pattern is the basis of a database which contains non-redundant full-length ARE-mRNAs. The ARE-mRNA database (ARED; http://rc.kfshrc.edu.sa/ared) reveals that ARE-mRNAs encode a wide repertoire of functionally diverse proteins that belong to different biological processes and are important in several disease states. Cluster analysis was performed using the ARE sequences to demonstrate potential relationships between the type and number of ARE motifs, and the functional characteristics of the proteins.
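To make the idea of scanning the 3′-untranslated region for an ARE pattern concrete, here is a minimal sketch. The abstract does not state the 13-bp pattern, so the example uses the well-known AUUUA core pentamer as a stand-in. Plain regular-expression matching replaces the MEME and consensus analyses, and the sequence, the function names and the UTR start coordinate are hypothetical.

```python
import re

# Subset of IUPAC nucleotide codes, written for RNA (W = A or U).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "W": "[AU]", "N": "[ACGU]"}


def pattern_to_regex(pattern):
    """Compile an IUPAC pattern; the lookahead lets overlapping hits match."""
    body = "".join(IUPAC[ch] for ch in pattern)
    return re.compile("(?=(" + body + "))")


def find_motif(seq, pattern, start=0, end=None):
    """Return 0-based start positions of `pattern` within seq[start:end]."""
    regex = pattern_to_regex(pattern)
    return [start + m.start() for m in regex.finditer(seq[start:end])]


# Hypothetical mRNA fragment; suppose the 3' UTR begins at index 5.
mrna = "GGCAUUUAUUUAGC"
all_hits = find_motif(mrna, "AUUUA")
utr_hits = find_motif(mrna, "AUUUA", start=5)
```

Restricting the scan with `start` mirrors the abstract's point that the pattern was validated for the 3′-untranslated region rather than the coding region.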
Abstract:
The Mouse Tumor Biology (MTB) Database serves as a curated, integrated resource for information about tumor genetics and pathology in genetically defined strains of mice (i.e., inbred, transgenic and targeted mutation strains). Sources of information for the database include the published scientific literature and direct data submissions by the scientific community. Researchers access MTB using Web-based query forms and can use the database to answer such questions as ‘What tumors have been reported in transgenic mice created on a C57BL/6J background?’, ‘What tumors in mice are associated with mutations in the Trp53 gene?’ and ‘What pathology images are available for tumors of the mammary gland regardless of genetic background?’. MTB has been available on the Web since 1998 from the Mouse Genome Informatics web site (http://www.informatics.jax.org). We have recently implemented a number of enhancements to MTB including new query options, redesigned query forms and results pages for pathology and genetic data, and the addition of an electronic data submission and annotation tool for pathology data.
Abstract:
The Ribosomal Database Project (RDP-II), previously described by Maidak et al. [Nucleic Acids Res. (2000), 28, 173–174], continued during the past year to add new rRNA sequences to the aligned data and to improve the analysis commands. Release 8.0 (June 1, 2000) consisted of 16 277 aligned prokaryotic small subunit (SSU) rRNA sequences while the number of eukaryotic and mitochondrial SSU rRNA sequences in aligned form remained at 2055 and 1503, respectively. The number of prokaryotic SSU rRNA sequences more than doubled from the previous release 14 months earlier, and ~75% are longer than 899 bp. An RDP-II mirror site in Japan is now available (http://wdcm.nig.ac.jp/RDP/html/index.html). RDP-II provides aligned and annotated rRNA sequences, derived phylogenetic trees and taxonomic hierarchies, and analysis services through its WWW server (http://rdp.cme.msu.edu/). Analysis services include rRNA probe checking, approximate phylogenetic placement of user sequences, screening user sequences for possible chimeric rRNA sequences, automated alignment, production of similarity matrices and services to plan and analyze terminal restriction fragment polymorphism experiments. The RDP-II email address for questions and comments has been changed from curator@cme.msu.edu to rdpstaff@msu.edu.
Abstract:
Since the completion of the Saccharomyces cerevisiae genomic sequence in 1996 [Goffeau,A. et al. (1997) Nature, 387, 5], several creative and ambitious projects have been initiated to explore the functions of gene products or gene expression on a genome-wide scale. To help researchers take advantage of these projects, the Saccharomyces Genome Database (SGD) has created two new tools, Function Junction and Expression Connection. Together, the tools form a central resource for querying multiple large-scale analysis projects for data about individual genes. Function Junction provides information from diverse projects that shed light on the role a gene product plays in the cell, while Expression Connection delivers information produced by the ever-increasing number of microarray projects. WWW access to SGD is available at genome-www.stanford.edu/Saccharomyces/.
Abstract:
The Database of Interacting Proteins (DIP; http://dip.doe-mbi.ucla.edu) is a database that documents experimentally determined protein–protein interactions. Since January 2000 the number of protein–protein interactions in DIP has nearly tripled to 3472 and the number of proteins to 2659. New interactive tools have been developed to aid in the visualization, navigation and study of networks of protein interactions.
Abstract:
Aminoacyl-tRNA synthetases (AARSs) are at the center of the question of the origin of life. They constitute a family of enzymes integrating the two levels of cellular organization: nucleic acids and proteins. AARSs arose early in evolution and are believed to be a group of ancient proteins. They are responsible for attaching amino acid residues to their cognate tRNA molecules, which is the first step in protein synthesis. The role they play in a living cell is essential for the precise deciphering of the genetic code. For a long time, the evolutionary history of AARSs could not be analyzed because too few of their amino acid sequences were available. The emerging picture of synthetases’ evolution is a result of recent achievements in genomics [Woese,C., Olsen,G.J., Ibba,M. and Söll,D. (2000) Microbiol. Mol. Biol. Rev., 64, 202–236]. In this paper we present a short introduction to the AARSs database. The updated database contains 1047 AARS primary structures from archaebacteria, eubacteria, mitochondria, chloroplasts and eukaryotic cells. It is the compilation of amino acid sequences of all AARSs known to date, which are available as separate entries via the WWW at http://biobases.ibch.poznan.pl/aars/.
Abstract:
The Conserved Key Amino Acid Positions DataBase (CKAAPs DB) provides access to an analysis of structurally similar proteins with dissimilar sequences where key residues within a common fold are identified. The derivation and significance of CKAAPs starting from pairwise structure alignments is described fully in Reddy et al. [Reddy,B.V.B., Li,W.W., Shindyalov,I.N. and Bourne,P.E. (2000) Proteins, in press]. The CKAAPs identified from this theoretical analysis are provided to experimentalists and theoreticians for potential use in protein engineering and modeling. It has been suggested that CKAAPs may be crucial features for protein folding, structural stability and function. Over 170 substructures, as defined by the Combinatorial Extension (CE) database, which are found in approximately 3000 representative polypeptide chains have been analyzed and are available in the CKAAPs DB. CKAAPs DB also provides CKAAPs of the representative set of proteins derived from the CE and FSSP databases. Thus the database contains over 5000 representative polypeptide chains, covering all known structures in the PDB. A web interface to a relational database permits fast retrieval of structure-sequence alignments, CKAAPs and associated statistics. Users may query by PDB ID, protein name, function and Enzyme Classification number. Users may also submit protein alignments of their own to obtain CKAAPs. An interface to display CKAAPs on each structure from a web browser is also being implemented. CKAAPs DB is maintained by the San Diego Supercomputer Center and accessible at the URL http://ckaaps.sdsc.edu.
Abstract:
The database reported here is derived using the Combinatorial Extension (CE) algorithm which compares pairs of protein polypeptide chains and provides a list of structurally similar proteins along with their structure alignments. Using CE, structure–structure alignments can provide insights into biological function. When a protein of known function is shown to be structurally similar to a protein of unknown function, a relationship might be inferred; a relationship not necessarily detectable from sequence comparison alone. Establishing structure–structure relationships in this way is of great importance as we enter an era of structural genomics where there is a likelihood of an increasing number of structures with unknown functions being determined. Thus the CE database is an example of a useful tool in the annotation of protein structures of unknown function. Comparisons can be performed on the complete PDB or on a structurally representative subset of proteins. The source protein(s) can be from the PDB (updated monthly) or uploaded by the user. CE provides sequence alignments resulting from structural alignments and Cartesian coordinates for the aligned structures, which may be analyzed using the supplied Compare3D Java applet, or downloaded for further local analysis. Searches can be run from the CE web site, http://cl.sdsc.edu/ce.html, or the database and software downloaded from the site for local use.
Abstract:
Methylation of cytosine in the 5 position of the pyrimidine ring is a major modification of the DNA in most organisms. In eukaryotes, the distribution and number of 5-methylcytosines (5mC) along the DNA is heritable but can also change with the developmental state of the cell and as a response to modifications of the environment. While DNA methylation probably has a number of functions, scientific interest has recently focused on the gene silencing effect methylation can have in eukaryotic cells. In particular, the discovery of changes in the methylation level during cancer development has increased the interest in this field. In the past, a vast amount of data has been generated with different levels of resolution ranging from 5mC content of total DNA to the methylation status of single nucleotides. We present here a database for DNA methylation data that attempts to unify these results in a common resource. The database is accessible via WWW (http://www.methdb.de). It stores information about the origin of the investigated sample and the experimental procedure, and contains the DNA methylation data. Query masks allow for searching for 5mC content, species, tissue, gene, sex, phenotype, sequence ID and DNA type. The output lists all available information including the relative gene expression level. DNA methylation patterns and methylation profiles are shown both as a graphical representation and as G/A/T/C/5mC-sequences or tables with sequence positions and methylation levels, respectively.