30 results for GenBank
in National Center for Biotechnology Information - NCBI
Abstract:
To get a better understanding of mutagenic mechanisms in humans, we have cloned and sequenced the human homolog of the Saccharomyces cerevisiae REV3 gene. The yeast gene encodes the catalytic subunit of DNA polymerase ζ, a nonessential enzyme that is thought to carry out translesion replication and is responsible for virtually all DNA damage-induced mutagenesis and the majority of spontaneous mutagenesis. The human gene encodes an expected protein of 3,130 residues, about twice the size of the yeast protein (1,504 aa). The two proteins are 29% identical in an amino-terminal region of ≈340 residues, 39% identical in a carboxyl-terminal region of ≈850 residues, and 29% identical in a 55-residue region in the middle of the two genes. The sequence of the expected protein strongly predicts that it is the catalytic subunit of a DNA polymerase of the pol ζ type; the carboxyl-terminal domain possesses, in the right order, the six motifs characteristic of eukaryotic DNA polymerases, most closely resembles yeast pol ζ among all polymerases in the GenBank database, and is different from the human α, δ, and ɛ enzymes. Human cells expressing high levels of an hsREV3 antisense RNA fragment grow normally, but show little or no UV-induced mutagenesis and are slightly more sensitive to killing by UV. The human gene therefore appears to carry out a function similar to that of its yeast counterpart.
Abstract:
A computational system for the prediction of polymorphic loci directly and efficiently from human genomic sequence was developed and verified. A suite of programs, collectively called pompous (polymorphic marker prediction of ubiquitous simple sequences) detects tandem repeats ranging from dinucleotides up to 250 mers, scores them according to predicted level of polymorphism, and designs appropriate flanking primers for PCR amplification. This approach was validated on an approximately 750-kilobase region of human chromosome 3p21.3, involved in lung and breast carcinoma homozygous deletions. Target DNA from 36 paired B lymphoblastoid and lung cancer lines was amplified and allelotyped for 33 loci predicted by pompous to be variable in repeat size. We found that among those 36 predominately Caucasian individuals 22 of the 33 (67%) predicted loci were polymorphic with an average heterozygosity of 0.42. Allele loss in this region was found in 27/36 (75%) of the tumor lines using these markers. pompous provides the genetic researcher with an additional tool for the rapid and efficient identification of polymorphic markers, and through a World Wide Web site, investigators can use pompous to identify polymorphic markers for their research. A catalog of 13,261 potential polymorphic markers and associated primer sets has been created from the analysis of 141,779,504 base pairs of human genomic sequence in GenBank. This data is available on our Web site (pompous.swmed.edu) and will be updated periodically as GenBank is expanded and algorithm accuracy is improved.
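The tandem-repeat detection step described in this abstract can be illustrated with a minimal Python sketch. It is not the pompous algorithm: the function name `find_tandem_repeats`, the default unit lengths (2 to 6 bases), and the minimum copy number are assumptions chosen for illustration. Polymorphism scoring and primer design are not attempted.

```python
import re

def find_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=4):
    """Return (start, unit, copies) tuples for simple tandem repeats.

    Hypothetical helper for illustration only; thresholds are assumptions.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One unit of unit_len bases followed by at least min_copies - 1
        # further exact copies of the same unit.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter unit
            # (e.g. "CACA" or "AA"), so each locus is reported once.
            if unit in (unit + unit)[1:-1]:
                continue
            copies = len(m.group(0)) // unit_len
            hits.append((m.start(), unit, copies))
    return hits

if __name__ == "__main__":
    example = "GGT" + "CA" * 10 + "TTG"
    for start, unit, copies in find_tandem_repeats(example):
        print(start, unit, copies)
```

Only perfect repeats are reported here; a practical tool would also need to tolerate interrupted repeats and much longer units, such as the 250-mers mentioned in the abstract.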
Abstract:
Spectrin is an important structural component of the plasma membrane skeleton. Heretofore-unidentified isoforms of spectrin also associate with Golgi and other organelles. We have discovered another member of the β-spectrin gene family by homology searches of the GenBank databases and by 5′ rapid amplification of cDNA ends of human brain cDNAs. Collectively, 7,938 nucleotides of contiguous clones are predicted to encode a 271,294-Da protein, called βIII spectrin, with conserved actin-, protein 4.1-, and ankyrin-binding domains, membrane association domains 1 and 2, a spectrin dimer self-association site, and a pleckstrin-homology domain. βIII spectrin transcripts are concentrated in the brain and present in the kidneys, liver, and testes and the prostate, pituitary, adrenal, and salivary glands. All of the tested tissues contain major 9.0-kb and minor 11.3-kb transcripts. The human βIII spectrin gene (SPTBN2) maps to chromosome 11q13 and the mouse gene (Spnb3) maps to a syntenic region close to the centromere on chromosome 19. Indirect immunofluorescence studies of cultured cells using antisera specific to human βIII spectrin reveal a Golgi-associated and punctate cytoplasmic vesicle-like distribution, suggesting that βIII spectrin associates with intracellular organelles. This distribution overlaps that of several Golgi and vesicle markers, including mannosidase II, p58, trans-Golgi network (TGN)38, and β-COP and is distinct from the endoplasmic reticulum markers calnexin and Bip. Liver Golgi membranes and other vesicular compartment markers cosediment in vitro with βIII spectrin. βIII spectrin thus constitutes a major component of the Golgi and vesicular membrane skeletons.
Abstract:
Exposure of human and rodent cells to a wide variety of chemoprotective compounds confers resistance against a broad set of carcinogens. For a subset of the chemoprotective compounds, protection is generated by an increase in the abundance of protective enzymes like glutathione S-transferases (GST). Antioxidant responsive elements (AREs) mediate the transcriptional induction of a battery of genes which comprise much of this chemoprotective response system. Past studies identified a necessary ARE “core” sequence of RTGACnnnGC, but this sequence alone is insufficient to mediate induction. In this study, the additional sequences necessary to define a sufficient, functional ARE are identified through systematic mutational analysis of the murine GST Ya ARE. Introduction of the newly identified necessary nucleotides into the regions flanking a nonresponsive, ARE-like, GST-Mu promoter sequence produced an inducible element. A screen of the GenBank database with the newly identified ARE consensus identified 16 genes which contained the functional ARE consensus sequence in their promoters. Included within this group was an ARE sequence from the murine ferritin-L promoter that mediated induction when tested. In an electrophoretic mobility-shift assay, the ferritin-L ARE was bound by ARE–binding protein 1, a protein previously identified as the likely mediator of the chemoprotective response. A three-level ARE classification system is presented to account for the distinct induction strengths observed in our mutagenesis studies. A model of the ARE as a composite regulatory site, where multiple transcription factors interact, is presented to account for the complex characteristics of ARE-mediated chemoprotective gene expression.
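A scan for the ARE "core" sequence RTGACnnnGC quoted in this abstract might look like the Python sketch below. It translates the IUPAC codes (R = A or G, N = any base) into a regular expression and checks both strands. The function names and the returned tuple layout are assumptions. As the abstract notes, the core alone is insufficient for induction, so these hits are only candidate sites; the extended consensus is not given in the abstract and is not modelled.

```python
import re

# IUPAC codes needed for the core motif; extend if other codes are used.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def motif_to_regex(motif):
    """Translate an IUPAC motif string into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())

def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_core_sites(seq, motif="RTGACNNNGC"):
    """Return sorted (start, strand, site) tuples; start is 0-based on the
    forward strand. Hypothetical helper for illustration only."""
    seq = seq.upper()
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(%s))" % motif_to_regex(motif))
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand offset to a forward-strand coordinate.
        start = len(seq) - (m.start() + len(motif))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)

if __name__ == "__main__":
    print(find_core_sites("CCGTGACTCAGCAA"))
```

Searching the reverse complement matters because a promoter element can sit on either strand of a database entry.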
Abstract:
The final step in glycosylphosphatidylinositol (GPI) anchoring of cell surface proteins consists of a transamidation reaction in which preassembled GPI donors are substituted for C-terminal signal sequences in nascent polypeptides. In previous studies we described a human K562 cell mutant, termed class K, that accumulates fully assembled GPI units but is unable to transfer them to N-terminally processed proproteins. In further work we showed that, unlike wild-type microsomes, microsomes from these cells are unable to support C-terminal interaction of proproteins with the small nucleophiles hydrazine or hydroxylamine, and that the cells thus are defective in transamidation. In this study, using a modified recombinant vaccinia transient transfection system in conjunction with a composite cDNA prepared by 5′ extension of an existing GenBank sequence, we found that the genetic element affected in these cells corresponds to the human homolog of yGPI8, a gene affected in a yeast mutant strain exhibiting similar accumulation of GPI donors without transfer. hGPI8 gives rise to mRNAs of 1.6 and 1.9 kb, both encoding a protein of 395 amino acids that varies in cells with their ability to couple GPIs to proteins. The gene spans ≈25 kb of DNA on chromosome 1. Reconstitution of class K cells with hGPI8 abolishes their accumulation of GPI precursors and restores C-terminal processing of GPI-anchored proteins. Also, hGPI8 restores the ability of microsomes from the mutant cells to yield an active carbonyl in the presence of a proprotein which is considered to be an intermediate in catalysis by a transamidase.
Abstract:
The rpoH regulatory region of different members of the enteric bacteria family was sequenced or downloaded from GenBank and compared. In addition, the transcriptional start sites of rpoH of Yersinia frederiksenii and Proteus mirabilis, two distant members of this family, were determined. Sequences similar to the σ70 promoters P1, P4 and P5, to the σE promoter P3 and to boxes DnaA1, DnaA2, cAMP receptor protein (CRP) boxes CRP1, CRP2 and box CytR present in Escherichia coli K12, were identified in sequences of closely related bacteria such as: E.coli, Shigella flexneri, Salmonella enterica serovar Typhimurium, Citrobacter freundii, Enterobacter cloacae and Klebsiella pneumoniae. In more distant bacteria, Y.frederiksenii and P.mirabilis, the rpoH regulatory region has a distal P1-like σ70 promoter and two proximal promoters: a heat-induced σE-like promoter and a σ70 promoter. Sequences similar to the regulatory boxes were not identified in these bacteria. This study suggests that the general pattern of transcription of the rpoH gene in enteric bacteria includes a distal σ70 promoter, >200 nt upstream of the initiation codon, and two proximal promoters: a heat-induced σE-like promoter and a σ70 promoter. A second proximal σ70 promoter under catabolite-regulation is probably present only in bacteria closely related to E.coli.
Abstract:
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT.
Abstract:
FULL-malaria is a database for a full-length-enriched cDNA library from the human malaria parasite Plasmodium falciparum (http://133.11.149.55/). Because of its medical importance, this organism is the first target for genome sequencing of a eukaryotic pathogen; the sequences of two of its 14 chromosomes have already been determined. However, for the full exploitation of this rapidly accumulating information, correct identification of the genes and study of their expression are essential. Using the oligo-capping method, we have produced a full-length-enriched cDNA library from erythrocytic stage parasites and performed one-pass reading. The database consists of nucleotide sequences of 2490 random clones that include 390 (16%) known malaria genes according to BLASTN analysis of the nr-nt database in GenBank; these represent 98 genes, and the clones for 48 of these genes contain the complete protein-coding sequence (49%). On the other hand, comparisons with the complete chromosome 2 sequence revealed that 35 of 210 predicted genes are expressed, and in addition led to detection of three new gene candidates that were not previously known. In total, 19 of these 38 clones (50%) were full-length. From these observations, it is expected that the database contains ∼1000 genes, including 500 full-length clones. It should be an invaluable resource for the development of vaccines and novel drugs.
Abstract:
The IMGT/HLA Database (www.ebi.ac.uk/imgt/hla/) specialises in sequences of polymorphic genes of the HLA system, the human major histocompatibility complex (MHC). The HLA complex is located within the 6p21.3 region on the short arm of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and these include the 21 highly polymorphic HLA genes, which influence the outcome of clinical transplantation and confer susceptibility to a wide range of non-infectious diseases. The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools and detailed descriptions of the source cells. The online IMGT/HLA submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. The latest version (release 1.7.0 July 2000) contains 1220 HLA alleles derived from over 2700 component sequences from the EMBL/GenBank/DDBJ databases. The HLA database provides a model which will be extended to provide specialist databases for polymorphic MHC genes of other species.
Abstract:
In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database [S.T.Sherry, M.Ward and K.Sirotkin (1999) Genome Res., 9, 677–679]. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at website: http://www.ncbi.nlm.nih.gov/SNP. The complete contents of dbSNP can also be downloaded in multiple formats via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/.
Abstract:
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, GeneMap’99, Human–Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP), SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.
Abstract:
The ARKdb genome databases provide comprehensive public repositories for genome mapping data from farmed species and other animals (http://www.thearkdb.org) providing a resource similar in function to that offered by GDB or MGD for human or mouse genome mapping data, respectively. Because we have attempted to build a generic mapping database, the system has wide utility, particularly for those species for which development of a specific resource would be prohibitive. The ARKdb genome database model has been implemented for 10 species to date. These are pig, chicken, sheep, cattle, horse, deer, tilapia, cat, turkey and salmon. Access to the ARKdb databases is effected via the World Wide Web using the ARKdb browser and Anubis map viewer. The information stored includes details of loci, maps, experimental methods and the source references. Links to other information sources such as PubMed and EMBL/GenBank are provided. Responsibility for data entry and curation is shared amongst scientists active in genome research in the species of interest. Mirror sites in the United States are maintained in addition to the central genome server at Roslin.
Abstract:
While genome sequencing projects are advancing rapidly, EST sequencing and analysis remains a primary research tool for the identification and categorization of gene sequences in a wide variety of species and an important resource for annotation of genomic sequence. The TIGR Gene Indices (http://www.tigr.org/tdb/tgi.shtml) are a collection of species-specific databases that use a highly refined protocol to analyze EST sequences in an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed by first clustering, then assembling EST and annotated gene sequences from GenBank for the targeted species. This process produces a set of unique, high-fidelity virtual transcripts, or Tentative Consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, to provide links between orthologous and paralogous genes and as a resource for comparative sequence analysis.
Abstract:
The objective of the AsMamDB database is to facilitate the systematic study of alternatively spliced genes of mammals. Version 1.0 of AsMamDB contains 1563 alternatively spliced genes of human, mouse and rat, each associated with a cluster of nucleotide sequences. The main information provided by AsMamDB includes gene alternative splicing patterns, gene structures, locations in chromosomes, products of genes and the tissues in which they are expressed. Alternative splicing patterns are represented by multiple alignments of various gene transcripts and by graphs of their topological structures. Gene structures are illustrated by the distributions of exons, introns and various regulatory elements. There are 4204 DNAs, 3977 mRNAs, 8989 CDSs and 126 931 ESTs in the current database. More than 130 000 GenBank entries are covered and 4443 MEDLINE records are linked. DNA, mRNA, exon, intron and relevant regulatory element sequences are provided in FASTA format. More information can be obtained by using the web-based multiple alignment tool Asalign and various category lists. AsMamDB can be accessed at http://166.111.30.65/ASMAMDB.html.