30 results for GenBank


Abstract:

The Ribosomal RNA Operon Copy Number Database (rrndb) is an Internet-accessible database containing annotated information on rRNA operon copy number among prokaryotes. Gene redundancy is uncommon in prokaryotic genomes, yet the rRNA genes can vary from one to as many as 15 copies. Despite the widespread use of 16S rRNA gene sequences for identification of prokaryotes, information on the number and sequence of individual rRNA genes in a genome is not readily accessible. In an attempt to understand the evolutionary implications of rRNA operon redundancy, we have created a phylogenetically arranged report on rRNA gene copy number for a diverse collection of prokaryotic microorganisms. Each entry (organism) in the rrndb contains detailed information linked directly to external websites including the Ribosomal Database Project, GenBank, PubMed and several culture collections. Data contained in the rrndb will be valuable to researchers investigating microbial ecology and evolution using 16S rRNA gene sequences. The rrndb web site is directly accessible on the WWW at http://rrndb.cme.msu.edu.

Abstract:

STACK is a tool for detection and visualisation of expressed transcript variation in the context of developmental and pathological states. The datasystem organises and reconstructs human transcripts from available public data in the context of expression state. The expression state of a transcript can include developmental state, pathological association, site of expression and isoform of expressed transcript. STACK consensus transcripts are reconstructed from clusters that capture and reflect the growing evidence of transcript diversity. The comprehensive capture of transcript variants is achieved by the use of a novel clustering approach that is tolerant of sub-sequence diversity and does not rely on pairwise alignment. This is in contrast with other gene indexing projects. STACK is generated at least four times a year and represents the exhaustive processing of all publicly available human EST data extracted from GenBank. This processed information can be explored through 15 tissue-specific categories, a disease-related category and a whole-body index and is accessible via WWW at http://www.sanbi.ac.za/Dbases.html. STACK represents a broadly applicable resource, as it is the only reconstructed transcript database for which the tools for its generation are also broadly available (http://www.sanbi.ac.za/CODES).

Abstract:

VIDA is a new virus database that organizes open reading frames (ORFs) from partial and complete genomic sequences from animal viruses. Currently VIDA includes all sequences from GenBank for Herpesviridae, Coronaviridae and Arteriviridae. The ORFs are organized into homologous protein families, which are identified on the basis of sequence similarity relationships. Conserved sequence regions of potential functional importance are identified and can be retrieved as sequence alignments. We use a controlled taxonomical and functional classification for all the proteins and protein families in the database. When available, protein structures that are related to the families have also been included. The database is available for online search and sequence information retrieval at http://www.biochem.ucl.ac.uk/bsm/virus_database/VIDA.html.

Abstract:

A database (SpliceDB) of known mammalian splice site sequences has been developed. We extracted 43 337 splice pairs from mammalian divisions of the gene-centered Infogene database, including sites from incomplete or alternatively spliced genes. Known EST sequences supported 22 815 of them. After discarding sequences with putative errors and ambiguous location of splice junctions, the verified dataset includes 22 489 entries. Of these, 98.71% contain canonical GT–AG junctions (22 199 entries) and 0.56% have non-canonical GC–AG splice site pairs. The remainder (0.73%) occurs in many small groups (with a maximum size of 0.05%). We especially studied non-canonical splice sites, which comprise 3.73% of GenBank annotated splice pairs. EST alignments allowed us to verify only the exonic part of splice sites. To check the conserved dinucleotides, we compared sequences of human non-canonical splice sites with sequences from the high throughput genome sequencing project (HTG). Out of 171 human non-canonical and EST-supported splice pairs, 156 (91.23%) had a clear match in the human HTG. After sequence analysis they can be classified as: 79 GC–AG pairs (of which one was an error that was corrected to GC–AG), 61 errors corrected to GT–AG canonical pairs, six AT–AC pairs (of which two were errors corrected to AT–AC), one case produced from a non-existent intron, seven cases found in HTG that were deposited to GenBank, and finally only two other cases of supported non-canonical splice pairs. The information about verified splice site sequences for canonical and non-canonical sites is presented in SpliceDB with the supporting evidence. We also built weight matrices for the major splice groups, which can be incorporated into gene prediction programs. SpliceDB is available at the computational genomic Web server of the Sanger Centre: http://genomic.sanger.ac.uk/spldb/SpliceDB.html and at http://www.softberry.com/spldb/SpliceDB.html.
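The grouping of splice pairs by their terminal dinucleotides can be illustrated with a short Python sketch. It is not the SpliceDB pipeline: the function name `classify_splice_pair` and the toy intron sequences are hypothetical, and the only assumption is that each input is an intron whose first two and last two bases are the donor and acceptor dinucleotides. The final lines recompute two of the percentages quoted in the abstract from its own counts.

```python
from collections import Counter


def classify_splice_pair(intron: str) -> str:
    """Return the donor-acceptor dinucleotide pair of an intron, e.g. 'GT-AG'."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to hold both dinucleotides")
    return f"{seq[:2]}-{seq[-2:]}"


# Hypothetical toy introns for illustration only (not SpliceDB data).
introns = ["GTAAGTCTTTCAG", "GCAAGTTTTCCAG", "ATATCCTTTTAC", "gtgagtccacag"]
counts = Counter(classify_splice_pair(s) for s in introns)

# Percentages recomputed from the counts given in the abstract.
canonical_pct = round(100 * 22199 / 22489, 2)   # canonical GT-AG entries
htg_match_pct = round(100 * 156 / 171, 2)       # pairs with a clear HTG match
```

Tallying pairs with a `Counter` keeps the rare groups visible instead of forcing them into a fixed set of categories, which matches the abstract's observation that the non-canonical remainder falls into many small groups.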

Abstract:

The human prion gene contains five copies of a 24 nt repeat that is highly conserved among species. An analysis of folding free energies of the human prion mRNA, in particular in the repeat region, suggested biased codon selection and the presence of RNA patterns. In particular, pseudoknots, similar to the one predicted by Wills in the human prion mRNA, were identified in the repeat region of all prion mRNAs available in GenBank, but not in those of birds and the red slider turtle. An alignment of these mRNAs, which share low sequence homology, shows several co-variations that maintain the pseudoknot pattern. The presence of pseudoknots in yeast Sup35p and Rnq1 suggests acquisition in the prokaryotic era. Computer generated three-dimensional structures of the human prion pseudoknot highlight protein and RNA interaction domains, which suggest a possible effect in prion protein translation. The role of pseudoknots in prion diseases is discussed as individuals with extra copies of the 24 nt repeat develop the familial form of Creutzfeldt–Jakob disease.

Abstract:

The root hair is a specialized cell type involved in water and nutrient uptake in plants. In legumes the root hair is also the primary site of recognition and infection by symbiotic nitrogen-fixing Rhizobium bacteria. We have studied the root hairs of Medicago truncatula, which is emerging as an increasingly important model legume for studies of symbiotic nodulation. However, only 27 genes from M. truncatula were represented in GenBank/EMBL as of October, 1997. We report here the construction of a root-hair-enriched cDNA library and single-pass sequencing of randomly selected clones. Expressed sequence tags (899 total, 603 of which have homology to known genes) were generated and made available on the Internet. We believe that the database and the associated DNA materials will provide a useful resource to the community of scientists studying the biology of roots, root tips, root hairs, and nodulation.

Abstract:

DGq is the alpha subunit of the heterotrimeric GTPase (G alpha), which couples rhodopsin to phospholipase C in Drosophila vision. We have uncovered three duplicated exons in dgq by scanning the GenBank data base for unrecognized coding sequences. These alternative exons encode sites involved in GTPase activity and G beta-binding, NorpA (phospholipase C)-binding, and rhodopsin-binding. We examined the in vivo splicing of dgq in adult flies and find that, in all but the male gonads, only two isoforms are expressed. One, dgqA, is the original visual isoform and is expressed in eyes, ocelli, brain, and male gonads. The other, dgqB, has the three novel exons and is widely expressed. Remarkably, all three nonvisual B exons are highly similar (82% identity at the amino acid level) to the Gq alpha family consensus, from Caenorhabditis elegans to human, but all three visual A exons are divergent (61% identity). Intriguingly, we have found a third isoform, dgqC, which is specifically and abundantly expressed in male gonads, and shares the divergent rhodopsin-binding exon of dgqA. We suggest that DGqC is a candidate for the light-signal transducer of a testes-autonomous photosensory clock. This proposal is supported by the finding that rhodopsin 2 and arrestin 1, two photoreceptor-cell-specific genes, are also expressed in male gonads.

Abstract:

One gene locus on chromosome I in Saccharomyces cerevisiae encodes a protein (YAB5_YEAST; accession no. P31378) with local sequence similarity to the DNA repair glycosylase endonuclease III from Escherichia coli. We have analyzed the function of this gene, now assigned NTG1 (endonuclease three-like glycosylase 1), by cloning, mutant analysis, and gene expression in E. coli. Targeted gene disruption of NTG1 produces a mutant that is sensitive to H2O2 and menadione, indicating that NTG1 is required for repair of oxidative DNA damage in vivo. Northern blot analysis and expression studies of an NTG1-lacZ gene fusion showed that NTG1 is induced by cell exposure to different DNA damaging agents, particularly menadione, and hence belongs to the DNA damage-inducible regulon in S. cerevisiae. When expressed in E. coli, the NTG1 gene product cleaves plasmid DNA damaged by osmium tetroxide, thus indicating specificity for thymine glycols in DNA, as is the case for EndoIII. However, NTG1 also releases formamidopyrimidines from DNA with high efficiency and, hence, represents a glycosylase with a novel range of substrate recognition. Sequences similar to NTG1 from other eukaryotes, including Caenorhabditis elegans, Schizosaccharomyces pombe, and mammals, have recently been entered in GenBank, suggesting the universal presence of NTG1-like genes in higher organisms. S. cerevisiae NTG1 does not have the [4Fe-4S] cluster DNA binding domain characteristic of the other members of this family.

Abstract:

Pseudomonas aeruginosa, an opportunistic human pathogen, is a major causative agent of mortality and morbidity in immunocompromised patients and those with the genetic disease cystic fibrosis. To identify new virulence genes of P. aeruginosa, a selection system was developed based on the in vivo expression technology (IVET) that was first reported in the Salmonella system. An adenine-requiring auxotrophic mutant strain of P. aeruginosa was isolated and found to be avirulent in neutropenic mice. A DNA fragment that can complement the mutant strain, containing the purEK operon that is required for de novo biosynthesis of purine, was sequenced and used in the IVET vector construction. By applying the IVET selection system to a neutropenic mouse infection model, genetic loci that are specifically induced in vivo were identified. Twenty-two such loci were partially sequenced and analyzed. One of them was a well-studied virulence factor, pyochelin receptor (FptA), that is involved in iron acquisition. Fifteen showed significant homology to reported sequences in GenBank, while the remaining six did not. One locus, designated np20, encodes an open reading frame that shares amino acid sequence homology with transcriptional regulators, especially the ferric uptake regulator (Fur) proteins of other bacteria. An insertional np20 null mutant strain of P. aeruginosa did not show a growth defect on laboratory media; however, its virulence in neutropenic mice was significantly reduced compared with that of a wild-type parent strain, demonstrating the importance of the np20 locus in bacterial virulence. The successful isolation of genetic loci that affect bacterial virulence demonstrates the utility of the IVET system in identification of new virulence genes of P. aeruginosa.

Abstract:

Nucleosomes, the basic structural elements of chromosomes, consist of 146 bp of DNA coiled around an octamer of histone proteins, and their presence can strongly influence gene expression. Considerations of the anisotropic flexibility of nucleotide triplets containing 3 cytosines or guanines suggested that a [5'(G/C)3 NN3']n motif might resist wrapping around a histone octamer. To test this, DNAs were constructed containing a 5'-CCGNN-3' pentanucleotide repeat with the Ns varied. Using in vitro nucleosome reconstitution and electron microscopy, a plasmid with 48 contiguous CCGNN repeats strongly excluded nucleosomes in the repeat region. Competitive reconstitution gel retardation experiments using DNA fragments containing 12, 24, or 48 CCGNN repeats showed that the propensity to exclude nucleosomes increased with the length of the repeat. Analysis showed that a 268-bp DNA containing a (CCGNN)48 block is 4.9 +/- 0.6-fold less efficient in nucleosome assembly than a similar length pUC19 fragment and approximately 78-fold less efficient than a similar length (CTG)n sequence, based on results from previous studies. Computer searches against the GenBank database for matches with a [(G/C)3NN]48 sequence revealed numerous examples that frequently were present in the control regions of "TATA-less" genes, including the human ETS-2 and human dihydrofolate reductase genes. In both cases the (G/C)3NN repeat, present in the promoter region, co-maps with loci previously shown to be nuclease hypersensitive sites.
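A search for tandem (G/C)3NN pentamers can be approximated with a regular expression. The Python sketch below is an illustration only and is not the search procedure the authors ran against GenBank, which the abstract does not describe; the function name `find_gc3nn_runs`, its `min_repeats` parameter, and the test sequence are hypothetical. Setting `min_repeats=48` would correspond to the [(G/C)3NN]48 query mentioned in the abstract.

```python
import re


def find_gc3nn_runs(seq: str, min_repeats: int = 3):
    """Yield (start, end, repeat_count) for tandem runs of (G/C)3NN pentamers.

    Each pentamer is three G or C bases followed by any two bases.
    Coordinates are 0-based, with an exclusive end.
    """
    pattern = re.compile(r"(?:[GC]{3}[ACGT]{2}){%d,}" % min_repeats)
    for m in pattern.finditer(seq.upper()):
        yield m.start(), m.end(), (m.end() - m.start()) // 5


# Hypothetical example: four CCGAT pentamers flanked by AT-rich sequence.
example = "AT" + "CCGAT" * 4 + "TTTT"
hits = list(find_gc3nn_runs(example))
```

Because the regular expression is anchored at the first position where a pentamer can start, it reports one reading frame per run; a fuller scan would also have to search the reverse complement strand.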

Abstract:

Regulation of gene expression by zinc is well established, especially through the metal response elements of the metallothionein genes; however, most other aspects of the functions of zinc in gene expression remain unknown. We have looked for intestinal mRNAs that are regulated by dietary zinc status. Using the reverse transcriptase-PCR method of mRNA differential display, we compared intestinal mRNA from rats that were maintained for 18 days in one of three dietary groups: zinc-deficient, zinc-adequate, and pair-fed zinc-adequate. At the end of this period, total RNA was prepared from the intestine and analyzed by mRNA differential display. Under these conditions, only differentially displayed cDNA bands that varied in the zinc-deficient group, relative to the zinc-adequate groups, were selected. Utilizing two anchored oligo-dT3' PCR primers and a total of 27 arbitrary decamers as 5' PCR primers, our results yielded 47 differentially displayed cDNA bands from intestinal RNA. Thirty were increased in zinc deficiency, and 17 were decreased. Nineteen bands were subcloned and sequenced. Eleven of these were detectable on Northern blots, of which four were confirmed as regulated. Three of these have homology to known genes: cholecystokinin, uroguanylin, and ubiquinone oxidoreductase. The fourth is a novel sequence as it has no significant homology in GenBank. The remainder of those cloned included novel sequences, as well as matches to reported expressed sequence tags, and functionally identified genes. Further characterization of the regulated sequences identified here will show whether they are primary or secondary effects of zinc deficiency.

Abstract:

A human cDNA sequence homologous to human deoxycytidine kinase (dCK; EC 2.7.1.74) was identified in the GenBank sequence data base. The longest open reading frame encoded a protein that was 48% identical to dCK at the amino acid level. The cDNA was expressed in Escherichia coli and shown to encode a protein with the same substrate specificity as described for the mitochondrial deoxyguanosine kinase (dGK; EC 2.7.1.113). The N terminus of the deduced amino acid sequence had properties characteristic of a mitochondrial translocation signal, and cleavage at a putative mitochondrial peptidase cleavage site would give a mature protein size of 28 kDa. Northern blot analysis determined the length of dGK mRNA to be 1.3 kbp, with no cross-hybridization to the 2.8-kbp dCK mRNA. dGK mRNA was detected in all tissues investigated, with the highest expression levels in muscle, brain, liver, and lymphoid tissues. Alignment of the dGK and herpes simplex virus type 1 thymidine kinase amino acid sequences showed that five regions, including the substrate-binding pocket and the ATP-binding glycine loop, were also conserved in dGK. To our knowledge, this is the first report of a cloned mitochondrial nucleoside kinase and the first demonstration of a general sequence homology between two mammalian deoxyribonucleoside kinases. Our findings suggest that dCK and dGK are evolutionarily related, as well as related to the family of herpes virus thymidine kinases.

Abstract:

Genes that are up- and down-regulated by thyroid hormone in the tail resorption program of Xenopus laevis have been isolated by a gene expression screen, sequenced, and identified in the GenBank data base. The entire program is estimated to consist of fewer than 35 up-regulated and fewer than 10 down-regulated genes; 17 and 4 of them, respectively, have been isolated and characterized. Up-regulated genes whose function can be predicted on the basis of their sequence include four transcription factors (including one of the thyroid hormone receptors), an extracellular matrix component (fibronectin) and membrane receptor (integrin), four proteinases, a deiodinase that degrades thyroid hormone, and a protein that binds the hypothalamic corticotropin-releasing factor, which has been implicated in controlling thyroid hormone synthesis in Xenopus tadpoles. All four down-regulated genes encode extracellular proteins that are expressed in tadpole epidermis. This survey of the program provides insights into the biology of metamorphosis.

Abstract:

We previously characterized a methionine aminopeptidase (EC 3.4.11.18; Met-AP1; also called peptidase M) in Saccharomyces cerevisiae, which differs from its prokaryotic homologues in that it (i) contains an N-terminal zinc-finger domain and (ii) does not produce lethality when disrupted, although it does slow growth dramatically; it is encoded by a gene called MAP1. Here we describe a second methionine aminopeptidase (Met-AP2) in S. cerevisiae, encoded by MAP2, which was cloned as a suppressor of the slow-growth phenotype of the map1 null strain. The DNA sequence of MAP2 encodes a protein of 421 amino acids that shows 22% identity with the sequence of yeast Met-AP1. Surprisingly, comparison with sequences in the GenBank data base showed that the product of MAP2 has even greater homology (55% identity) with rat p67, which was characterized as an initiation factor 2-associated protein but not yet shown to have Met-AP activity. Transformants of map1 null cells expressing MAP2 in a high-copy-number plasmid contained 3- to 12-fold increases in Met-AP activity on different peptide substrates. The epitope-tagged suppressor gene product was purified by immunoaffinity chromatography and shown to contain Met-AP activity. To evaluate the physiological significance of Met-AP2, the MAP2 gene was deleted from wild-type and map1 null yeast strains. The map2 null strain, like the map1 null strain, is viable but with a slower growth rate. The map1, map2 double-null strains are nonviable. Thus, removal of N-terminal methionine is an essential function in yeast, as in prokaryotes, but yeast require two methionine aminopeptidases to provide the essential function which can only be partially provided by Met-AP1 or Met-AP2 alone.

Abstract:

Simple sequence repeats (SSRs), consisting of tandemly repeated multiple copies of mono-, di-, tri-, or tetranucleotide motifs, are ubiquitous in eukaryotic genomes and are frequently used as genetic markers, taking advantage of their length polymorphism. We have examined the polymorphism of such sequences in the chloroplast genomes of plants, by using a PCR-based assay. GenBank searches identified the presence of several (dA)n.(dT)n mononucleotide stretches in chloroplast genomes. A chloroplast (cp) SSR was identified in three pine species (Pinus contorta, Pinus sylvestris, and Pinus thunbergii) 312 bp upstream of the psbA gene. DNA amplification of this repeated region from 11 pine species identified nine length variants. The polymorphic amplified fragments were isolated and the DNA sequence was determined, confirming that the length polymorphism was caused by variation in the length of the repeated region. In the pines, the chloroplast genome is transmitted through pollen and this PCR assay may be used to monitor gene flow in this genus. Analysis of 305 individuals from seven populations of Pinus leucodermis Ant. revealed the presence of four variants with intrapopulational diversities ranging from 0.000 to 0.629 and an average of 0.320. Restriction fragment length polymorphism analysis of cpDNA on the same populations previously failed to detect any variation. Population subdivision based on cpSSR was higher (Gst = 0.22, where Gst is coefficient of gene differentiation) than that revealed in a previous isozyme study (Gst = 0.05). We anticipate that SSR loci within the chloroplast genome should provide a highly informative assay for the analysis of the genetic structure of plant populations.
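Two steps in this abstract lend themselves to a small Python sketch: locating (dA)n·(dT)n mononucleotide stretches in a sequence, and summarizing length variants within a population as a diversity value. Both functions are illustrative assumptions rather than the authors' method. The names `find_at_runs` and `gene_diversity` are hypothetical, the 10-base threshold is arbitrary, and the abstract does not say which diversity estimator was used; the code assumes Nei's unbiased estimator, n/(n-1) × (1 − Σp²).

```python
import re
from collections import Counter


def find_at_runs(seq: str, min_len: int = 10):
    """Return (start, base, length) for each run of A or T of at least min_len."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [(m.start(), m.group()[0], len(m.group()))
            for m in pattern.finditer(seq.upper())]


def gene_diversity(variants):
    """Unbiased diversity (assumed estimator): n/(n-1) * (1 - sum(p_i^2)).

    `variants` holds one length variant (e.g. fragment size in bp) per individual.
    """
    n = len(variants)
    if n < 2:
        raise ValueError("need at least two individuals")
    freq_sq = sum((count / n) ** 2 for count in Counter(variants).values())
    return n / (n - 1) * (1 - freq_sq)


# Hypothetical inputs for illustration only.
runs = find_at_runs("GGC" + "T" * 12 + "CAAAA")
monomorphic = gene_diversity([150] * 5)
two_variants = gene_diversity([150, 150, 151, 151])
```

A population in which every individual carries the same variant gives a diversity of zero, which is consistent with the lower bound of the reported range (0.000).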