89 results for Klebsiella pneumoniae genome sequence
Abstract:
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT.
Abstract:
The Mouse Genome Database (MGD) is the community database resource for the laboratory mouse, a key model organism for interpreting the human genome and for understanding human biology and disease (http://www.informatics.jax.org). MGD provides standard nomenclature and consensus map positions for mouse genes and genetic markers; it provides a curated set of mammalian homology records, user-defined chromosomal maps, experimental data sets and the definitive mouse ‘gene to sequence’ reference set for the research community. The integration and standardization of these data sets facilitates the transition between mouse DNA sequence, gene and phenotype annotations. A recent focus on allele and phenotype representations enhances the ability of MGD to organize and present data for exploring the relationship between genotype and phenotype. This link between the genome and the biology of the mouse is especially important as phenotype information grows from large mutagenesis projects and genotype information grows from large-scale sequencing projects.
Abstract:
Since the completion of the Saccharomyces cerevisiae genomic sequence in 1996 [Goffeau,A. et al. (1997) Nature, 387, 5], several creative and ambitious projects have been initiated to explore the functions of gene products or gene expression on a genome-wide scale. To help researchers take advantage of these projects, the Saccharomyces Genome Database (SGD) has created two new tools, Function Junction and Expression Connection. Together, the tools form a central resource for querying multiple large-scale analysis projects for data about individual genes. Function Junction provides information from diverse projects that shed light on the role a gene product plays in the cell, while Expression Connection delivers information produced by the ever-increasing number of microarray projects. WWW access to SGD is available at genome-www.stanford.edu/Saccharomyces/.
Abstract:
VIDA is a new virus database that organizes open reading frames (ORFs) from partial and complete genomic sequences from animal viruses. Currently VIDA includes all sequences from GenBank for Herpesviridae, Coronaviridae and Arteriviridae. The ORFs are organized into homologous protein families, which are identified on the basis of sequence similarity relationships. Conserved sequence regions of potential functional importance are identified and can be retrieved as sequence alignments. We use a controlled taxonomical and functional classification for all the proteins and protein families in the database. When available, protein structures that are related to the families have also been included. The database is available for online search and sequence information retrieval at http://www.biochem.ucl.ac.uk/bsm/virus_database/VIDA.html.
Abstract:
Viruses with RNA genomes often capture and redirect host cell components to assist in mechanisms particular to RNA-dependent RNA synthesis. The nidoviruses are an order of positive-stranded RNA viruses, comprising coronaviruses and arteriviruses, that employ a unique strategy of discontinuous transcription, producing a series of subgenomic mRNAs linking a 5′ leader to distal portions of the genome. For the prototype coronavirus mouse hepatitis virus (MHV), heterogeneous nuclear ribonucleoprotein (hnRNP) A1 has been shown to be able to bind in vitro to the negative strand of the intergenic sequence, a cis-acting element found in the leader RNA and preceding each downstream ORF in the genome. hnRNP A1 thus has been proposed as a host factor in MHV transcription. To test this hypothesis genetically, we initially constructed MHV mutants with a very high-affinity hnRNP A1 binding site inserted in place of, or adjacent to, an intergenic sequence in the MHV genome. This inserted hnRNP A1 binding site was not able to functionally replace, or enhance transcription from, the intergenic sequence. This finding led us to test more directly the role of hnRNP A1 by analysis of MHV replication and RNA synthesis in a murine cell line that does not express this protein. The cellular absence of hnRNP A1 had no detectable effect on the production of infectious virus, the synthesis of genomic RNA, or the quantity or quality of subgenomic mRNAs. These results strongly suggest that hnRNP A1 is not a required host factor for MHV discontinuous transcription or genome replication.
Abstract:
Reovirus genome segment S1 encodes protein σ1, which is the receptor binding protein, modulates tissue tropism, and specifies the nature of the antiviral immune response. It makes up less than 2% of reovirus particles and is synthesized in very small amounts in infected cells. Any antiviral strategy aimed at reducing specifically the expression of this genome segment should, in principle, reduce the infectivity of the virus. To test this hypothesis, we have assembled two hammer-head motif-containing ribozymes (Rzs) targeted to cleave at the conserved B and C domains of the reovirus s1 RNA. Protein-independent but Mg2+-dependent sequence-specific cleavage of s1 RNA was achieved by both the Rzs in trans. Cells that transiently express these Rzs, when challenged with reovirus, were protected against the cytopathic effects caused by the virus. This protection correlated with the specific intracellular reduction of s1 transcripts that was due to their cleavage by the Rzs. Rz-treated cells that were challenged with reovirus showed almost complete disappearance of protein σ1 without significantly altering the levels of the other reovirus structural proteins. Thus, Rzs, besides acting as antiviral agents, could be exploited as biological tools to delineate specific functions of target genes.
Abstract:
The release of vast quantities of DNA sequence data by large-scale genome and expressed sequence tag (EST) projects underlines the necessity for the development of efficient and inexpensive ways to link sequence databases with temporal and spatial expression profiles. Here we demonstrate the power of linking cDNA sequence data (including EST sequences) with transcript profiles revealed by cDNA-AFLP, a highly reproducible differential display method based on restriction enzyme digests and selective amplification under high stringency conditions. We have developed a computer program (GenEST) that predicts the sizes of virtual transcript-derived fragments (TDFs) of in silico-digested cDNA sequences retrieved from databases. The vast majority of the resulting virtual TDFs could be traced back among the thousands of TDFs displayed on cDNA-AFLP gels. Sequencing of the corresponding bands excised from cDNA-AFLP gels revealed no inconsistencies. As a consequence, cDNA sequence databases can be screened very efficiently to identify genes with relevant expression profiles. The other way round, it is possible to switch from cDNA-AFLP gels to sequences in the databases. Using the restriction enzyme recognition sites, the primer extensions and the estimated TDF size as identifiers, the DNA sequence(s) corresponding to a TDF with an interesting expression pattern can be identified. In this paper we show examples in both directions by analyzing the plant parasitic nematode Globodera rostochiensis. Various novel pathogenicity factors were identified by combining ESTs from the infective stage juveniles with expression profiles of ∼4000 genes in five developmental stages produced by cDNA-AFLP.
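The in silico digestion step described in this abstract can be illustrated with a minimal sketch. This is not the GenEST program itself: the enzyme pair (EcoRI and MseI), the function names, and the choice to ignore adapter and primer lengths are all assumptions made for illustration. The sketch finds every cut site for two restriction enzymes in a cDNA sequence and reports the fragments flanked by one site of each enzyme, which is roughly the class of fragments a two-enzyme cDNA-AFLP template would contain.

```python
import re

# Recognition sites and top-strand cut offsets within each site.
# EcoRI (G^AATTC) and MseI (T^TAA) are an assumed, illustrative enzyme pair.
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}


def cut_positions(seq, site, offset):
    """Return the cut coordinate for every occurrence of `site`, overlaps included."""
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def virtual_tdfs(seq, enzymes=ENZYMES):
    """Return (start, end, length) for fragments flanked by two different enzymes.

    Hypothetical helper: adapter and selective-primer lengths are not added,
    so the reported length is the bare insert size only.
    """
    seq = seq.upper()
    cuts = sorted(
        (pos, name)
        for name, (site, offset) in enzymes.items()
        for pos in cut_positions(seq, site, offset)
    )
    fragments = []
    # Walk adjacent cut sites; keep fragments whose two ends come from different enzymes.
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right:
            fragments.append((start, end, end - start))
    return fragments


if __name__ == "__main__":
    example = "CC" + "GAATTC" + "GCGCGCGCGC" + "TTAA" + "GG"
    for start, end, length in virtual_tdfs(example):
        print(start, end, length)
```

A real pipeline would also need to add adapter lengths to each predicted size and filter fragments by the selective nucleotides on the primers before comparing predictions with bands on a gel.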
Abstract:
Progress in agricultural and environmental technologies is hampered by a slower rate of gene discovery in plants than in animals. The vast pool of genes in plants, however, will be an important resource for insertion of genes, via biotechnological procedures, into an array of plants, generating unique germ plasms not achievable by conventional breeding. It has recently become clear that genomes of grasses have evolved in a manner analogous to Lego blocks. Large chromosome segments have been reshuffled and stuffer pieces added between genes. Although some genomes have become very large, the genome with the fewest stuffer pieces, the rice genome, is the Rosetta Stone of all the bigger grass genomes. This means that sequencing the rice genome as the anchor genome of the grasses will provide instantaneous access to the same genes in the same relative physical position in other grasses (e.g., corn and wheat), without the need to sequence each of these genomes independently. (i) The sequencing of the entire genome of rice as the anchor genome for the grasses will accelerate plant gene discovery in many important crops (e.g., corn, wheat, and rice) by several orders of magnitude and reduce research and development costs for government and industry at a faster pace. (ii) Costs for sequencing entire genomes have come down significantly. The rice genome is only 12% of the size of the human or the corn genome, and technology improvements by the human genome project are completely transferable, translating into another 50% reduction of the costs. (iii) The physical mapping of the rice genome by a group of Japanese researchers provides a jump start for sequencing the genome and forming an international consortium. Otherwise, other countries would do it alone and own proprietary positions.
Abstract:
Since 1991, the Rice Genome Research Program in Japan has carried out rice genomics, such as large-scale cDNA analysis, construction of a fine-scale restriction fragment length polymorphism map, and physical mapping of the rice genome with yeast artificial chromosome clones. These studies have made a great impact on research into grass genomes and made rice a model plant for other cereal crop research. Starting in 1998, the Rice Genome Research Program will step into a new stage of genomics—that of genome sequencing. This project eventually should reveal all of the genomic sequence information in the rice plant and be an indispensable aid in understanding the genomics of other grass species.
Abstract:
The determination of complete genome sequences provides us with an opportunity to describe and analyze evolution at the comprehensive level of genomes. Here we compare nine genomes with respect to their protein coding genes at two levels: (i) we compare genomes as “bags of genes” and measure the fraction of orthologs shared between genomes and (ii) we quantify correlations between genes with respect to their relative positions in genomes. Distances between the genomes are related to their divergence times, measured as the number of amino acid substitutions per site in a set of 34 orthologous genes that are shared among all the genomes compared. We establish a hierarchy of rates at which genomes have changed during evolution. Protein sequence identity is the most conserved, followed by the complement of genes within the genome. Next is the degree of conservation of the order of genes, whereas gene regulation appears to evolve at the highest rate. Finally, we show that some genomes are more highly organized than others: they show a higher degree of the clustering of genes that have orthologs in other genomes.
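The "bags of genes" comparison in this abstract can be sketched as a set operation. The abstract does not state how the shared fraction is normalized, so the version below assumes division by the gene count of the smaller genome; the function name and the ortholog-group identifiers are hypothetical.

```python
def shared_ortholog_fraction(genome_a, genome_b):
    """Fraction of ortholog groups shared by two genomes.

    Each genome is an iterable of ortholog-group identifiers. Normalizing by
    the smaller genome is an assumption, not something the abstract specifies.
    """
    a, b = set(genome_a), set(genome_b)
    smaller = min(len(a), len(b))
    if smaller == 0:
        return 0.0  # an empty genome shares nothing by definition here
    return len(a & b) / smaller


if __name__ == "__main__":
    genome_x = {"og1", "og2", "og3", "og4"}
    genome_y = {"og2", "og3", "og5"}
    print(shared_ortholog_fraction(genome_x, genome_y))
```

Normalizing by the smaller genome keeps a small genome that is a subset of a large one from looking distant only because of the size difference; other normalizations, such as the union of both gene sets, would give different values.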
Abstract:
A whole genome cattle-hamster radiation hybrid cell panel was used to construct a map of 54 markers located on bovine chromosome 5 (BTA5). Of the 54 markers, 34 are microsatellites selected from the cattle linkage map and 20 are genes. Among the 20 mapped genes, 10 are new assignments that were made by using the comparative mapping by annotation and sequence similarity strategy. A LOD-3 radiation hybrid framework map consisting of 21 markers was constructed. The relatively low retention frequency of markers on this chromosome (19%) prevented unambiguous ordering of the other 33 markers. The length of the map is 398.7 cR, corresponding to a ratio of ≈2.8 cR5,000/cM. Type I genes were binned for comparison of gene order among cattle, humans, and mice. Multiple internal rearrangements within conserved syntenic groups were apparent upon comparison of gene order on BTA5 and HSA12 and HSA22. A similarly high number of rearrangements were observed between BTA5 and MMU6, MMU10, and MMU15. The detailed comparative map of BTA5 should facilitate identification of genes affecting economically important traits that have been mapped to this chromosome and should contribute to our understanding of mammalian chromosome evolution.
Abstract:
Following transcription and splicing, each mRNA of a mammalian cell passes into the cytoplasm where its fate is in the hands of a complex network of ribonucleoproteins (mRNPs). The success or failure of a gene to be expressed depends on the performance of this mRNP infrastructure. The entry, gating, processing, and transit of each mRNA through an mRNP network help determine the composition of a cell's proteome. The machinery that regulates storage, turnover, and translational activation of mRNAs is not well understood, in part because of the heterogeneous nature of mRNPs. Recently, subsets of cellular mRNAs clustered as members of mRNP complexes have been identified by using antibodies reactive with RNA-binding proteins, including ELAV/Hu, eIF-4E, and poly(A)-binding proteins. Cytoplasmic ELAV/Hu proteins are involved in the stability and translation of early response gene (ERG) transcripts and are expressed predominantly in neurons. mRNAs recovered from ELAV/Hu mRNP complexes were found to have similar sequence elements, suggesting a common structural linkage among them. This approach opens the possibility of identifying transcripts physically clustered in vivo that may have similar fates or functions. Moreover, the proteins encoded by physically organized mRNAs may participate in the same biological process or structural outcome, much as operons and their polycistronic mRNAs do in prokaryotic organisms. Our goal is to understand the organization and flow of genetic information on an integrative systems level by analyzing the collective properties of proteins and mRNAs associated with mRNPs in vivo.
Abstract:
The genome of the crenarchaeon Sulfolobus solfataricus P2 contains 2,992,245 bp on a single chromosome and encodes 2,977 proteins and many RNAs. One-third of the encoded proteins have no detectable homologs in other sequenced genomes. Moreover, 40% appear to be archaeal-specific, and only 12% and 2.3% are shared exclusively with bacteria and eukarya, respectively. The genome shows a high level of plasticity with 200 diverse insertion sequence elements, many putative nonautonomous mobile elements, and evidence of integrase-mediated insertion events. There are also long clusters of regularly spaced tandem repeats. Different transfer systems are used for the uptake of inorganic and organic solutes, and a wealth of intracellular and extracellular proteases, sugar, and sulfur metabolizing enzymes are encoded, as well as enzymes of the central metabolic pathways and motility proteins. The major metabolic electron carrier is not NADH as in bacteria and eukarya but probably ferredoxin. The essential components required for DNA replication, DNA repair and recombination, the cell cycle, transcriptional initiation and translation, but not DNA folding, show a strong eukaryal character with many archaeal-specific features. The results illustrate major differences between crenarchaea and euryarchaea, especially for their DNA replication mechanism and cell cycle processes and their translational apparatus.
Abstract:
The psbA2 gene of a unicellular cyanobacterium, Microcystis aeruginosa K-81, encodes a D1 protein homolog in the reaction center of photosynthetic Photosystem II. The expression of the psbA2 transcript has been shown to be light-dependent as assessed under light and dark (12/12 h) cycling conditions. We aligned the 5′-untranslated leader regions (UTRs) of psbAs from different photosynthetic organisms and identified a conserved sequence, UAAAUAAA or the ‘AU-box’, just upstream of the SD sequences. To clarify the role of 5′-upstream cis-elements containing the AU-box for light-dependent expression of psbA2, a series of deletion and point mutations in the region were introduced into the genome of the heterologous cyanobacterium Synechococcus sp. strain PCC 7942, and psbA2 expression was examined. A clear pattern of light-dependent expression was observed in recombinant cyanobacteria carrying the K-81 psbA2 –38/+36 region (which includes the minimal promoter element and a light-dependent cis-element with the AU-box), +1 indicating the transcription start site. A constitutive pattern of expression, in which the transcripts remained almost stable under dark conditions, was obtained in cells harboring the –38/+14 region (the minimal element), indicating that the +14/+36 region with the AU-box is important for the observed light-dependent expression. Point mutation analyses within the AU-box also revealed that changes in number, direction and identity (as assayed by adenine/uridine nucleotide substitutions) influenced the light-dependent pattern of expression. The level of psbA2 transcripts increased markedly in CG- or deletion-box mutants in the dark, strongly indicating that the AU- (AT-) box acts as a negative cis-element. Furthermore, characterization of transcript accumulation in cells treated with rifampicin suggests that psbA2 5′-mRNA is unstable in the dark, supporting the view that the light-dependent expression is controlled at the post-transcriptional level. We discuss various mechanisms that may lead to altered mRNA stability such as the binding of factor(s) or ribosomes to the 5′-UTR and possible roles of the AU-box motif and the SD sequence.
Abstract:
We have analyzed the developmental molecular programs of the mouse hippocampus, a cortical structure critical for learning and memory, by means of large-scale DNA microarray techniques. Of 11,000 genes and expressed sequence tags examined, 1,926 showed dynamic changes during hippocampal development from embryonic day 16 to postnatal day 30. Gene-cluster analysis was used to group these genes into 16 distinct clusters with striking patterns that appear to correlate with major developmental hallmarks and cellular events. These include genes involved in neuronal proliferation, differentiation, and synapse formation. A complete list of the transcriptional changes has been compiled into a comprehensive gene profile database (http://BrainGenomics.Princeton.edu), which should prove valuable in advancing our understanding of the molecular and genetic programs underlying both the development and the functions of the mammalian brain.