43 resultados para Information Gene
Resumo:
There is still a lack of information on the specific characteristics of DNA-binding proteins from hyperthermophiles. Here we report on the product of the gene orf56 from plasmid pRN1 of the acidophilic and thermophilic archaeon Sulfolobus islandicus. orf56 has not been characterised yet but low sequence similarily to several eubacterial plasmid-encoded genes suggests that this 6.5 kDa protein is a sequence-specific DNA-binding protein. The DNA-binding properties of ORF56, expressed in Escherichia coli, have been investigated by EMSA experiments and by fluorescence anisotropy measurements. Recombinant ORF56 binds to double-stranded DNA, specifically to an inverted repeat located within the promoter of orf56. Binding to this site could down-regulate transcription of the orf56 gene and also of the overlapping orf904 gene, encoding the putative initiator protein of plasmid replication. By gel filtration and chemical crosslinking we have shown that ORF56 is a dimeric protein. Stoichiometric fluorescence anisotropy titrations further indicate that ORF56 binds as a tetramer to the inverted repeat of its target binding site. CD spectroscopy points to a significant increase in ordered secondary structure of ORF56 upon binding DNA. ORF56 binds without apparent cooperativity to its target DNA with a dissociation constant in the nanomolar range. Quantitative analysis of binding isotherms performed at various salt concentrations and at different temperatures indicates that approximately seven ions are released upon complex formation and that complex formation is accompanied by a change in heat capacity of –6.2 kJ/mol. Furthermore, recombinant ORF56 proved to be highly thermostable and is able to bind DNA up to 85°C.
Resumo:
Here we present the successful application of the microarray technology platform to the analysis of DNA polymorphisms. Using the rice genome as a model, we demonstrate the potential of a high-throughput genome analysis method called Diversity Array Technology, DArT‘. In the format presented here the technology is assaying for the presence (or amount) of a specific DNA fragment in a representation derived from the total genomic DNA of an organism or a population of organisms. Two different approaches are presented: the first involves contrasting two representations on a single array while the second involves contrasting a representation with a reference DNA fragment common to all elements of the array. The Diversity Panels created using this method allow genetic fingerprinting of any organism or group of organisms belonging to the gene pool from which the panel was developed. Diversity Arrays enable rapid and economical application of a highly parallel, solid-state genotyping technology to any genome or complex genomic mixtures.
Resumo:
Chromosome 7q22 has been the focus of many cytogenetic and molecular studies aimed at delineating regions commonly deleted in myeloid leukemias and myelodysplastic syndromes. We have compared a gene-dense, GC-rich sub-region of 7q22 with the orthologous region on mouse chromosome 5. A physical map of 640 kb of genomic DNA from mouse chromosome 5 was derived from a series of overlapping bacterial artificial chromosomes. A 296 kb segment from the physical map, spanning Ache to Tfr2, was compared with 267 kb of human sequence. We identified a conserved linkage of 12 genes including an open reading frame flanked by Ache and Asr2, a novel cation-chloride cotransporter interacting protein Cip1, Ephb4, Zan and Perq1. While some of these genes have been previously described, in each case we present new data derived from our comparative sequence analysis. Adjacent unfinished sequence data from the mouse contains an orthologous block of 10 additional genes including three novel cDNA sequences that we subsequently mapped to human 7q22. Methods for displaying comparative genomic information, including unfinished sequence data, are becoming increasingly important. We supplement our printed comparative analysis with a new, Web-based program called Laj (local alignments with java). Laj provides interactive access to archived pairwise sequence alignments via the WWW. It displays synchronized views of a dot-plot, a percent identity plot, a nucleotide-level local alignment and a variety of relevant annotations. Our mouse–human comparison can be viewed at http://web.uvic.ca/~bioweb/laj.html. Laj is available at http://bio.cse.psu.edu/, along with online documentation and additional examples of annotated genomic regions.
Resumo:
Thousands of genes have been painstakingly identified and characterized a few genes at a time. Many thousands more are being predicted by large scale cDNA and genomic sequencing projects, with levels of evidence ranging from supporting mRNA sequence and comparative genomics to computing ab initio models. This, coupled with the burgeoning scientific literature, makes it critical to have a comprehensive directory for genes and reference sequences for key genomes. The NCBI provides two resources, LocusLink and RefSeq, to meet these needs. LocusLink organizes information around genes to generate a central hub for accessing gene-specific information for fruit fly, human, mouse, rat and zebrafish. RefSeq provides reference sequence standards for genomes, transcripts and proteins; human, mouse and rat mRNA RefSeqs, and their corresponding proteins, are discussed here. Together, RefSeq and LocusLink provide a non-redundant view of genes and other loci to support research on genes and gene families, variation, gene expression and genome annotation. Additional information about LocusLink and RefSeq is available at http://www.ncbi.nlm.nih.gov/LocusLink/.
Resumo:
Upon the completion of the Saccharomyces cerevisiae genomic sequence in 1996 [Goffeau,A. et al. (1997) Nature, 387, 5], several creative and ambitious projects have been initiated to explore the functions of gene products or gene expression on a genome-wide scale. To help researchers take advantage of these projects, the Saccharomyces Genome Database (SGD) has created two new tools, Function Junction and Expression Connection. Together, the tools form a central resource for querying multiple large-scale analysis projects for data about individual genes. Function Junction provides information from diverse projects that shed light on the role a gene product plays in the cell, while Expression Connection delivers information produced by the ever-increasing number of microarray projects. WWW access to SGD is available at genome-www.stanford.edu/Saccharomyces/.
Resumo:
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, GeneMap’99, Human–Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP), SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.
Resumo:
The TRANSFAC database on transcription factors and their DNA-binding sites and profiles (http://www.gene-regulation.de/) has been quantitatively extended and supplemented by a number of modules. These modules give information about pathologically relevant mutations in regulatory regions and transcription factor genes (PathoDB), scaffold/matrix attached regions (S/MARt DB), signal transduction (TRANSPATH) and gene expression sources (CYTOMER). Altogether, these distinct database modules constitute the TRANSFAC system. They are accompanied by a number of program routines for identifying potential transcription factor binding sites or for localizing individual components in the regulatory network of a cell.
Resumo:
While genome sequencing projects are advancing rapidly, EST sequencing and analysis remains a primary research tool for the identification and categorization of gene sequences in a wide variety of species and an important resource for annotation of genomic sequence. The TIGR Gene Indices (http://www.tigr.org/tdb/tgi.shtml) are a collection of species-specific databases that use a highly refined protocol to analyze EST sequences in an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed by first clustering, then assembling EST and annotated gene sequences from GenBank for the targeted species. This process produces a set of unique, high-fidelity virtual transcripts, or Tentative Consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, to provide links between orthologous and paralogous genes and as a resource for comparative sequence analysis.
Resumo:
The Gene Expression Database (GXD) is a community resource of gene expression information for the laboratory mouse. By combining the different types of expression data, GXD aims to provide increasingly complete information about the expression profiles of genes in different mouse strains and mutants, thus enabling valuable insights into the molecular networks that underlie normal development and disease. GXD is integrated with the Mouse Genome Database (MGD). Extensive interconnections with sequence databases and with databases from other species, and the development and use of shared controlled vocabularies extend GXD’s utility for the analysis of gene expression information. GXD is accessible through the Mouse Genome Informatics web site at http://www.informatic s.jax.org/ or directly at http://www.informatics.jax.org/me nus/expression_menu.shtml.
Resumo:
There is no control over the information provided with sequences when they are deposited in the sequence databases. Consequently mistakes can seed the incorrect annotation of other sequences. Grouping genes into families and applying controlled annotation overcomes the problems of incorrect annotation associated with individual sequences. Two databases (http://www.mendel.ac.uk) were created to apply controlled annotation to plant genes and plant ESTs: Mendel-GFDb is a database of plant protein (gene) families based on gapped-BLAST analysis of all sequences in the SWISS-PROT family of databases. Sequences are aligned (ClustalW) and identical and similar residues shaded. The families are visually curated to ensure that one or more criteria, for example overall relatedness and/or domain similarity relate all sequences within a family. Sequence families are assigned a ‘Gene Family Number’ and a unified description is developed which best describes the family and its members. If authority exists the gene family is assigned a ‘Gene Family Name’. This information is placed in Mendel-GFDb. Mendel-ESTS is primarily a database of plant ESTs, which have been compared to Mendel-GFDb, completely sequenced genomes and domain databases. This approach associated ESTs with individual sequences and the controlled annotation of gene families and protein domains; the information being placed in Mendel-ESTS. The controlled annotation applied to genes and ESTs provides a basis from which a plant transcription database can be developed.
Resumo:
We performed a genome-wide analysis of gene expression in primary human CD15+ myeloid progenitor cells. By using the serial analysis of gene expression (SAGE) technique, we obtained quantitative information for the expression of 37,519 unique SAGE-tag sequences. Of these unique tags, (i) 25% were detected at high and intermediate levels, whereas 75% were present as single copies, (ii) 53% of the tags matched known expressed sequences, 34% of which were matched to more than one known expressed sequence, and (iii) 47% of the tags had no matches and represent potentially novel genes. The correct genes were confirmed by application of the generation of longer cDNA fragments from SAGE tags for gene identification (GLGI) technique for high-copy tags with multiple matches. A set of genes known to be important in myeloid differentiation were expressed at various levels and used different spliced forms. This study provides a normal baseline for comparison of gene expression in myeloid diseases. The strategy of using SAGE and GLGI techniques in this study has broad applications to the genome-wide identification of expressed genes.
Resumo:
Defects in the XPG DNA repair endonuclease gene can result in the cancer-prone disorders xeroderma pigmentosum (XP) or the XP–Cockayne syndrome complex. While the XPG cDNA sequence was known, determination of the genomic sequence was required to understand its different functions. In cells from normal donors, we found that the genomic sequence of the human XPG gene spans 30 kb, contains 15 exons that range from 61 to 1074 bp and 14 introns that range from 250 to 5763 bp. Analysis of the splice donor and acceptor sites using an information theory-based approach revealed three splice sites with low information content, which are components of the minor (U12) spliceosome. We identified six alternatively spliced XPG mRNA isoforms in cells from normal donors and from XPG patients: partial deletion of exon 8, partial retention of intron 8, two with alternative exons (in introns 1 and 6) and two that retained complete introns (introns 3 and 9). The amount of alternatively spliced XPG mRNA isoforms varied in different tissues. Most alternative splice donor and acceptor sites had a relatively high information content, but one has the U12 spliceosome sequence. A single nucleotide polymorphism has allele frequencies of 0.74 for 3507G and 0.26 for 3507C in 91 donors. The human XPG gene contains multiple splice sites with low information content in association with multiple alternatively spliced isoforms of XPG mRNA.
Resumo:
As the number of protein folds is quite limited, a mode of analysis that will be increasingly common in the future, especially with the advent of structural genomics, is to survey and re-survey the finite parts list of folds from an expanding number of perspectives. We have developed a new resource, called PartsList, that lets one dynamically perform these comparative fold surveys. It is available on the web at http://bioinfo.mbb.yale.edu/partslist and http://www.partslist.org. The system is based on the existing fold classifications and functions as a form of companion annotation for them, providing ‘global views’ of many already completed fold surveys. The central idea in the system is that of comparison through ranking; PartsList will rank the approximately 420 folds based on more than 180 attributes. These include: (i) occurrence in a number of completely sequenced genomes (e.g. it will show the most common folds in the worm versus yeast); (ii) occurrence in the structure databank (e.g. most common folds in the PDB); (iii) both absolute and relative gene expression information (e.g. most changing folds in expression over the cell cycle); (iv) protein–protein interactions, based on experimental data in yeast and comprehensive PDB surveys (e.g. most interacting fold); (v) sensitivity to inserted transposons; (vi) the number of functions associated with the fold (e.g. most multi-functional folds); (vii) amino acid composition (e.g. most Cys-rich folds); (viii) protein motions (e.g. most mobile folds); and (ix) the level of similarity based on a comprehensive set of structural alignments (e.g. most structurally variable folds). The integration of whole-genome expression and protein–protein interaction data with structural information is a particularly novel feature of our system. We provide three ways of visualizing the rankings: a profiler emphasizing the progression of high and low ranks across many pre-selected attributes, a dynamic comparer for custom comparisons and a numerical rankings correlator. These allow one to directly compare very different attributes of a fold (e.g. expression level, genome occurrence and maximum motion) in the uniform numerical format of ranks. This uniform framework, in turn, highlights the way that the frequency of many of the attributes falls off with approximate power-law behavior (i.e. according to V–b, for attribute value V and constant exponent b), with a few folds having large values and most having small values.
Resumo:
In vivo assessment of gene expression is desirable to obtain information on the extent and duration of transduction of tissue after gene delivery. We have developed an in vivo, potentially noninvasive, method for detecting virally mediated gene transfer to the liver. The method employs an adenoviral vector carrying the gene for the brain isozyme of murine creatine kinase (CK-B), an ATP-buffering enzyme expressed mainly in muscle and brain but absent from liver, kidney, and pancreas. Gene expression was monitored by 31P magnetic resonance spectroscopy (MRS) using the product of the CK enzymatic reaction, phosphocreatine, as an indicator of transfection. The vector was administered into nude mice by tail vein injection, and exogenous creatine was administered in the drinking water and by i.p. injection of 2% creatine solution before 31P MRS examination, which was performed on surgically exposed livers. A phosphocreatine resonance was detected in livers of mice injected with the vector and was absent from livers of control animals. CK expression was confirmed in the injected animals by Western blot analysis, enzymatic assays, and immunofluorescence measurements. We conclude that the syngeneic enzyme CK can be used as a marker gene for in vivo monitoring of gene expression after virally mediated gene transfer to the liver.
Resumo:
Wild-type Chlamydomonas reinhardtii cells shifted from high concentrations (5%) of CO2 to low, ambient levels (0.03%) rapidly increase transcription of mRNAs from several CO2-responsive genes. Simultaneously, they develop a functional carbon concentrating mechanism that allows the cells to greatly increase internal levels of CO2 and HCO\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \begin{equation*}{\mathrm{_{3}^{-}}}\end{equation*}\end{document}. The cia5 mutant is defective in all of these phenotypes. A newly isolated gene, designated Cia5, restores transformed cia5 cells to the phenotype of wild-type cells. The 6,481-bp gene produces a 5.1-kb mRNA that is present constitutively in light in high and low CO2 both in wild-type cells and the cia5 mutant. It encodes a protein that has features of a putative transcription factor and that, likewise, is present constitutively in low and high CO2 conditions. Complementation of cia5 can be achieved with a truncated Cia5 gene that is missing the coding information for 54 C-terminal amino acids. Unlike wild-type cells or cia5 mutants transformed with an intact Cia5 gene, cia5 mutants complemented with the truncated gene exhibit constitutive synthesis of mRNAs from CO2-responsive genes in light under both high and low CO2 conditions. These discoveries suggest that posttranslational changes to the C-terminal domain control the ability of CIA5 to act as an inducer and directly or indirectly control transcription of CO2-responsive genes. Thus, CIA5 appears to be a master regulator of the carbon concentrating mechanism and is intimately involved in the signal transduction mechanism that senses and allows immediate responses to fluctuations in environmental CO2 and HCO\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \begin{equation*}{\mathrm{_{3}^{-}}}\end{equation*}\end{document} concentrations.