965 results for Genes Regulatory Sequences
Abstract:
While genome sequencing projects are advancing rapidly, EST sequencing and analysis remains a primary research tool for the identification and categorization of gene sequences in a wide variety of species and an important resource for annotation of genomic sequence. The TIGR Gene Indices (http://www.tigr.org/tdb/tgi.shtml) are a collection of species-specific databases that use a highly refined protocol to analyze EST sequences in an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed by first clustering, then assembling EST and annotated gene sequences from GenBank for the targeted species. This process produces a set of unique, high-fidelity virtual transcripts, or Tentative Consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, to provide links between orthologous and paralogous genes and as a resource for comparative sequence analysis.
Abstract:
The AsMamDB database aims to facilitate the systematic study of alternatively spliced genes of mammals. Version 1.0 of AsMamDB contains 1563 alternatively spliced genes of human, mouse and rat, each associated with a cluster of nucleotide sequences. The main information provided by AsMamDB includes gene alternative splicing patterns, gene structures, chromosomal locations, gene products and the tissues in which the genes are expressed. Alternative splicing patterns are represented by multiple alignments of various gene transcripts and by graphs of their topological structures. Gene structures are illustrated by the distributions of exons, introns and various regulatory elements. There are 4204 DNAs, 3977 mRNAs, 8989 CDSs and 126,931 ESTs in the current database. More than 130,000 GenBank entries are covered and 4443 MEDLINE records are linked. DNA, mRNA, exon, intron and relevant regulatory element sequences are provided in FASTA format. More information can be obtained by using the web-based multiple alignment tool Asalign and various category lists. AsMamDB can be accessed at http://166.111.30.65/ASMAMDB.html.
Abstract:
rSNP_Guide is a novel curated database system for analysis of transcription factor (TF) binding to target sequences in regulatory gene regions altered by mutations. It accumulates experimental data on naturally occurring site variants in regulatory gene regions and site-directed mutations. This database system also contains the web tools for SNP analysis, i.e., active applet applying weight matrices to predict the regulatory site candidates altered by a mutation. The current version of the rSNP_Guide is supplemented by six sub-databases: (i) rSNP_DB, on DNA–protein interaction caused by mutation; (ii) SYSTEM, on experimental systems; (iii) rSNP_BIB, on citations to original publications; (iv) SAMPLES, on experimentally identified sequences of known regulatory sites; (v) MATRIX, on weight matrices of known TF sites; (vi) rSNP_Report, on characteristic examples of successful rSNP_Tools implementation. These databases are useful for the analysis of natural SNPs and site-directed mutations. The databases are available through the Web, http://wwwmgs.bionet.nsc.ru/mgs/systems/rsnp/.
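As a rough illustration of how a weight matrix can flag a regulatory site altered by a mutation, the sketch below scores a reference and a variant sequence against a log-odds matrix and reports the change in the best match. The count matrix, the two sequences and the function names are hypothetical and chosen only for illustration; this is not the rSNP_Guide applet or its MATRIX data.

```python
import math

# Hypothetical 4-position count matrix for an illustrative TF site
# (assumption: toy values, not taken from rSNP_Guide).
COUNTS = {
    "A": [8, 1, 0, 9],
    "C": [1, 0, 0, 0],
    "G": [0, 9, 10, 0],
    "T": [1, 0, 0, 1],
}

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    length = len(next(iter(counts.values())))
    matrix = []
    for i in range(length):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return matrix

def best_score(seq, matrix):
    """Return the highest matrix score over all windows of the sequence."""
    width = len(matrix)
    return max(
        sum(matrix[j][seq[i + j]] for j in range(width))
        for i in range(len(seq) - width + 1)
    )

PWM = log_odds_matrix(COUNTS)
ref = "TTAGGACC"   # hypothetical reference allele
alt = "TTAGTACC"   # hypothetical variant (G>T inside the site)
delta = best_score(alt, PWM) - best_score(ref, PWM)
print(f"score change caused by the variant: {delta:.2f}")
```

A clearly negative change suggests the variant weakens the predicted site; a real analysis would also need a score threshold and both DNA strands.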
Abstract:
High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. Here, we briefly describe three newly developed resources that should make discovery of interesting genes in these sequence classes easier in the future, especially to biologists not having access to a powerful local bioinformatics environment. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. The three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).
Abstract:
The extremely halophilic archaeon Halobacterium sp. NRC-1 can grow phototrophically by means of light-driven proton pumping by bacteriorhodopsin in the purple membrane. Here, we show by genetic analysis of the wild type, and insertion and double-frame shift mutants of Bat that this transcriptional regulator coordinates synthesis of a structural protein and a chromophore for purple membrane biogenesis in response to both light and oxygen. Analysis of the complete Halobacterium sp. NRC-1 genome sequence showed that the regulatory site, upstream activator sequence (UAS), the putative binding site for Bat upstream of the bacterio-opsin gene (bop), is also present upstream to the other Bat-regulated genes. The transcription regulator Bat contains a photoresponsive cGMP-binding (GAF) domain, and a bacterial AraC type helix–turn–helix DNA binding motif. We also provide evidence for involvement of the PAS/PAC domain of Bat in redox-sensing activity by genetic analysis of a purple membrane overproducer. Five additional Bat-like putative regulatory genes were found, which together are likely to be responsible for orchestrating the complex response of this archaeon to light and oxygen. Similarities of the bop-like UAS and transcription factors in diverse organisms, including a plant and a γ-proteobacterium, suggest an ancient origin for this regulon capable of coordinating light and oxygen responses in the three major branches of the evolutionary tree of life. Finally, sensitivity of four of five regulon genes to DNA supercoiling is demonstrated and correlated to presence of alternating purine–pyrimidine sequences (RY boxes) near the regulated promoters.
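The alternating purine–pyrimidine stretches (RY boxes) mentioned above can be located with a simple scan. The following is a minimal sketch, assuming an input containing only A, C, G and T and an arbitrary minimum length of six bases; the function name and the threshold are illustrative choices, not the criteria used in the study.

```python
PURINES = set("AG")

def ry_boxes(seq, min_len=6):
    """Return (start, end) half-open spans where purines and pyrimidines strictly alternate.

    Assumption: seq contains only A, C, G and T; any other character is
    treated as a pyrimidine by this simple test.
    """
    seq = seq.upper()
    spans = []
    start = 0
    for i in range(1, len(seq) + 1):
        # A run ends at the sequence end or where two neighbours share a class.
        if i == len(seq) or (seq[i] in PURINES) == (seq[i - 1] in PURINES):
            if i - start >= min_len:
                spans.append((start, i))
            start = i
    return spans

# Hypothetical promoter-proximal fragment used only to exercise the function.
print(ry_boxes("AAACGCGCGTTT"))
```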
Abstract:
Gene expression profiling provides powerful analyses of transcriptional responses to cellular perturbation. In contrast to DNA array-based methods, reporter gene technology has been underused for this application. Here we describe a genomewide, genome-registered collection of Escherichia coli bioluminescent reporter gene fusions. DNA sequences from plasmid-borne, random fusions of E. coli chromosomal DNA to a Photorhabdus luminescens luxCDABE reporter allowed precise mapping of each fusion. The utility of this collection covering about 30% of the transcriptional units was tested by analyzing individual fusions representative of heat shock, SOS, OxyR, SoxRS, and cya/crp stress-responsive regulons. Each fusion strain responded as anticipated to environmental conditions known to activate the corresponding regulatory circuit. Thus, the collection mirrors E. coli's transcriptional wiring diagram. This genomewide collection of gene fusions provides an independent test of results from other gene expression analyses. Accordingly, a DNA microarray-based analysis of mitomycin C-treated E. coli indicated elevated expression of expected and unanticipated genes. Selected luxCDABE fusions corresponding to these up-regulated genes were used to confirm or contradict the DNA microarray results. The power of partnering gene fusion and DNA microarray technology to discover promoters and define operons was demonstrated when data from both suggested that a cluster of 20 genes encoding production of type I extracellular polysaccharide in E. coli form a single operon.
Abstract:
The influenza A virus pandemic of 1918–1919 resulted in an estimated 20–40 million deaths worldwide. The hemagglutinin and neuraminidase sequences of the 1918 virus were previously determined. We here report the sequence of the A/Brevig Mission/1/18 (H1N1) virus nonstructural (NS) segment encoding two proteins, NS1 and nuclear export protein. Phylogenetically, these genes appear to be close to the common ancestor of subsequent human and classical swine strain NS genes. Recently, the influenza A virus NS1 protein was shown to be a type I IFN antagonist that plays an important role in viral pathogenesis. By using the recently developed technique of generating influenza A viruses entirely from cloned cDNAs, the hypothesis that the 1918 virus NS1 gene played a role in virulence was tested in a mouse model. In a BSL3+ laboratory, viruses were generated that possessed either the 1918 NS1 gene alone or the entire 1918 NS segment in a background of influenza A/WSN/33 (H1N1), a mouse-adapted virus derived from a human influenza strain first isolated in 1933. These 1918 NS viruses replicated well in tissue culture but were attenuated in mice as compared with the isogenic control viruses. This attenuation in mice may be related to the human origin of the 1918 NS1 gene. These results suggest that interaction of the NS1 protein with host-cell factors plays a significant role in viral pathogenesis.
Abstract:
An emerging theme in transforming growth factor-β (TGF-β) signalling is the association of the Smad proteins with diverse groups of transcriptional regulatory proteins. Several Smad cofactors have been identified to date but the diversity of TGF-β effects on gene transcription suggests that interactions with other co-regulators must occur. In these studies we addressed the possible interaction of Smad proteins with the myocyte enhancer-binding factor 2 (MEF2) transcriptional regulators. Our studies indicate that Smad2 and 4 (Smad2/4) complexes cooperate with MEF2 regulatory proteins in a GAL4-based one-hybrid reporter gene assay. We have also observed in vivo interactions between Smad2 and MEF2A using co-immunoprecipitation assays. This interaction is confirmed by glutathione S-transferase pull-down analysis. Immunofluorescence studies in C2C12 myotubes show that Smad2 and MEF2A co-localise in the nucleus of multinuclear myotubes during differentiation. Interestingly, phospho-acceptor site mutations of MEF2 that render it unresponsive to p38 MAP kinase signalling abrogate the cooperativity with the Smads suggesting that p38 MAP Kinase-catalysed phosphorylation of MEF2 is a prerequisite for the Smad–MEF2 interaction. Thus, the association between Smad2 and MEF2A may subserve a physical link between TGF-β signalling and a diverse array of genes controlled by the MEF2 cis element.
Abstract:
Bacterial tmRNA mediates a trans-translation reaction, which permits the recycling of stalled ribosomes and probably also contributes to the regulated expression of a subset of genes. Its action results in the addition of a small number of C-terminal amino acids to protein whose synthesis had stalled and these constitute a proteolytic recognition tag for the degradation of these incompletely synthesized proteins. Previous work has identified pseudoknots and stem–loops that are widely conserved in divergent bacteria. In the present work an alignment of tmRNA gene sequences within 13 β-proteobacteria reveals an additional sub-structure specific for this bacterial group. This sub-structure is in pseudoknot Pk2, and consists of one to two additional stem–loop(s) capped by stable GNRA tetraloop(s). Three-dimensional models of tmRNA pseudoknot 2 (Pk2) containing various topological versions of the additional sub-structure suggest that the sub-structures likely point away from the core of the RNA, containing both the tRNA and the mRNA domains. A putative tertiary interaction has also been identified.
Abstract:
Typical general transcription factors, such as TATA binding protein and TFIIB, have not yet been identified in any member of the Trypanosomatidae family of parasitic protozoa. Interestingly, mRNA coding genes do not appear to have discrete transcriptional start sites, although in most cases they require an RNA polymerase that has the biochemical properties of eukaryotic RNA polymerase II. A discrete transcription initiation site may not be necessary for mRNA synthesis since the sequences upstream of each transcribed coding region are trimmed from the nascent transcript when a short m7G-capped RNA is added during mRNA maturation. This short 39 nt m7G-capped RNA, the spliced leader (SL) sequence, is expressed as an ∼100 nt long RNA from a set of reiterated, though independently transcribed, genes in the trypanosome genome. Punctuation of the 5′ end of mRNAs by an m7G cap-containing spliced leader is a developing theme in the lower eukaryotic world; organisms as diverse as Euglena and nematode worms, including Caenorhabditis elegans, utilize SL RNA in their mRNA maturation programs. Towards understanding the coordination of SL RNA and mRNA expression in trypanosomes, we have begun by characterizing SL RNA gene expression in the model trypanosome Leptomonas seymouri. Using a homologous in vitro transcription system, we demonstrate in this study that the SL RNA is transcribed by RNA polymerase II. During SL RNA transcription, accurate initiation is determined by an initiator element with a loose consensus of CYAC/AYR(+1). This element, as well as two additional basal promoter elements, is divergent in sequence from the basal transcription elements seen in other eukaryotic gene promoters. We show here that the in vitro transcription extract contains a binding activity that is specific for the initiator element and thus may participate in recruiting RNA polymerase II to the SL RNA gene promoter.
Abstract:
Of the rules used by the splicing machinery to precisely determine intron–exon boundaries only a fraction is known. Recent evidence suggests that specific short sequences within exons help in defining these boundaries. Such sequences are known as exonic splicing enhancers (ESE). A possible bioinformatical approach to studying ESE sequences is to compare genes that harbor introns with genes that do not. For this purpose two non-redundant samples of 719 intron-containing and 63 intron-lacking human genes were created. We performed a statistical analysis on these datasets of intron-containing and intron-lacking human coding sequences and found a statistically significant difference (P = 0.01) between these samples in terms of 5–6mer oligonucleotide distributions. The difference is not created by a few strong signals present in the majority of exons, but rather by the accumulation of multiple weak signals through small variations in codon frequencies, codon biases and context-dependent codon biases between the samples. A list of putative novel human splicing regulation sequences has been elucidated by our analysis.
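The comparison of 5–6mer oligonucleotide distributions between two sets of coding sequences can be sketched as follows. This is only an illustration under stated assumptions: the toy sequences, the pseudocount and the log-ratio ranking are hypothetical stand-ins, and the code does not reproduce the statistical test or the datasets used in the study.

```python
from collections import Counter
import math

def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts

def log_ratio(counts_a, counts_b, pseudocount=0.5):
    """Return log2 of the relative k-mer frequency in set A over set B.

    The pseudocount (an arbitrary illustrative value) keeps k-mers that
    are absent from one set from producing a division by zero.
    """
    kmers = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + pseudocount * len(kmers)
    total_b = sum(counts_b.values()) + pseudocount * len(kmers)
    return {
        kmer: math.log2(((counts_a[kmer] + pseudocount) / total_a)
                        / ((counts_b[kmer] + pseudocount) / total_b))
        for kmer in kmers
    }

# Hypothetical toy inputs standing in for the two gene samples.
intron_containing = ["GAAGAAGAAG", "ATGGAAGAAC"]
intron_lacking = ["ATGCCCGGGT", "ATGCCCTTTA"]

ratios = log_ratio(kmer_counts(intron_containing, 5),
                   kmer_counts(intron_lacking, 5))
# List the k-mers most enriched in the intron-containing toy set.
for kmer, value in sorted(ratios.items(), key=lambda kv: -kv[1])[:3]:
    print(kmer, round(value, 2))
```

Ranking by log ratio only highlights candidate oligonucleotides; deciding whether the two distributions differ significantly would require a proper test on non-redundant samples.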
Abstract:
We present here the description of genes coding for molluscan hemocyanins. Two distantly related mollusks, Haliotis tuberculata and Octopus dofleini, were studied. The typical architecture of a molluscan hemocyanin subunit, which is a string of seven or eight globular functional units (FUs, designated a to h, about 50 kDa each), is reflected by the gene organization: a series of eight structurally related coding regions in Haliotis, corresponding to FU-a to FU-h, with seven highly variable linker introns of 174 to 3,198 bp length (all in phase 1). In Octopus seven coding regions (FU-a to FU-g) are found, separated by phase 1 introns varying in length from 100 bp to 910 bp. Both genes exhibit typical signal (export) sequences, and in both cases these are interrupted by an additional intron. Each gene also contains an intron between signal peptide and FU-a and in the 3′ untranslated region. Of special relevance for evolutionary considerations are introns interrupting those regions that encode a discrete functional unit. We found that five of the eight FUs in Haliotis each are encoded by a single exon, whereas FU-f, FU-g, and FU-a are encoded by two, three and four exons, respectively. Similarly, in Octopus four of the FUs each correspond to an uninterrupted exon, whereas FU-b, FU-e, and FU-f each contain a single intron. Although the positioning of the introns between FUs is highly conserved in the two mollusks, the introns within FUs show no relationship in either location or phase. It is proposed that the introns between FUs were generated as the eight-unit polypeptide evolved from a monomeric precursor, and that the internal introns were added later. A hypothesis for evolution of the ring-like quaternary structure of molluscan hemocyanins is presented.
Abstract:
We cloned a cDNA for a gibberellin-induced ribonuclease (RNase) expressed in barley (Hordeum vulgare) aleurone and the gene for a second barley RNase expressed in leaf tissue. The protein encoded by the cDNA is unique among RNases described to date in that it contains a novel 23-amino acid insert between the C2 and C3 conserved sequences. Expression of the recombinant protein in tobacco (Nicotiana tabacum) suspension-cultured protoplasts gave an active RNase of the expected size, confirming the enzymatic activity of the protein. Analyses of hormone regulation of expression of mRNA for the aleurone RNase revealed that, like the pattern for α-amylase, mRNA levels increased in the presence of gibberellic acid, and its antagonist abscisic acid prevented this effect. Quantitative studies at early times demonstrated that cycloheximide treatment of aleurone layers increased mRNA levels 4-fold, whereas a combination of gibberellin plus cycloheximide treatment was required to increase α-amylase mRNA levels to the same extent. These results are consistent with loss of repression as an initial effect of gibberellic acid on transcription of those genes, although the regulatory pathways for the two genes may differ.
Abstract:
Sinapic acid is an intermediate in syringyl lignin biosynthesis in angiosperms, and in some taxa serves as a precursor for soluble secondary metabolites. The biosynthesis and accumulation of the sinapate esters sinapoylglucose, sinapoylmalate, and sinapoylcholine are developmentally regulated in Arabidopsis and other members of the Brassicaceae. The FAH1 locus of Arabidopsis encodes the enzyme ferulate-5-hydroxylase (F5H), which catalyzes the rate-limiting step in syringyl lignin biosynthesis and is required for the production of sinapate esters. Here we show that F5H expression parallels sinapate ester accumulation in developing siliques and seedlings, but is not rate limiting for their biosynthesis. RNA gel-blot analysis indicated that the tissue-specific and developmentally regulated expression of F5H mRNA is distinct from that of other phenylpropanoid genes. Efforts to identify constructs capable of complementing the sinapate ester-deficient phenotype of fah1 mutants demonstrated that F5H expression in leaves is dependent on sequences 3′ of the F5H coding region. In contrast, the positive regulatory function of the downstream region is not required for F5H transcript or sinapoylcholine accumulation in embryos.
Abstract:
Aquatic photosynthetic organisms, including the green alga Chlamydomonas reinhardtii, induce a set of genes for a carbon-concentrating mechanism (CCM) to acclimate to CO2-limiting conditions. This acclimation is modulated by some mechanisms in the cell to sense CO2 availability. Previously, a high-CO2-requiring mutant C16 defective in an induction of the CCM was isolated from C. reinhardtii by gene tagging. By using this pleiotropic mutant, we isolated a nuclear regulatory gene, Ccm1, encoding a 699-aa hydrophilic protein with a putative zinc-finger motif in its N-terminal region and a Gln repeat characteristic of transcriptional activators. Introduction of Ccm1 into this mutant restored an active carbon transport through the CCM, development of a pyrenoid structure in the chloroplast, and induction of a set of CCM-related genes. That a 5,128-base Ccm1 transcript and also the translation product of 76 kDa were detected in both high- and low-CO2 conditions suggests that CCM1 might be modified posttranslationally. These data indicate that Ccm1 is essential to control the induction of CCM by sensing CO2 availability in Chlamydomonas cells. In addition, complementation assay and identification of the mutation site of another pleiotropic mutant, cia5, revealed that His-54 within the putative zinc-finger motif of the CCM1 is crucial to its regulatory function.