952 results for Noncoding Sequences
Abstract:
Genomic sequence comparison across species has enabled the elucidation of important coding and regulatory sequences encoded within DNA. Of particular interest are the noncoding regulatory sequences, which influence gene transcriptional and posttranscriptional processes. A phylogenetic footprinting strategy was employed to identify noncoding conservation patterns of 39 human and bovine orthologous genes. Seventy-three conserved noncoding sequences were identified that shared greater than 70% identity over at least 100 bp. Thirteen of these conserved sequences were also identified in the mouse genome. Evolutionary conservation of noncoding sequences across diverse species may have functional significance, and these conserved sequences may be good candidates for regulatory elements.
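The conservation criterion in this abstract (greater than 70% identity over at least 100 bp) can be illustrated with a sliding-window scan over a pairwise alignment. This is a minimal sketch, not the authors' pipeline: the function name `conserved_windows`, the treatment of gap columns as mismatches, and the merging of overlapping windows are all assumptions made for illustration.

```python
def conserved_windows(aln_a, aln_b, window=100, min_identity=0.70):
    """Return merged (start, end) alignment-column spans covered by windows
    whose identity is strictly greater than min_identity.

    aln_a and aln_b are two rows of a pairwise alignment (equal length,
    '-' for gaps). Gap columns count as mismatches (an assumption).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    a, b = aln_a.upper(), aln_b.upper()
    n = len(a)
    if n < window:
        return []

    # 1 where both rows carry the same base, 0 otherwise (including gaps).
    matches = [1 if x == y and x != "-" else 0 for x, y in zip(a, b)]

    spans = []
    count = sum(matches[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            count += matches[start + window - 1] - matches[start - 1]
        if count / window > min_identity:
            end = start + window
            if spans and start <= spans[-1][1]:
                # Overlaps or touches the previous span: extend it.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans
```

Spans are half-open column ranges in alignment coordinates; mapping them back to genomic positions would need the gap layout of each row.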
Abstract:
Plasmodium vivax parasites with chloroquine resistance (CQR) are already circulating in the Brazilian Amazon. Complete single-nucleotide polymorphism (SNP) analyses of coding and noncoding sequences of the pvmdr1 and pvcrt-o genes revealed no associations with CQR, even if some mutations had not been randomly selected. In addition, striking differences in the topologies and numbers of SNPs in these transporter genes between P. vivax and P. falciparum reinforce the idea that mechanisms other than mutations may explain this virulent phenotype in P. vivax.
Abstract:
Background. Visceral leishmaniasis (VL) is caused by Leishmania donovani and Leishmania infantum chagasi. Genome-wide linkage studies from Sudan and Brazil identified a putative susceptibility locus on chromosome 6q27. Methods. Twenty-two single-nucleotide polymorphisms (SNPs) at genes PHF10, C6orf70, DLL1, FAM120B, PSMB1, and TBP were genotyped in 193 VL cases from 85 Sudanese families, and 8 SNPs at genes PHF10, C6orf70, DLL1, PSMB1, and TBP were genotyped in 194 VL cases from 80 Brazilian families. Family-based association, haplotype, and linkage disequilibrium analyses were performed. Multispecies comparative sequence analysis was used to identify conserved noncoding sequences carrying putative regulatory elements. Quantitative reverse-transcription polymerase chain reaction measured expression of candidate genes in splenic aspirates from Indian patients with VL compared with that in the control spleen sample. Results. Positive associations were observed at PHF10, C6orf70, DLL1, PSMB1, and TBP in Sudan, but only at DLL1 in Brazil (combined P = 3 × 10⁻⁴ at DLL1 across Sudan and Brazil). No functional coding region variants were observed in resequencing of 22 Sudanese VL cases. DLL1 expression was significantly (P = 2 × 10⁻⁷) reduced (mean fold change, 3.5 [SEM, 0.7]) in splenic aspirates from patients with VL, whereas other 6q27 genes showed higher levels (1.27 × 10⁻⁶ < P < .01) than did the control spleen sample. A cluster of conserved noncoding sequences with putative regulatory variants was identified in the distal promoter of DLL1. Conclusions. DLL1, which encodes Delta-like 1, the ligand for Notch3, is strongly implicated as the chromosome 6q27 VL susceptibility gene.
Abstract:
Cardiac morphogenesis is a complex process governed by evolutionarily conserved transcription factors and signaling molecules. The Drosophila cardiac tube is linear, made of 52 pairs of cardiomyocytes (CMs), which express specific transcription factor genes that have human homologues implicated in Congenital Heart Diseases (CHDs) (NKX2-5, GATA4 and TBX5). It is composed of a rostral portion named the aorta and a caudal one called the heart, distinguished by morphological and functional differences controlled by Hox genes, key regulators of axial patterning. Overexpression and inactivation of the Hox gene abdominal-A (abd-A), which is expressed exclusively in the heart, revealed that abd-A controls heart identity. The aim of our work is to isolate the heart-specific cis-regulatory sequences of abd-A direct target genes, the realizator genes granting heart identity. In each segment of the heart, four pairs of CMs express tinman (tin), homologous to NKX2-5, and acquire strong contractile and automatic rhythmic activities. By tyramide-amplified FISH, we found that seven genes, encoding ion channels, pumps or transporters, are specifically expressed in the Tin-CMs of the heart. We initially used tools available online to identify their heart-specific cis-regulatory modules by looking for Conserved Noncoding Sequences containing clusters of binding sites for various cardiac transcription factors, including Hox proteins. Based on these data we generated several reporter gene constructs and transgenic embryos, but none of them showed reporter gene expression in the heart. In order to identify additional abd-A target genes, we performed microarray experiments comparing the transcriptomes of aorta versus heart and identified 144 genes overexpressed in the heart.
In order to find the heart-specific cis-regulatory regions of these target genes we developed a new bioinformatic approach where prediction is based on pattern matching and ordered statistics. We first retrieved Conserved Noncoding Sequences from the alignment between the D. melanogaster and D. pseudoobscura genomes. We scored for combinations of conserved occurrences of ABD-A, ABD-B, TIN, PNR, dMEF2, MADS box, T-box and E-box sites and we ranked these results based on two independent strategies. On the one hand we ranked the putative cis-regulatory sequences according to the best-scoring ABD-A binding sites; on the other hand we ranked them according to conservation of binding sites. We integrated and ranked again the two lists obtained independently to produce a final rank. We generated nGFP reporter construct flies for in vivo validation. We identified three 1 kb long heart-specific enhancers. By in vivo and in vitro experiments we are determining whether they are direct abd-A targets, demonstrating the role of a Hox gene in the realization of heart identity. The identified abd-A direct target genes may also be targets of the NKX2-5, GATA4 and/or TBX5 homologues tin, pannier and Doc genes, respectively. The identification of sequences coregulated by a Hox protein and the homologues of transcription factors causing CHDs will provide a means to test whether these factors function as Hox cofactors granting cardiac specificity to Hox proteins, increasing our knowledge of the molecular mechanisms underlying CHDs. Finally, it may be investigated whether these Hox targets are involved in CHDs.
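The abstract describes ranking candidate sequences by two independent criteria and then integrating the two lists into a final rank, without stating the integration rule. The sketch below shows one simple possibility, a rank-sum aggregation; it is an assumption for illustration and not necessarily the ordered-statistics procedure the authors used. The function name `combine_ranks` and the candidate identifiers are hypothetical.

```python
def combine_ranks(scores_a, scores_b):
    """Merge two scorings of the same candidates into one ordered list.

    scores_a, scores_b: dict mapping candidate id -> score (higher is better).
    Candidates are ranked within each scoring, then ordered by the sum of
    their two ranks (lower is better). Ties are broken by id so the result
    is deterministic. Only candidates present in both scorings are kept.
    """
    def to_ranks(scores):
        ordered = sorted(scores, key=lambda k: (-scores[k], k))
        return {k: position + 1 for position, k in enumerate(ordered)}

    rank_a = to_ranks(scores_a)
    rank_b = to_ranks(scores_b)
    shared = set(rank_a) & set(rank_b)
    return sorted(shared, key=lambda k: (rank_a[k] + rank_b[k], k))


# Hypothetical inputs: best ABD-A site score and binding-site conservation.
site_score = {"cns1": 9.0, "cns2": 7.5, "cns3": 3.0, "cns4": 1.0}
conservation = {"cns1": 0.80, "cns2": 0.95, "cns3": 0.90, "cns4": 0.10}
final_order = combine_ranks(site_score, conservation)
```

A rank-based merge is used here because the two scores are on different scales; summing raw scores would let one criterion dominate.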
Abstract:
Mutations in the FBN1 gene are the major cause of Marfan syndrome (MFS), an autosomal dominant connective tissue disorder, which displays variable manifestations in the cardiovascular, ocular, and skeletal systems. Current molecular genetic testing of FBN1 may miss mutations in the promoter region or in other noncoding sequences as well as partial or complete gene deletions and duplications. In this study, we tested for copy number variations by successively applying multiplex ligation-dependent probe amplification (MLPA) and the Affymetrix Human Mapping 500 K Array Set, which contains probes for approximately 500,000 single-nucleotide polymorphisms (SNPs) across the genome. By analyzing genomic DNA of 101 unrelated individuals with MFS or related phenotypes in whom standard genetic testing detected no mutation, we identified FBN1 deletions in two patients with MFS. Our high-resolution approach narrowed down the deletion breakpoints. Subsequent sequencing of the junctional fragments revealed the deletion sizes of 26,887 and 302,580 bp, respectively. Surprisingly, both deletions affect the putative regulatory and promoter region of the FBN1 gene, strongly indicating that they abolish transcription of the deleted allele. This expectation of complete loss of function of one allele, i.e. true haploinsufficiency, was confirmed by transcript analyses. Our findings not only emphasize the importance of screening for large genomic rearrangements in comprehensive genetic testing of FBN1 but, importantly, also extend the molecular etiology of MFS by providing hitherto unreported evidence that true haploinsufficiency is sufficient to cause MFS.
Abstract:
Editing of RNA changes the read-out of information from DNA by altering the nucleotide sequence of a transcript. One type of RNA editing found in all metazoans uses double-stranded RNA (dsRNA) as a substrate and results in the deamination of adenosine to give inosine, which is translated as guanosine. Editing thus allows variant proteins to be produced from a single pre-mRNA. A mechanism by which dsRNA substrates form is through pairing of intronic and exonic sequences before the removal of noncoding sequences by splicing. Here we report that the RNA editing enzyme, human dsRNA adenosine deaminase (DRADA1, or ADAR1) contains a domain (Zα) that binds specifically to the left-handed Z-DNA conformation with high affinity (KD = 4 nM). As formation of Z-DNA in vivo occurs 5′ to, or behind, a moving RNA polymerase during transcription, recognition of Z-DNA by DRADA1 provides a plausible mechanism by which DRADA1 can be targeted to a nascent RNA so that editing occurs before splicing. Analysis of sequences related to Zα has allowed identification of motifs common to this class of nucleic acid binding domain.
Abstract:
In this report we show that yeast expressing brome mosaic virus (BMV) replication proteins 1a and 2a and replicating a BMV RNA3 derivative can be extracted to yield a template-dependent BMV RNA-dependent RNA polymerase (RdRp) able to synthesize (-)-strand RNA from BMV (+)-strand RNA templates added in vitro. This virus-specific yeast-derived RdRp mirrored the template selectivity and other characteristics of RdRp from BMV-infected plants. Equivalent extracts from yeast expressing 1a and 2a but lacking RNA3 contained normal amounts of 1a and 2a but had no RdRp activity on BMV RNAs added in vitro. To determine which RNA3 sequences were required in vivo to yield RdRp activity, we tested deletions throughout RNA3, including the 5', 3', and intercistronic noncoding regions, which contain the cis-acting elements required for RNA3 replication in vivo. RdRp activity was obtained only from cells expressing 1a, 2a, and RNA3 derivatives retaining both 3' and intercistronic noncoding sequences. Strong correlation between extracted RdRp activity and BMV (-)-strand RNA accumulation in vivo was found for all RNA3 derivatives tested. Thus, extractable in vitro RdRp activity paralleled formation of a complex capable of viral RNA synthesis in vivo. The results suggest that assembly of active RdRp requires not only viral proteins but also viral RNA, either to directly contribute some nontemplate function or to recruit essential host factors into the RdRp complex, and that sequences at both the 3'-terminal initiation site and distant internal sites of RNA3 templates may participate in RdRp assembly and initiation of (-)-strand synthesis.
Abstract:
There are 481 segments longer than 200 base pairs (bp) that are absolutely conserved (100% identity with no insertions or deletions) between orthologous regions of the human, rat, and mouse genomes. Nearly all of these segments are also conserved in the chicken and dog genomes, with an average of 95 and 99% identity, respectively. Many are also significantly conserved in fish. These ultraconserved elements of the human genome are most often located either overlapping exons in genes involved in RNA processing or in introns or nearby genes involved in the regulation of transcription and development. Along with more than 5000 sequences of over 100 bp that are absolutely conserved among the three sequenced mammals, these represent a class of genetic elements whose functions and evolutionary origins are yet to be determined, but which are more highly conserved between these species than are proteins and appear to be essential for the ontogeny of mammals and other vertebrates.
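The definition used in this abstract (100% identity, no insertions or deletions, longer than 200 bp) amounts to finding long runs of identical, gap-free columns in a multiple alignment. The following is a minimal sketch under that reading; the function name `ultraconserved_runs` and its `longer_than` parameter are hypothetical, and a real analysis would work on whole-genome alignments rather than in-memory strings.

```python
def ultraconserved_runs(aligned, longer_than=200):
    """Return (start, end) column spans where every aligned row carries the
    same base with no gaps, for runs strictly longer than `longer_than`.

    aligned: list of equal-length alignment rows, e.g. human, rat, mouse.
    """
    rows = [row.upper() for row in aligned]
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all alignment rows must have equal length")

    runs = []
    run_start = None
    # Iterate one past the end so a run touching the last column is closed.
    for i in range(length + 1):
        identical = (
            i < length
            and rows[0][i] != "-"
            and all(row[i] == rows[0][i] for row in rows)
        )
        if identical:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start > longer_than:
                runs.append((run_start, i))
            run_start = None
    return runs
```

Lowering `longer_than` to 100 would correspond to the larger set of absolutely conserved sequences of over 100 bp that the abstract also mentions.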
Abstract:
Cross-species comparative genomics is a powerful strategy for identifying functional regulatory elements within noncoding DNA. In this paper, comparative analysis of human and mouse intronic sequences in the breast cancer susceptibility gene (BRCA1) revealed two evolutionarily conserved noncoding sequences (CNS) in intron 2, 5 kb downstream of the core BRCA1 promoter. The functionality of these elements was examined using homologous-recombination-based mutagenesis of reporter gene-tagged cosmids incorporating these regions and flanking sequences from the BRCA1 locus. This showed that CNS-1 and CNS-2 have differential transcriptional regulatory activity in epithelial cell lines. Mutation of CNS-1 significantly reduced reporter gene expression to 30% of control levels. Conversely mutation of CNS-2 increased expression to 200% of control levels. Regulation is at the level of transcription and shows promoter specificity. Both elements also specifically bind nuclear proteins in vitro. These studies demonstrate that the combination of comparative genomics and functional analysis is a successful strategy to identify novel regulatory elements and provide the first direct evidence that conserved noncoding sequences in BRCA1 regulate gene expression. (c) 2005 Elsevier Inc. All rights reserved.
Abstract:
Although MYB overexpression in colorectal cancer (CRC) is known to be a prognostic indicator for poor survival, the basis for this overexpression is unclear. Among multiple levels of MYB regulation, the most dynamic is the control of transcriptional elongation by sequences within intron I. The authors have proposed that this regulatory sequence is transcribed into an RNA stem-loop and 19-residue polyuridine tract, and is subject to mutation in CRC. When this region was examined in colorectal and breast carcinoma cell lines and tissues, the authors found frequent mutations only in CRC. It was determined that these mutations allowed increased transcription compared with the wild type sequence. These data suggest that this MYB regulatory region within intron I is subject to mutations in CRC but not breast cancer, perhaps consistent with the mutagenic insult that occurs within the colon and not mammary tissue. In CRC, these mutations may contribute to MYB overexpression, highlighting the importance of noncoding sequences in the regulation of key cancer genes. (c) 2006 Wiley-Liss, Inc.
Abstract:
Despite the presence of over 3 million transposons separated on average by approximately 500 bp, the human and mouse genomes each contain almost 1000 transposon-free regions (TFRs) over 10 kb in length. The majority of human TFRs correlate with orthologous TFRs in the mouse, despite the fact that most transposons are lineage specific. Many human TFRs also overlap with orthologous TFRs in the marsupial opossum, indicating that these regions have remained refractory to transposon insertion for long evolutionary periods. Over 90% of the bases covered by TFRs are noncoding, much of which is not highly conserved. Most TFRs are not associated with unusual nucleotide composition, but are significantly associated with genes encoding developmental regulators, suggesting that they represent extended regions of regulatory information that are largely unable to tolerate insertions, a conclusion difficult to reconcile with current conceptions of gene regulation.
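Finding transposon-free regions over 10 kb reduces to finding long gaps between annotated transposon intervals on a chromosome. The sketch below illustrates that idea only; the function name `transposon_free_regions`, the half-open coordinate convention, and the example coordinates are assumptions, and the abstract does not describe the authors' actual procedure.

```python
def transposon_free_regions(transposons, chrom_length, min_len=10_000):
    """Return (start, end) gaps strictly longer than min_len that contain
    no annotated transposon.

    transposons: iterable of (start, end) half-open intervals on one
    chromosome, in any order and possibly overlapping.
    """
    regions = []
    cursor = 0  # end of the rightmost transposon seen so far
    for start, end in sorted(transposons):
        if start - cursor > min_len:
            regions.append((cursor, start))
        cursor = max(cursor, end)
    # Gap between the last transposon and the chromosome end.
    if chrom_length - cursor > min_len:
        regions.append((cursor, chrom_length))
    return regions


# Hypothetical annotations on a 40 kb toy chromosome.
toy_transposons = [(20_000, 20_400), (5_000, 5_300), (20_300, 21_000), (25_000, 25_100)]
toy_tfrs = transposon_free_regions(toy_transposons, chrom_length=40_000)
```

Sorting first and tracking the furthest end seen keeps overlapping annotations from producing spurious gaps.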
Abstract:
Eukaryotic phenotypic diversity arises from multitasking of a core proteome of limited size. Multitasking is routine in computers, as well as in other sophisticated information systems, and requires multiple inputs and outputs to control and integrate network activity. Higher eukaryotes have a mosaic gene structure with a dual output, mRNA (protein-coding) sequences and introns, which are released from the pre-mRNA by posttranscriptional processing. Introns have been enormously successful as a class of sequences and comprise up to 95% of the primary transcripts of protein-coding genes in mammals. In addition, many other transcripts (perhaps more than half) do not encode proteins at all, but appear both to be developmentally regulated and to have genetic function. We suggest that these RNAs (eRNAs) have evolved to function as endogenous network control molecules which enable direct gene-gene communication and multitasking of eukaryotic genomes. Analysis of a range of complex genetic phenomena in which RNA is involved or implicated, including co-suppression, transgene silencing, RNA interference, imprinting, methylation, and transvection, suggests that a higher-order regulatory system based on RNA signals operates in the higher eukaryotes and involves chromatin remodeling as well as other RNA-DNA, RNA-RNA, and RNA-protein interactions. The evolution of densely connected gene networks would be expected to result in a relatively stable core proteome due to the multiple reuse of components, implying that cellular differentiation and phenotypic variation in the higher eukaryotes results primarily from variation in the control architecture.
Thus, network integration and multitasking using trans-acting RNA molecules produced in parallel with protein-coding sequences may underpin both the evolution of developmentally sophisticated multicellular organisms and the rapid expansion of phenotypic complexity into uncontested environments such as those initiated in the Cambrian radiation and those seen after major extinction events.
Biased gene conversion and GC-content evolution in the coding sequences of reptiles and vertebrates.
Abstract:
Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamics of coding sequence GC-content are largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins.
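GC3, the statistic this abstract is built on, is the fraction of G or C bases at third-codon positions of a coding sequence. A minimal sketch of the calculation follows; the function name `gc3` is hypothetical, and the handling of trailing partial codons and ambiguous bases (both ignored) is an assumption rather than the authors' stated convention.

```python
def gc3(cds):
    """Return the G+C fraction at third-codon positions of an in-frame
    coding sequence.

    Trailing bases that do not complete a codon are ignored, as are
    third-position characters other than A, C, G, T (e.g. N).
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # drop any partial final codon
    third_positions = seq[2:usable:3]     # indices 2, 5, 8, ...
    called = [base for base in third_positions if base in "ACGT"]
    if not called:
        raise ValueError("no unambiguous third-codon positions found")
    gc_count = sum(1 for base in called if base in "GC")
    return gc_count / len(called)
```

For example, the codons ATG GCC AAA TAA have third positions G, C, A, A, so `gc3("ATGGCCAAATAA")` evaluates to 0.5.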
Abstract:
The key information processing units within gene regulatory networks are enhancers. Enhancer activity is associated with the production of tissue-specific noncoding RNAs, yet the existence of such transcripts during cardiac development has not been established. Using an integrated genomic approach, we demonstrate that fetal cardiac enhancers generate long noncoding RNAs (lncRNAs) during cardiac differentiation and morphogenesis. Enhancer expression correlates with the emergence of active enhancer chromatin states, the initiation of RNA polymerase II at enhancer loci and expression of target genes. Orthologous human sequences are also transcribed in fetal human hearts and cardiac progenitor cells. Through a systematic bioinformatic analysis, we identified and characterized, for the first time, a catalog of lncRNAs that are expressed during embryonic stem cell differentiation into cardiomyocytes and associated with active cardiac enhancer sequences. RNA-sequencing demonstrates that many of these transcripts are polyadenylated, multi-exonic long noncoding RNAs. Moreover, knockdown of two enhancer-associated lncRNAs resulted in the specific downregulation of their predicted target genes. Interestingly, the reactivation of the fetal gene program, a hallmark of the stress response in the adult heart, is accompanied by increased expression of fetal cardiac enhancer transcripts. Altogether, these findings demonstrate that the activity of cardiac enhancers and expression of their target genes are associated with the production of enhancer-derived lncRNAs.