2 results for conserved noncoding sequence
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
Cardiac morphogenesis is a complex process governed by evolutionarily conserved transcription factors and signaling molecules. The Drosophila cardiac tube is linear and made of 52 pairs of cardiomyocytes (CMs), which express specific transcription factor genes whose human homologues (NKX2-5, GATA4 and TBX5) are implicated in Congenital Heart Diseases (CHDs). The tube is composed of a rostral portion named the aorta and a caudal one called the heart, distinguished by morphological and functional differences controlled by Hox genes, key regulators of axial patterning. Overexpression and inactivation of the Hox gene abdominal-A (abd-A), which is expressed exclusively in the heart, revealed that abd-A controls heart identity. The aim of our work is to isolate the heart-specific cis-regulatory sequences of abd-A direct target genes, the realizator genes granting heart identity. In each segment of the heart, four pairs of CMs express tinman (tin), homologous to NKX2-5, and acquire strong contractile and automatic rhythmic activities. By tyramide-amplified FISH, we found that seven genes, encoding ion channels, pumps or transporters, are specifically expressed in the Tin-CMs of the heart. We initially used tools available online to identify their heart-specific cis-regulatory modules by looking for Conserved Noncoding Sequences containing clusters of binding sites for various cardiac transcription factors, including Hox proteins. Based on these data we generated several reporter gene constructs and transgenic embryos, but none of them showed reporter gene expression in the heart. In order to identify additional abd-A target genes, we performed microarray experiments comparing the transcriptomes of aorta versus heart and identified 144 genes overexpressed in the heart.
In order to find the heart-specific cis-regulatory regions of these target genes, we developed a new bioinformatic approach in which prediction is based on pattern matching and ordered statistics. We first retrieved Conserved Noncoding Sequences from the alignment between the D. melanogaster and D. pseudoobscura genomes. We scored combinations of conserved occurrences of ABD-A, ABD-B, TIN, PNR, dMEF2, MADS box, T-box and E-box sites, and we ranked these results using two independent strategies. On the one hand, we ranked the putative cis-regulatory sequences according to their best-scoring ABD-A binding sites; on the other hand, we ranked them according to the conservation of binding sites. We then integrated the two independently obtained lists and ranked them again to produce a final rank. We generated flies carrying nGFP reporter constructs for in vivo validation and identified three 1 kb long heart-specific enhancers. By in vivo and in vitro experiments we are determining whether they are direct abd-A targets, which would demonstrate the role of a Hox gene in the realization of heart identity. The identified abd-A direct target genes may also be targets of the NKX2-5, GATA4 and/or TBX5 homologues tin, pannier and Doc genes, respectively. The identification of sequences coregulated by a Hox protein and the homologues of transcription factors causing CHDs will provide a means to test whether these factors function as Hox cofactors granting cardiac specificity to Hox proteins, increasing our knowledge of the molecular mechanisms underlying CHDs. Finally, it may be investigated whether these Hox targets are involved in CHDs.
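The integration of two independent rankings can be illustrated with a minimal sketch. The abstract does not say how the two lists were combined, so the rank-sum rule below is an assumption chosen for illustration, and the sequence names and scores are hypothetical.

```python
def ranks(scores):
    """Map each item to its 1-based rank, highest score first (ties broken by name)."""
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: i + 1 for i, name in enumerate(ordered)}


def integrate(scores_a, scores_b):
    """Order items by the sum of their ranks in the two lists (lower sum is better).

    Assumption: a simple rank sum; the thesis may use a different rule.
    """
    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    common = set(rank_a) & set(rank_b)
    return sorted(common, key=lambda name: (rank_a[name] + rank_b[name], name))


# Hypothetical scores for four candidate conserved noncoding sequences.
abd_a_site_score = {"cns1": 9.1, "cns2": 7.4, "cns3": 8.2, "cns4": 3.0}
conservation_score = {"cns1": 0.60, "cns2": 0.70, "cns3": 0.90, "cns4": 0.20}

final_rank = integrate(abd_a_site_score, conservation_score)
print(final_rank)
```

A rank-based combination is used here because the two scores are on different scales, so adding the raw values would let one criterion dominate.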
Abstract:
The objective of this work is to characterize the genome of chromosome 1 of A. thaliana, a small flowering plant used as a model organism in studies of biology and genetics, on the basis of a recent mathematical model of the genetic code. I analyze and compare different portions of the genome: genes, exons, coding sequences (CDS), introns, long introns, intergenic regions, untranslated regions (UTR) and regulatory sequences. To accomplish this, I transformed nucleotide sequences into binary sequences based on the definition of the three different dichotomic classes. The descriptive analysis of the binary strings indicates the presence of regularities in each portion of the genome considered. In particular, there are remarkable differences between coding sequences (CDS and exons) and non-coding sequences, suggesting that the frame is important only for coding sequences and that dichotomic classes can be useful to recognize them. I then assessed the existence of short-range dependence between binary sequences computed on the basis of the different dichotomic classes. I used three different measures of dependence: the well-known chi-squared test and two indices derived from the concept of entropy, namely Mutual Information (MI) and Sρ, a normalized version of the “Bhattacharya Hellinger Matusita distance”. The results show that a significant short-range dependence structure exists only for the coding sequences, which hints at an underlying error detection and correction mechanism. Further studies are needed to assess how the information carried by dichotomic classes could discriminate between coding and noncoding sequences and, therefore, help to unveil the role of the mathematical structure in error detection and correction mechanisms. Still, I have shown the potential of the approach presented for understanding the management of genetic information.
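As a rough illustration of one of the dependence measures named above, the sketch below estimates the Mutual Information, in bits, between two binary sequences of equal length from their empirical joint frequencies. The abstract does not define the dichotomic classes, so the input sequences here are hypothetical and simply assumed to be already binary-encoded; the function name is also an assumption.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in estimate of the mutual information (in bits) between two sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("sequences must be non-empty and of equal length")
    joint = Counter(zip(x, y))       # counts of co-occurring symbol pairs
    count_x, count_y = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[a] * count_y[b]))
    return mi


# Hypothetical binary encodings of the same stretch of sequence.
identical = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
unrelated = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(identical, unrelated)
```

Two identical balanced sequences share one full bit of information, while sequences whose symbol pairs occur with equal frequency share none. Short-range dependence could be probed in the same way by pairing a sequence with a shifted copy of another.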