989 results for 060407 Genome Structure and Regulation
Abstract:
The main focus of this thesis is the use of high-throughput sequencing technologies in functional genomics (in particular in the form of ChIP-seq, chromatin immunoprecipitation coupled with sequencing, and RNA-seq) and the study of the structure and regulation of transcriptomes. Some parts of it are of a more methodological nature while others describe the application of these functional genomic tools to address various biological problems. A significant part of the research presented here was conducted as part of the ENCODE (ENCyclopedia Of DNA Elements) Project.
The first part of the thesis focuses on the structure and diversity of the human transcriptome. Chapter 1 contains an analysis of the diversity of the human polyadenylated transcriptome based on RNA-seq data generated for the ENCODE Project. Chapter 2 presents a simulation-based examination of the performance of some of the most popular computational tools used to assemble and quantify transcriptomes. Chapter 3 includes a study of variation in gene expression, alternative splicing and allelic expression bias at the single-cell level and on a genome-wide scale in human lymphoblastoid cells; it also raises a number of methodological considerations critical to the practice of single-cell RNA-seq measurements.
The second part presents several studies applying functional genomic tools to the study of the regulatory biology of organellar genomes, primarily in mammals but also in plants. Chapter 5 contains an analysis of the occupancy of the human mitochondrial genome by TFAM, an important structural and regulatory protein in mitochondria, using ChIP-seq. In Chapter 6, the mitochondrial DNA occupancy of the TFB2M transcriptional regulator, the MTERF termination factor, and the mitochondrial RNA and DNA polymerases is characterized. Chapter 7 consists of an investigation into the curious phenomenon of the physical association of nuclear transcription factors with mitochondrial DNA, based on the diverse collections of transcription factor ChIP-seq datasets generated by the ENCODE, mouseENCODE and modENCODE consortia. In Chapter 8 this line of research is further extended to existing publicly available ChIP-seq datasets in plants and their mitochondrial and plastid genomes.
The third part is dedicated to the analytical and experimental practice of ChIP-seq. As part of the ENCODE Project, a set of metrics for assessing the quality of ChIP-seq experiments was developed, and the results of this activity are presented in Chapter 9. These metrics were later used to carry out a global analysis of ChIP-seq quality in the published literature (Chapter 10). In Chapter 11, the development and initial application of an automated robotic ChIP-seq (in which these metrics also played a major role) is presented.
The fourth part presents the results of some additional projects the author has been involved in, including the study of the role of the Piwi protein in the transcriptional regulation of transposon expression in Drosophila (Chapter 12), and the use of single-cell RNA-seq to characterize the heterogeneity of gene expression during cellular reprogramming (Chapter 13).
The last part of the thesis provides a review of the results of the ENCODE Project and the interpretation of the complexity of the biochemical activity exhibited by mammalian genomes that they have revealed (Chapters 15 and 16), an overview of the technical developments expected in the near future and their impact on the field of functional genomics (Chapter 14), and a discussion of some research areas that have so far been insufficiently explored, the future study of which will, in the opinion of the author, provide deep insights into many fundamental but not yet completely answered questions about the transcriptional biology of eukaryotes and its regulation.
Abstract:
Defining the precise promoter DNA sequence motifs where nuclear receptors and other transcription factors bind is an essential prerequisite for understanding how these proteins modulate the expression of their specific target genes. The purpose of this chapter is to provide the reader with a detailed guide to the materials and key methods required to perform this type of DNA-binding analysis. Irrespective of whether starting with purified DNA-binding proteins or somewhat crude cellular extracts, the tried-and-true procedures described here will enable one to accurately assess the capacity of specific proteins to bind to DNA as well as to determine the exact sequences and DNA contact nucleotides involved. For illustrative purposes, we have primarily used the interaction of the androgen receptor with the rat probasin proximal promoter as our model system.
Abstract:
Complementary sequences at the 5′ and 3′ ends of the dengue virus RNA genome are essential for viral replication, and are believed to cyclise the genome through long-range base pairing in cis. Although consistent with evidence in the literature, this view neglects possible biologically active multimeric forms that are equally consistent with the data. Here, we propose alternative multimeric structures, and suggest that multigenome noncovalent concatemers are more likely to exist under cellular conditions than single cyclised monomers. Concatemers provide a plausible mechanism for the dengue virus to overcome the single-stranded (+)-sense RNA virus dilemma, and can potentially assist genome transport from the virus-induced vesicles into the cytosol.
Abstract:
Acinetobacter baumannii isolate A1 was recovered in the United Kingdom in 1982 and belongs to global clone 1 (GC1). Here, we present its complete 3.91-Mbp genome sequence, generated via a combination of short-read sequencing (Illumina), long-read sequencing (PacBio), and manual finishing.
Abstract:
The current explosion of DNA sequence information has generated increasing evidence for the claim that noncoding repetitive DNA sequences present within and around different genes could play an important role in genetic control processes, although the precise role of these sequences and the mechanism by which they function are poorly understood. Several of the simple repetitive sequences that occur at a large number of loci throughout the human and other eukaryotic genomes satisfy the sequence criteria for forming non-B DNA structures in vitro. We have summarized some of the features of three different types of simple repeats that highlight the importance of repetitive DNA in the control of gene expression and chromatin organization. (i) (TG/CA)n repeats are widespread and conserved in many loci. These sequences are associated with nucleosomes of varying linker length and may play a role in chromatin organization. These Z-potential sequences can help absorb superhelical stress during transcription and aid in recombination. (ii) The human telomeric repeat (TTAGGG)n adopts a novel quadruplex structure and exhibits unusual chromatin organization. This unusual structural motif could explain chromosome pairing and stability. (iii) Intragenic amplification of the (CTG)n/(CAG)n trinucleotide repeat, which is now known to be associated with several genetic disorders, could down-regulate gene expression in vivo. The overall implications of these findings vis-à-vis repetitive sequences in the genome are summarized.
Abstract:
This chapter contains sections titled: Introduction; Structure and Regulation; Physiologic Functions of TG2; Disruption of TG2 Functions in Pathologic Conditions; Perspectives for Pharmacologic Interventions; Concluding Comments; Acknowledgements; References.
Abstract:
BACKGROUND: The murine ghrelin gene (Ghrl), originally sequenced from stomach tissue, contains five exons and a single transcription start site in a short, 19 bp first exon (exon 0). We recently isolated several novel first exons of the human ghrelin gene and found evidence of a complex transcriptional repertoire. In this report, we examined the 5' exons of the murine ghrelin orthologue in a range of tissues using 5' RACE. FINDINGS: 5' RACE revealed two transcription start sites (TSSs) in exon 0 and four TSSs in intron 0, which correspond to 5' extensions of exon 1. Using quantitative real-time RT-PCR (qRT-PCR), we demonstrated that Ghrl transcripts containing the extended exon 1 are largely confined to the spleen, adrenal gland, stomach, and skin. CONCLUSION: We demonstrate that multiple transcription start sites are present in exon 0 and an extended exon 1 of the murine ghrelin gene, similar to the proximal first exon organisation of its human orthologue. The identification of several transcription start sites in intron 0 of mouse ghrelin (resulting in an extension of exon 1) raises the possibility that developmental-, cell- and tissue-specific Ghrl mRNA species are created by employing alternative promoters. Further studies of the murine ghrelin gene are warranted.
Abstract:
Computational biology increasingly demands the sharing of sophisticated data and annotations between research groups. Web 2.0 style sharing and publication requires that biological systems be described in well-defined, yet flexible and extensible formats which enhance exchange and re-use. In contrast to many of the standards for exchange in the genomic sciences, descriptions of biological sequences show a great diversity in format and function, impeding the definition and exchange of sequence patterns. In this presentation, we introduce BioPatML, an XML-based pattern description language that supports a wide range of patterns and allows the construction of complex, hierarchically structured patterns and pattern libraries. BioPatML unifies the diversity of current pattern description languages and fills a gap in the set of XML-based description languages for biological systems. We discuss the structure and elements of the language, and demonstrate its advantages on a series of applications, showing lightweight integration between the BioPatML parser and search engine, and the SilverGene genome browser. We conclude by describing our site to enable large scale pattern sharing, and our efforts to seed this repository.
Abstract:
Associations between single nucleotide polymorphisms (SNPs) at 5p15 and multiple cancer types have been reported. We have previously shown evidence for a strong association between prostate cancer (PrCa) risk and rs2242652 at 5p15, intronic in the telomerase reverse transcriptase (TERT) gene that encodes TERT. To comprehensively evaluate the association between genetic variation across this region and PrCa, we performed a fine-mapping analysis by genotyping 134 SNPs using a custom Illumina iSelect array or Sequenom MassArray iPlex, followed by imputation of 1094 SNPs in 22 301 PrCa cases and 22 320 controls in The PRACTICAL consortium. Multiple stepwise logistic regression analysis identified four signals in the promoter or intronic regions of TERT that independently associated with PrCa risk. Gene expression analysis of normal prostate tissue showed evidence that SNPs within one of these regions also associated with TERT expression, providing a potential mechanism for predisposition to disease.
Abstract:
Plant genomes are extremely complex. Myriad factors contribute to their evolution and organization, as well as to the expression and regulation of individual genes. Here we present investigations into several such factors and their influence on genome structure and gene expression: the arrangement of pairs of physically adjacent genes, retrotransposons closely associated with genes, and the effect of retrotransposons on gene pair evolution. All sequenced plant genomes, including that of rice, contain a significant fraction of retrotransposons. We investigated the effects of retrotransposons within rice genes and within a 1 kb putative promoter region upstream of each gene. We found that approximately one-sixth of all rice genes are closely associated with retrotransposons. Insertions within a gene’s promoter region tend to block gene expression, while retrotransposons within genes promote the existence of alternative splicing forms. We also identified several other trends in retrotransposon insertion and its effects on gene expression. Several studies have previously noted a connection between the physical proximity of genes and correlated expression profiles. To determine the degree to which this correlation depends on an exact physical arrangement, we studied the expression and interspecies conservation of convergent and divergent gene pairs in rice, Arabidopsis, and Populus trichocarpa. Correlated expression among gene pairs was quite common in all three species, yet conserved arrangement was rare. However, conservation of gene pair arrangement was significantly more common among pairs with strongly correlated expression levels. In order to uncover additional properties of gene pair conservation and rearrangement, we performed a comparative analysis of convergent, divergent, and tandem gene pairs in rice, sorghum, maize, and Brachypodium. We noted considerable differences between gene pair types and species.
We also constructed a putative evolutionary history for each pair, which led to several interesting discoveries. To further elucidate the causes of gene pair conservation and rearrangement, we identified retrotransposon insertions in and near rice gene pairs. Retrotransposon-associated pairs are less likely to be conserved, although there are significant differences in the possible effect of different types and locations of retrotransposon insertions. The three types of gene pair also varied in their susceptibility to retrotransposon-associated evolutionary changes.
Abstract:
Proteasomes are cylindrical particles made up of a stack of four heptameric rings. In animal cells the outer rings are made up of 7 different types of alpha subunits and the inner rings are composed of 7 out of 10 possible different beta subunits. Regulatory complexes can bind to the ends of the cylinder. We have investigated aspects of the assembly, activity and subunit composition of core proteasome particles and 26S proteasomes, the localization of proteasome subpopulations, and the possible role of phosphorylation in determining proteasome localization, activities and association with regulatory components.