891 results for genome-wide association
Abstract:
The complete and faithful duplication of the genome is essential to ensure normal cell division and organismal development. Eukaryotic DNA replication is initiated at multiple sites, termed origins of replication, that are activated at different times throughout S phase. The replication timing program is regulated by the S-phase checkpoint, which signals and repairs replicative stress. Eukaryotic DNA is packaged with histones into chromatin; thus, DNA-templated processes including replication are modulated by the local chromatin environment, such as post-translational modifications (PTMs) of histones.
One such epigenetic mark, methylation of lysine 20 on histone H4 (H4K20), has been linked to chromatin compaction, transcription, DNA repair and DNA replication. H4K20 can be mono-, di- and tri-methylated. Monomethylation of H4K20 (H4K20me1) is mediated by the cell cycle-regulated histone methyltransferase PR-Set7, and subsequent di-/tri-methylation is catalyzed by Suv4-20. Prior studies have shown that PR-Set7 depletion in mammalian cells results in defective S phase progression and the accumulation of DNA damage, which may be partially attributed to defects in origin selection and activation. Meanwhile, overexpression of mammalian PR-Set7 recruits components of the pre-replication complex (pre-RC) onto chromatin and licenses replication origins for re-replication. However, these studies were limited to only a handful of mammalian origins, and it remains unclear how PR-Set7 impacts the replication program on a genomic scale. Finally, the methylation substrates of PR-Set7 include both histone (H4K20) and non-histone targets; it is therefore necessary to directly test the role of H4K20 methylation in PR-Set7-regulated phenotypes.
I employed genetic, cytological, and genomic approaches to better understand the role of H4K20 methylation in regulating DNA replication and genome stability in Drosophila melanogaster cells. Depletion of Drosophila PR-Set7 by RNAi in cultured Kc167 cells led to an ATR-dependent cell cycle arrest with near-4N DNA content and the accumulation of DNA damage, indicating a defect in completing S phase. The cells were arrested at the second S phase following PR-Set7 downregulation, suggesting an epigenetic effect coupled to the dilution of the histone modification over multiple cell cycles. To directly test the role of H4K20 methylation in regulating genome integrity, I collaborated with the Duronio Lab and observed spontaneous DNA damage in the imaginal wing discs of third instar mutant larvae carrying an alanine substitution at H4K20 (H4K20A), which cannot be methylated, confirming that H4K20 is a bona fide target of PR-Set7 in maintaining genome integrity.
One possible source of DNA damage due to loss of PR-Set7 is reduced origin activity. I used BrdU-seq to profile the genome-wide origin activation pattern. However, I found that deregulation of H4K20 methylation states by manipulating the H4K20 methyltransferases PR-Set7 and Suv4-20 had no impact on origin activation throughout the genome. I then mapped the genomic distribution of DNA damage upon PR-Set7 depletion. Surprisingly, ChIP-seq of the DNA damage marker γ-H2A.v localized the DNA damage to late replicating euchromatic regions of the Drosophila genome, and the γ-H2A.v signal was of uniform strength and spanned the entire late replication domain, implying stochastic replication fork collapse within late replicating regions. Together, these data suggest that PR-Set7-mediated monomethylation of H4K20 is critical for maintaining the genomic integrity of late replicating domains, presumably via stabilization of late replicating forks.
In addition to investigating the function of H4K20me, I also used immunofluorescence to characterize the cell cycle-regulated chromatin loading of the Mcm2-7 complex, the DNA helicase that licenses replication origins, using H4K20me1 levels as a proxy for cell cycle stage. In parallel with chromatin spindown data by Powell et al. (Powell et al. 2015), we showed continuous loading of Mcm2-7 during G1 and its progressive removal from chromatin through S phase.
Abstract:
Previously developed models for predicting absolute risk of invasive epithelial ovarian cancer have included a limited number of risk factors and have had low discriminatory power (area under the receiver operating characteristic curve (AUC) < 0.60). Because of this, we developed and internally validated a relative risk prediction model that incorporates 17 established epidemiologic risk factors and 17 genome-wide significant single nucleotide polymorphisms (SNPs) using data from 11 case-control studies in the United States (5,793 cases; 9,512 controls) from the Ovarian Cancer Association Consortium (data accrued from 1992 to 2010). We developed a hierarchical logistic regression model for predicting case-control status that included imputation of missing data. We randomly divided the data into an 80% training sample and used the remaining 20% for model evaluation. The AUC for the full model was 0.664. A reduced model without SNPs performed similarly (AUC = 0.649). Both models performed better than a baseline model that included age and study site only (AUC = 0.563). The best predictive power was obtained in the full model among women younger than 50 years of age (AUC = 0.714); however, the addition of SNPs increased the AUC the most for women older than 50 years of age (AUC = 0.638 vs. 0.616). Adapting this improved model to estimate absolute risk and evaluating it in prospective data sets is warranted.
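The evaluation described in this abstract (a random 80%/20% split, then AUC on the held-out part) can be illustrated with a minimal sketch. The function names `split_80_20` and `auc` are hypothetical, and the rank-based AUC shown here is the standard definition rather than the authors' own code; the hierarchical logistic regression itself is not reproduced.

```python
import random


def split_80_20(n, seed=0):
    """Randomly split indices 0..n-1 into an 80% training and 20% evaluation set.

    Hypothetical helper: the abstract does not say how the split was implemented.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = (n * 4) // 5  # integer arithmetic avoids floating-point rounding
    return idx[:cut], idx[cut:]


def auc(scores, labels):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen case (label 1)
    receives a higher score than a randomly chosen control (label 0),
    with ties counted as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

Under this definition an AUC of 0.5 corresponds to chance-level discrimination, which is why the reported values (0.563 for the baseline, 0.664 for the full model) are read against that floor.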
Abstract:
Increasing levels of tissue hypoxia have been reported as a natural feature of the aging prostate gland and may be a risk factor for the development of prostate cancer. In this study, we have used PwR-1E benign prostate epithelial cells and an equivalently aged hypoxia-adapted PwR-1E sub-line to identify phenotypic and epigenetic consequences of chronic hypoxia in prostate cells. We have identified a significantly altered cellular phenotype in response to chronic hypoxia as characterized by increased receptor-mediated apoptotic resistance, the induction of cellular senescence, increased invasion and the increased secretion of IL-1 beta, IL6, IL8 and TNFalpha cytokines. In association with these phenotypic changes and the absence of HIF-1 alpha protein expression, we have demonstrated significant increases in global levels of DNA methylation and H3K9 histone acetylation in these cells, concomitant with the increased expression of DNA methyltransferase DNMT3b and gene-specific changes in DNA methylation at key imprinting loci. In conclusion, we have demonstrated a genome-wide adjustment of DNA methylation and histone acetylation under chronic hypoxic conditions in the prostate. These epigenetic signatures may represent an additional mechanism to promote and maintain a hypoxic-adapted cellular phenotype with a potential role in tumour development.
Abstract:
Although epidemiological studies suggest that type 2 diabetes mellitus (T2DM) increases the risk of late-onset Alzheimer's disease (LOAD), the biological basis of this relationship is not well understood. The aim of this study was to examine the genetic comorbidity between the 2 disorders and to investigate whether genetic liability to T2DM, estimated by a genotype risk score based on T2DM-associated loci, is associated with increased risk of LOAD. This study was performed in 2 stages. In stage 1, we combined genotypes for the top 15 T2DM-associated polymorphisms drawn from approximately 3000 individuals (1349 cases and 1351 control subjects) with extracted and/or imputed data from 6 genome-wide studies (>10,000 individuals; 4507 cases, 2183 controls, 4989 population controls) to form a genotype risk score and examined whether this was associated with increased LOAD risk in a combined meta-analysis. In stage 2, we investigated the association of LOAD with an expanded T2DM score made of 45 well-established variants drawn from the 6 genome-wide studies. Results were combined in a meta-analysis. Neither the stage 1 nor the stage 2 T2DM risk score was associated with LOAD risk (odds ratio = 0.988; 95% confidence interval, 0.972-1.004; p = 0.144 and odds ratio = 0.993; 95% confidence interval, 0.983-1.003; p = 0.149 per allele, respectively). Contrary to expectation, genotype risk scores based on established T2DM candidates were not associated with increased risk of LOAD. The observed epidemiological associations between T2DM and LOAD could therefore be a consequence of secondary disease processes, pleiotropic mechanisms, and/or common environmental risk factors. Future work should focus on well-characterized longitudinal cohorts with extensive phenotypic and genetic data relevant to both LOAD and T2DM.
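As a rough illustration of the quantities in this abstract, the sketch below computes an unweighted genotype risk score (a count of risk alleles) and checks whether a reported 95% confidence interval excludes the null odds ratio of 1. It assumes an unweighted allele count, which the "per allele" odds ratios suggest but the abstract does not state; both function names are hypothetical.

```python
def genotype_risk_score(risk_allele_counts):
    """Unweighted genotype risk score for one individual.

    Assumption: each entry is the number of risk alleles (0, 1 or 2)
    carried at one locus, and the score is their simple sum.
    """
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("risk allele counts must be 0, 1 or 2")
    return sum(risk_allele_counts)


def ci_excludes_null(lower, upper, null=1.0):
    """Return True if the confidence interval does not contain the null value."""
    return not (lower <= null <= upper)


# Confidence intervals as reported in the abstract (per-allele odds ratios).
stage1_excludes = ci_excludes_null(0.972, 1.004)
stage2_excludes = ci_excludes_null(0.983, 1.003)
```

Both reported intervals contain 1, which matches the stated conclusion that neither score was associated with LOAD risk.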
Abstract:
Cancer and cardiovascular diseases are the leading causes of death worldwide. Caused by systemic genetic and molecular disruptions in cells, these disorders are the manifestation of a profound disturbance of normal cellular homeostasis. People suffering from or at high risk of these disorders need early diagnosis and personalized therapeutic intervention. Successful implementation of such clinical measures can significantly improve global health. However, development of effective therapies is hindered by the challenges in identifying genetic and molecular determinants of the onset of diseases; and in cases where therapies already exist, the main challenge is to identify molecular determinants that drive resistance to the therapies. Due to the progress in sequencing technologies, access to large genome-wide biological datasets now extends far beyond a few experimental labs to the global research community. The unprecedented availability of the data has revolutionized the capabilities of computational researchers, enabling them to collaboratively address long-standing problems from many different perspectives. Likewise, this thesis tackles these two main public-health challenges using data-driven approaches. Numerous association studies have been proposed to identify genomic variants that determine disease. However, their clinical utility remains limited due to their inability to distinguish causal variants from associated variants. In the presented thesis, we first propose a simple scheme that improves association studies in a supervised fashion and has shown its applicability in identifying genomic regulatory variants associated with hypertension. Next, we propose a coupled Bayesian regression approach, eQTeL, which leverages epigenetic data to estimate regulatory and gene interaction potential, and identifies combinations of regulatory genomic variants that explain the gene expression variance.
On human heart data, eQTeL not only explains a significantly greater proportion of expression variance in samples, but also predicts gene expression more accurately than other methods. We demonstrate by simulation that eQTeL accurately detects causal regulatory SNPs, particularly those with small effect sizes. Using various functional data, we show that SNPs detected by eQTeL are enriched for allele-specific protein binding and histone modifications, which potentially disrupt binding of core cardiac transcription factors and are spatially proximal to their target. eQTeL SNPs capture a substantial proportion of genetic determinants of expression variance and we estimate that 58% of these SNPs are putatively causal. The challenge of identifying molecular determinants of cancer resistance could so far be addressed only through labor-intensive and costly experimental studies, and in the case of experimental drugs such studies are infeasible. Here we take a fundamentally different, data-driven approach to understand the evolving landscape of emerging resistance. We introduce a novel class of genetic interactions termed synthetic rescues (SR) in cancer, which denotes a functional interaction between two genes where a change in the activity of one vulnerable gene (which may be a target of a cancer drug) is lethal, but subsequently altered activity of its partner rescuer gene restores cell viability. Next we describe a comprehensive computational framework, termed INCISOR, for identifying SR underlying cancer resistance. Applying INCISOR to mine The Cancer Genome Atlas (TCGA), a large collection of cancer patient data, we identified the first pan-cancer SR networks, composed of interactions common to many cancer types. We experimentally test and validate a subset of these interactions involving the master regulator gene mTOR. We find that rescuer genes become increasingly activated as breast cancer progresses, testifying to pervasive ongoing rescue processes.
We show that SRs can be utilized to successfully predict patients' survival and response to the majority of current cancer drugs, and importantly, for predicting the emergence of drug resistance from the initial tumor biopsy. Our analysis suggests a potential new strategy for enhancing the effectiveness of existing cancer therapies by targeting their rescuer genes to counteract resistance. The thesis provides statistical frameworks that can harness ever-increasing high-throughput genomic data to address challenges in determining the molecular underpinnings of hypertension, cardiovascular disease and cancer resistance. We discover novel molecular mechanistic insights that will advance the progress in early disease prevention and personalized therapeutics. Our analyses shed light on the fundamental biological understanding of gene regulation and interaction, and open up exciting avenues of translational applications in risk prediction and therapeutics.
Abstract:
The aim of the present study was to propose and evaluate the use of factor analysis (FA) in obtaining latent variables (factors) that represent a set of pig traits simultaneously, for use in genome-wide selection (GWS) studies. We used crosses between outbred F2 populations of Brazilian Piau X commercial pigs. Data were obtained on 345 F2 pigs, genotyped for 237 SNPs, with 41 traits. FA allowed us to obtain four biologically interpretable factors: "weight", "fat", "loin", and "performance". These factors were used as dependent variables in multiple regression models of genomic selection (Bayes A, Bayes B, RR-BLUP, and Bayesian LASSO). The use of FA is presented as an interesting alternative to select individuals for multiple variables simultaneously in GWS studies; accuracy measurements of the factors were similar to those obtained when the original traits were considered individually. The similarities between the top 10% of individuals selected by the factor, and those selected by the individual traits, were also satisfactory. Moreover, the estimated marker effects for the traits were similar to those found for the relevant factor.
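The comparison of the top 10% of individuals selected by a factor with those selected by an individual trait can be sketched as a simple set overlap. This is an illustrative assumption about how such a similarity might be computed; the abstract does not give the authors' exact measure, and the names `top_fraction` and `selection_overlap` are hypothetical.

```python
def top_fraction(values, fraction=0.10):
    """Indices of the individuals with the highest values.

    Selects the top `fraction` of individuals (at least one), assuming
    larger predicted values are more desirable.
    """
    k = max(1, int(len(values) * fraction))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(ranked[:k])


def selection_overlap(factor_values, trait_values, fraction=0.10):
    """Share of individuals selected on the factor that are also selected on the trait."""
    by_factor = top_fraction(factor_values, fraction)
    by_trait = top_fraction(trait_values, fraction)
    return len(by_factor & by_trait) / len(by_factor)
```

A value of 1.0 means both criteria select exactly the same individuals, and 0.0 means they share none.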