2 results for Approaches to learning
in DRUM (Digital Repository at the University of Maryland)
Abstract:
Cancer and cardiovascular diseases are the leading causes of death worldwide. Caused by systemic genetic and molecular disruptions in cells, these disorders are the manifestation of a profound disturbance of normal cellular homeostasis. People suffering from or at high risk for these disorders need early diagnosis and personalized therapeutic intervention. Successful implementation of such clinical measures can significantly improve global health. However, the development of effective therapies is hindered by the difficulty of identifying the genetic and molecular determinants of disease onset; and where therapies already exist, the main challenge is to identify the molecular determinants that drive resistance to them. Owing to progress in sequencing technologies, access to large genome-wide biological datasets now extends far beyond a few experimental labs to the global research community. The unprecedented availability of these data has revolutionized the capabilities of computational researchers, enabling them to collaboratively address long-standing problems from many different perspectives. Likewise, this thesis tackles these two main public-health challenges using data-driven approaches. Numerous association studies have been proposed to identify genomic variants that determine disease. However, their clinical utility remains limited because they cannot distinguish causal variants from merely associated variants. In this thesis, we first propose a simple scheme that improves association studies in a supervised fashion and demonstrate its applicability in identifying genomic regulatory variants associated with hypertension. Next, we propose a coupled Bayesian regression approach, eQTeL, which leverages epigenetic data to estimate regulatory and gene interaction potential, and identifies combinations of regulatory genomic variants that explain gene expression variance.
On human heart data, eQTeL not only explains a significantly greater proportion of expression variance in samples, but also predicts gene expression more accurately than other methods. We demonstrate by simulation that eQTeL accurately detects causal regulatory SNPs, particularly those with small effect sizes. Using various functional data, we show that SNPs detected by eQTeL are enriched for allele-specific protein binding and histone modifications, potentially disrupt binding of core cardiac transcription factors, and are spatially proximal to their targets. eQTeL SNPs capture a substantial proportion of the genetic determinants of expression variance, and we estimate that 58% of these SNPs are putatively causal. Until now, the challenge of identifying molecular determinants of cancer resistance could be addressed only through labor-intensive and costly experimental studies, and in the case of experimental drugs such studies are infeasible. Here we take a fundamentally different, data-driven approach to understanding the evolving landscape of emerging resistance. We introduce a novel class of genetic interactions in cancer termed synthetic rescues (SR): functional interactions between two genes in which a change in the activity of one vulnerable gene (which may be a target of a cancer drug) is lethal, but subsequently altered activity of its partner rescuer gene restores cell viability. Next, we describe a comprehensive computational framework, termed INCISOR, for identifying the SRs underlying cancer resistance. Applying INCISOR to mine The Cancer Genome Atlas (TCGA), a large collection of cancer patient data, we identified the first pan-cancer SR networks, composed of interactions common to many cancer types. We experimentally test and validate a subset of these interactions involving the master regulator gene mTOR. We find that rescuer genes become increasingly activated as breast cancer progresses, testifying to pervasive ongoing rescue processes.
We show that SRs can be used to successfully predict patients' survival and response to the majority of current cancer drugs and, importantly, to predict the emergence of drug resistance from the initial tumor biopsy. Our analysis suggests a potential new strategy for enhancing the effectiveness of existing cancer therapies by targeting their rescuer genes to counteract resistance. The thesis provides statistical frameworks that can harness ever-increasing high-throughput genomic data to address challenges in determining the molecular underpinnings of hypertension, cardiovascular disease, and cancer resistance. We discover novel molecular mechanistic insights that will advance progress in early disease prevention and personalized therapeutics. Our analyses shed light on the fundamental biology of gene regulation and interaction, and open up exciting avenues for translational applications in risk prediction and therapeutics.
Abstract:
The fruit is one of the most complex and important structures produced by flowering plants, and understanding the development and maturation of fruits in angiosperm species with diverse fruit structures is of immense interest. In the work presented here, molecular genetics and genomic analysis are used to explore the processes that form the fruit in two species: the model organism Arabidopsis and the diploid strawberry Fragaria vesca. One important basic question concerns the molecular genetic basis of fruit patterning. A long-standing model of patterning in the Arabidopsis fruit (the gynoecium) holds that auxin produced at the apex diffuses downward, forming a gradient that provides apical-basal positional information to specify different tissue types along the gynoecium’s length. The proposed gradient, however, has never been observed, and the model appears inconsistent with a number of observations. I present a new, alternative model, wherein auxin acts to establish the adaxial-abaxial domains of the carpel primordia, which in turn ensures proper development of the final gynoecium. A second project uses genomics to identify genes that regulate fruit color by analyzing the genome sequences of Fragaria vesca, a species of wild strawberry. Shared and distinct SNPs among three F. vesca accessions were identified, providing a foundation for locating candidate mutations underlying phenotypic variation among F. vesca accessions. Through systematic analysis of relevant SNP variants, a candidate SNP in FveMYB10 was identified that may underlie fruit color in the yellow-fruited accessions; this was subsequently confirmed by functional assays. Our lab has previously generated extensive RNA-sequencing data that depict genome-scale gene expression profiles in F. vesca fruit and flower tissues at different developmental stages.
To make this dataset more accessible, the web-based eFP software was adapted to it, allowing visualization of gene expression in any tissue through user-initiated queries. Together, this thesis proposes a well-supported new model of fruit patterning in Arabidopsis and provides further resources for F. vesca, including genome-wide variant lists and the ability to visualize gene expression. This work will facilitate future studies that link traits of economic importance to specific genes and yield novel insights into fruit patterning and development.