6 results for Bioinformatics

at Duke University


Relevance:

10.00%

Publisher:

Abstract:

This thesis focuses on the development of algorithms that will allow protein design calculations to incorporate more realistic modeling assumptions. Protein design algorithms search large sequence spaces for protein sequences that are biologically and medically useful. Better modeling could improve the chance of success in designs and expand the range of problems to which these algorithms are applied. I have developed algorithms to improve modeling of backbone flexibility (DEEPer) and of more extensive continuous flexibility in general (EPIC and LUTE). I’ve also developed algorithms to perform multistate designs, which account for effects like specificity, with provable guarantees of accuracy (COMETS), and to accommodate a wider range of energy functions in design (EPIC and LUTE).

Relevance:

10.00%

Publisher:

Abstract:

The ABL family of non-receptor tyrosine kinases, ABL1 (also known as c-ABL) and ABL2 (also known as Arg), links diverse extracellular stimuli to signaling pathways that control cell growth, survival, adhesion, migration and invasion. ABL tyrosine kinases play an oncogenic role in human leukemias. However, the role of ABL kinases in solid tumors, including in breast cancer progression and metastasis, is only beginning to emerge.

To evaluate whether ABL family kinases are involved in breast cancer development and metastasis, we first analyzed genomic data from a large-scale screen of breast cancer patients. We found that ABL kinases are up-regulated in patients with invasive breast cancer and that high expression of ABL kinases correlates with poor prognosis and early metastasis. Using xenograft mouse models combined with genetic and pharmacological approaches, we demonstrated that ABL kinases are required for breast cancer progression and metastasis to the bone. Using next-generation sequencing and bioinformatics analysis, we uncovered a critical role for ABL kinases in promoting multiple oncogenic pathways, including TAZ and STAT5 signaling networks and the epithelial-to-mesenchymal transition (EMT). These findings reveal a role for ABL kinases in regulating breast cancer tumorigenesis and bone metastasis and provide a rationale for targeting breast tumors with ABL-specific inhibitors.

Relevance:

10.00%

Publisher:

Abstract:

Constant advances in technology have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true when analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantification technology, can be continuous numbers or counts. With the advancement of high-throughput technology, such data have become unprecedentedly abundant. Therefore, efficient statistical approaches are crucial in this big data era.
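
To make the categorical view of DNA sequence data concrete, here is a minimal Python sketch that encodes each nucleotide as one of four categories using indicator (one-hot) vectors. The helper name `one_hot_encode` and the category order A, C, G, T are assumptions chosen for illustration; the dissertation abstract does not specify an encoding.

```python
# Hypothetical illustration, not taken from the dissertation:
# treat each nucleotide as a categorical variable with four levels.
NUCLEOTIDES = "ACGT"  # assumed category order


def one_hot_encode(sequence):
    """Return one 4-element indicator row per nucleotide (order A, C, G, T)."""
    rows = []
    for base in sequence.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected symbol: {base!r}")
        # Exactly one entry is 1: the position of this base's category.
        rows.append([1 if base == n else 0 for n in NUCLEOTIDES])
    return rows


encoded = one_hot_encode("GATC")
for base, row in zip("GATC", encoded):
    print(base, row)
```

Gene expression counts, by contrast, would be stored as plain non-negative integers, which is one reason a single model has to handle heterogeneous data types.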

Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, a factor analysis model assumes a latent multivariate vector with a Gaussian distribution. With this assumption, a factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, which assumes that the mixture proportions of topics are represented by a Dirichlet-distributed variable. This dissertation proposes several novel extensions to these statistical methods, developed to address challenges in big data. The new methods are applied in multiple real-world applications, including construction of condition-specific gene co-expression networks, estimation of shared topics among newsgroups, analysis of promoter sequences, analysis of political-economics risk data, and estimation of population structure from genotype data.
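
As a sketch of the low-rank covariance claim, consider the standard textbook form of the factor analysis model. The symbols below are assumptions for illustration, not notation from the dissertation: an observed p-dimensional vector x is generated from a k-dimensional latent Gaussian vector z (with k < p), a p-by-k loading matrix Λ, and noise with diagonal covariance Ψ.

```latex
\begin{aligned}
x &= \Lambda z + \varepsilon, \qquad z \sim \mathcal{N}(0, I_k), \qquad \varepsilon \sim \mathcal{N}(0, \Psi), \\
\operatorname{Cov}(x) &= \Lambda \Lambda^{\top} + \Psi .
\end{aligned}
```

The term ΛΛ^T has rank at most k, so the covariance of the observed variables is described by a low-rank part plus a diagonal noise term.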

Relevance:

10.00%

Publisher:

Abstract:

Transcription factors (TFs) control the temporal and spatial expression of target genes by interacting with DNA in a sequence-specific manner. Recent advances in high-throughput experiments that measure TF-DNA interactions in vitro and in vivo have facilitated the identification of DNA binding sites for thousands of TFs. However, it remains unclear how each individual TF achieves its specificity, especially in the case of paralogous TFs that recognize distinct target genomic sites despite sharing very similar DNA binding motifs. In my work, I used a combination of high-throughput in vitro protein-DNA binding assays and machine-learning algorithms to characterize and model the binding specificity of 11 paralogous TFs from 4 distinct structural families. My work shows that even very closely related paralogous TFs, with indistinguishable DNA binding motifs, often exhibit differential binding specificity for their genomic target sites, especially for sites with moderate binding affinity. Importantly, the differences I identify in vitro and through computational modeling help explain, at least in part, the differential in vivo genomic targeting by paralogous TFs. Future work will focus on in vivo factors that might also be important for specificity differences between paralogous TFs, such as DNA methylation, interactions with protein cofactors, or the chromatin environment. In this larger context, my work emphasizes the importance of intrinsic DNA binding specificity in the targeting of paralogous TFs to the genome.

Relevance:

10.00%

Publisher:

Abstract:

Autism spectrum disorders (ASD) are complex, heterogeneous neurodevelopmental disorders of unclear etiology, and no cure currently exists. Prior studies have demonstrated that the black and tan, brachyury (BTBR) T+ Itpr3tf/J mouse strain displays a behavioral phenotype with ASD-like features. BTBR T+ Itpr3tf/J mice (referred to simply as BTBR) display deficits in social functioning, lack of communication ability, and engagement in stereotyped behavior. Despite extensive behavioral phenotypic characterization, little is known about the genes and proteins responsible for the presentation of the ASD-like phenotype in the BTBR mouse model. In this study, we employed bioinformatics techniques to gain a wide-scale understanding of the transcriptomic and proteomic changes associated with the ASD-like phenotype in BTBR mice. We found a number of genes and proteins to be significantly altered in BTBR mice compared to C57BL/6J (B6) control mice, such as BDNF, Shank3, and ERK1, which are highly relevant to prior investigations of ASD. Furthermore, we identified distinct functional pathways altered in BTBR mice compared to B6 controls, including "axon guidance" and "regulation of actin cytoskeleton," that have previously been shown to be altered in mouse models of ASD and in some human clinical populations, and that have been suggested as possible etiological mechanisms of ASD. In addition, our wide-scale bioinformatics approach discovered several previously unidentified genes and proteins associated with the ASD phenotype in BTBR mice, such as Caskin1, suggesting that bioinformatics could be an avenue by which novel therapeutic targets for ASD are uncovered. As a result, we believe that informed use of synergistic bioinformatics applications represents an invaluable tool for elucidating the etiology of complex disorders like ASD.

Relevance:

10.00%

Publisher:

Abstract:

Quantifying the function of mammalian enhancers at the genome or population scale has been a longstanding challenge in the field of gene regulation. Studies of individual enhancers have provided anecdotal evidence on which many foundational assumptions in the field are based. Genome-scale studies have revealed that the number of sites bound by a given transcription factor far exceeds the number of genes that the factor regulates. In this dissertation we describe a new method, chromatin immune-enriched reporter assays (ChIP-reporters), and use that approach to comprehensively test the enhancer activity of genomic loci bound by the glucocorticoid receptor (GR). Integrative genomics analyses of our ChIP-reporter data revealed an unexpected mechanism of glucocorticoid (GC)-induced gene regulation. In that mechanism, only a minority of GR-bound sites act as GC-inducible enhancers. Many non-GC-inducible GR binding sites interact with GC-induced sites via chromatin looping. These interactions can increase the activity of GC-induced enhancers. Finally, we describe a method that enables the detection and characterization of the functional effects of non-coding genetic variation on enhancer activity at the population scale. Taken together, these studies yield both mechanistic and genetic evidence that informs our understanding of the effects of multiple enhancer variants on gene expression.