29 results for High-Throughput Nucleotide Sequencing

at Duke University


Abstract:

The advent of next-generation sequencing, now nearing a decade in age, has enabled, among other capabilities, measurement of genome-wide sequence features at unprecedented scale and resolution.

In this dissertation, I describe work to understand the genetic underpinnings of non-Hodgkin’s lymphoma through exploration of the epigenetics of its cell of origin, initial characterization and interpretation of driver mutations, and finally, a larger-scale, population-level study that incorporates mutation interpretation with clinical outcome.

In the first research chapter, I describe genomic characteristics of lymphomas through the lens of their cells of origin. Just as many other cancers, such as breast cancer or lung cancer, are categorized by their cell of origin, lymphoma subtypes can be examined in the context of their normal B cells of origin: naïve, germinal center, and post-germinal center B cells. By applying integrative analysis of the epigenetics of these normal B cells through chromatin-immunoprecipitation sequencing, we find that differences among normal B cell subtypes are reflected in the mutational landscapes of the cancers that arise from them, namely Mantle Cell Lymphoma, Burkitt Lymphoma, and Diffuse Large B-Cell Lymphoma.

In the next research chapter, I describe our first effort to understand the genetic heterogeneity of Diffuse Large B-Cell Lymphoma (DLBCL), the most common form of non-Hodgkin’s lymphoma, which affects 100,000 patients worldwide. Through whole-genome sequencing of 1 case and whole-exome sequencing of 94 cases, we characterize the most recurrent genetic features of DLBCL and lay the groundwork for a larger study.

In the last research chapter, I describe work to characterize and interpret the whole exomes of 1001 cases of DLBCL in the largest single-cancer study to date. This highly powered study enabled sub-gene, gene-level, and gene-network-level understanding of driver mutations within DLBCL. Moreover, matched genomic and clinical data enabled the connection of these driver mutations to clinical features such as treatment response or overall survival. As sequencing costs continue to drop, whole-exome sequencing will become a routine clinical assay and another diagnostic dimension alongside existing methods such as histology. However, to unlock the full utility of sequencing data, we must be able to interpret it. This study takes a first step in developing the understanding necessary to uncover the genomic signals of DLBCL hidden within its exomes. Beyond the scope of this one disease, the experimental and analytical methods can be readily applied to other cancer sequencing studies.

Thus, this dissertation leverages next-generation sequencing analysis to understand the genetic underpinnings of lymphoma, both by examining its normal cells of origin and through a large-scale study to sensitively identify recurrently mutated genes and their relationship to clinical outcome.

Abstract:

We present a novel strategy that uses high-throughput methods of isolating and mapping C. elegans mutants susceptible to pathogen infection. We show that C. elegans mutants that exhibit an enhanced pathogen accumulation (epa) phenotype can be rapidly identified and isolated using a sorting system that allows automation of the analysis, sorting, and dispensing of C. elegans by measuring fluorescent bacteria inside the animals. Furthermore, we validate the use of Amplifluor as a new single nucleotide polymorphism (SNP) mapping technique in C. elegans. We show that a set of 9 SNPs allows the linkage of C. elegans mutants to a 5-8 megabase sub-chromosomal region.

Abstract:

The quantification of protein-ligand interactions is essential for systems biology, drug discovery, and bioengineering. Ligand-induced changes in protein thermal stability provide a general, quantifiable signature of binding and may be monitored with dyes such as Sypro Orange (SO), which increase their fluorescence emission intensities upon interaction with the unfolded protein. This method is an experimentally straightforward, economical, and high-throughput approach for observing thermal melts using commonly available real-time polymerase chain reaction instrumentation. However, quantitative analysis requires careful consideration of the dye-mediated reporting mechanism and the underlying thermodynamic model. We determine affinity constants by analysis of ligand-mediated shifts in melting-temperature midpoint values. Ligand affinity is determined in a ligand titration series from shifts in free energies of stability at a common reference temperature. Thermodynamic parameters are obtained by fitting the inverse first derivative of the experimental signal reporting on thermal denaturation with equations that incorporate linear or nonlinear baseline models. We apply these methods to fit protein melts monitored with SO that exhibit prominent nonlinear post-transition baselines. SO can perturb the equilibria on which it is reporting. We analyze cases in which the ligand binds to both the native and denatured state or to the native state only, and cases in which protein:ligand stoichiometry needs to be treated explicitly.
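
The thermodynamic model mentioned in this abstract can be illustrated with a minimal sketch. It assumes a two-state unfolding transition described by the standard Gibbs-Helmholtz equation, and a single ligand that binds only the native state with free ligand in excess. The function names and parameter choices are hypothetical, and the dye baseline models and derivative fitting described in the abstract are not reproduced here.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def unfolding_free_energy(temp_k, tm_k, dh_m, dcp=0.0):
    """Two-state unfolding free energy (kJ/mol) from the Gibbs-Helmholtz equation.

    tm_k is the ligand-free melting midpoint, dh_m the unfolding enthalpy at tm_k,
    and dcp the heat-capacity change on unfolding (all assumed inputs).
    """
    return (dh_m * (1.0 - temp_k / tm_k)
            + dcp * (temp_k - tm_k - temp_k * math.log(temp_k / tm_k)))


def ligand_stabilization(temp_k, ligand_conc, kd):
    """Extra stability (kJ/mol) when one ligand binds the native state only."""
    return R * temp_k * math.log(1.0 + ligand_conc / kd)


def fraction_unfolded(temp_k, tm_k, dh_m, dcp=0.0, ligand_conc=0.0, kd=1.0):
    """Fraction of protein unfolded at temp_k for the assumed two-state model."""
    dg = unfolding_free_energy(temp_k, tm_k, dh_m, dcp)
    dg += ligand_stabilization(temp_k, ligand_conc, kd)
    k_unfold = math.exp(-dg / (R * temp_k))
    return k_unfold / (1.0 + k_unfold)
```

In this sketch, adding ligand raises the free energy of unfolding at every temperature, so the temperature at which half the protein is unfolded moves upward. That is the kind of midpoint shift the abstract uses to infer affinity.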

Abstract:

Predicting from first-principles calculations whether mixed metallic elements phase-separate or form ordered structures is a major challenge of current materials research. It can be partially addressed in cases where experiments suggest the underlying lattice is conserved, using cluster expansion (CE) and a variety of exhaustive evaluation or genetic search algorithms. Evolutionary algorithms have been recently introduced to search for stable off-lattice structures at fixed mixture compositions. The general off-lattice problem is still unsolved. We present an integrated approach of CE and high-throughput ab initio calculations (HT) applicable to the full range of compositions in binary systems where the constituent elements or the intermediate ordered structures have different lattice types. The HT method replaces the search algorithms by direct calculation of a moderate number of naturally occurring prototypes representing all crystal systems and guides CE calculations of derivative structures. This synergy achieves the precision of the CE and the guiding strengths of the HT. Its application to poorly characterized binary Hf systems, believed to be phase-separating, defines three classes of alloys where CE and HT complement each other to uncover new ordered structures.

Abstract:

The task of nanofabrication can, in principle, be divided into two separate tracks: generation and replication of the patterned features. These two tracks differ in their characteristics, requirements, and points of emphasis. Patterns are commonly generated in a serial fashion using techniques that are typically slow, which makes this process practical only for making a small number of copies. Only when combined with a rapid duplication technique does fabrication at high throughput and low cost become feasible. Nanoskiving is unique in that it can be used for both generation and duplication of patterned nanostructures.

Abstract:

Rising antibiotic resistance among Escherichia coli, the leading cause of urinary tract infections (UTIs), has placed a new focus on molecular pathogenesis studies, aiming to identify new therapeutic targets. Anti-virulence agents are attractive as chemotherapeutics to attenuate an organism during disease but not necessarily during benign commensalism, thus decreasing the stress on beneficial microbial communities and lessening the emergence of resistance. We and others have demonstrated that the K antigen capsule of E. coli is a preeminent virulence determinant during UTI and more invasive diseases. Components of assembly and export are highly conserved among the major K antigen capsular types associated with UTI-causing E. coli and are distinct from the capsule biogenesis machinery of many commensal E. coli, making these attractive therapeutic targets. We conducted a screen for anti-capsular small molecules and identified an agent designated "C7" that blocks the production of K1 and K5 capsules, unrelated polysaccharide types among the Group 2-3 capsules. Herein lies proof-of-concept that this screen may be implemented with larger chemical libraries to identify second-generation small-molecule inhibitors of capsule biogenesis. These inhibitors will lead to a better understanding of capsule biogenesis and may represent a new class of therapeutics.

Abstract:

The fundamental phenotypes of growth rate, size and morphology are the result of complex interactions between genotype and environment. We developed a high-throughput software application, WormSizer, which computes size and shape of nematodes from brightfield images. Existing methods for estimating volume either coarsely model the nematode as a cylinder or assume the worm shape or opacity is invariant. Our estimate is more robust to changes in morphology or optical density as it only assumes radial symmetry. This open source software is written as a plugin for the well-known image-processing framework Fiji/ImageJ. It may therefore be extended easily. We evaluated the technical performance of this framework, and we used it to analyze growth and shape of several canonical Caenorhabditis elegans mutants in a developmental time series. We confirm quantitatively that a Dumpy (Dpy) mutant is short and fat and that a Long (Lon) mutant is long and thin. We show that daf-2 insulin-like receptor mutants are larger than wild-type upon hatching but grow slowly, and that WormSizer can distinguish dauer larvae from normal larvae. We also show that a Small (Sma) mutant is actually smaller than wild-type at all stages of larval development. WormSizer works with Uncoordinated (Unc) and Roller (Rol) mutants as well, indicating that it can be used with mutants despite behavioral phenotypes. We used our complete data set to perform a power analysis, giving users a sense of how many images are needed to detect different effect sizes. Our analysis confirms and extends existing phenotypic characterization of well-characterized mutants, demonstrating the utility and robustness of WormSizer.
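
To make the radial-symmetry assumption concrete, here is a minimal sketch of one way a volume could be estimated from widths measured along a worm's midline. It is not WormSizer's actual implementation (which is a Fiji/ImageJ plugin). The function name, the evenly spaced sampling, and the use of conical frustums between samples are all assumptions made for illustration.

```python
import math


def volume_from_widths(widths, segment_length):
    """Estimate volume of a radially symmetric body from midline widths.

    widths: diameters measured at evenly spaced points along the midline.
    segment_length: distance along the midline between consecutive points.
    Each pair of neighbouring cross-sections is joined by a conical frustum.
    """
    radii = [w / 2.0 for w in widths]
    total = 0.0
    for r1, r2 in zip(radii, radii[1:]):
        # Frustum volume: pi * h / 3 * (r1^2 + r1*r2 + r2^2)
        total += math.pi * segment_length / 3.0 * (r1 * r1 + r1 * r2 + r2 * r2)
    return total
```

With constant widths the sum reduces to the volume of a cylinder, which is the coarse model the abstract contrasts with. Varying widths let the estimate follow the tapering of the body.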

Abstract:

Transcriptional regulation has been studied intensively in recent decades. One important aspect of this regulation is the interaction between regulatory proteins, such as transcription factors (TF) and nucleosomes, and the genome. Different high-throughput techniques have been invented to map these interactions genome-wide, including ChIP-based methods (ChIP-chip, ChIP-seq, etc.), nuclease digestion methods (DNase-seq, MNase-seq, etc.), and others. However, a single experimental technique often only provides partial and noisy information about the whole picture of protein-DNA interactions. Therefore, the overarching goal of this dissertation is to provide computational developments for jointly modeling different experimental datasets to achieve a holistic inference on the protein-DNA interaction landscape.

We first present a computational framework that can incorporate the protein binding information in MNase-seq data into a thermodynamic model of protein-DNA interaction. We use a correlation-based objective function to model the MNase-seq data and a Markov chain Monte Carlo method to maximize the function. Our results show that the inferred protein-DNA interaction landscape is concordant with the MNase-seq data and provides a mechanistic explanation for the experimentally collected MNase-seq fragments. Our framework is flexible and can easily incorporate other data sources. To demonstrate this flexibility, we use prior distributions to integrate experimentally measured protein concentrations.

We also study the ability of DNase-seq data to position nucleosomes. Traditionally, DNase-seq has only been widely used to identify DNase hypersensitive sites, which tend to be open chromatin regulatory regions devoid of nucleosomes. We reveal for the first time that DNase-seq datasets also contain substantial information about nucleosome translational positioning, and that existing DNase-seq data can be used to infer nucleosome positions with high accuracy. We develop a Bayes-factor-based nucleosome scoring method to position nucleosomes using DNase-seq data. Our approach utilizes several effective strategies to extract nucleosome positioning signals from the noisy DNase-seq data, including jointly modeling data points across the nucleosome body and explicitly modeling the quadratic and oscillatory DNase I digestion pattern on nucleosomes. We show that our DNase-seq-based nucleosome map is highly consistent with previous high-resolution maps. We also show that the oscillatory DNase I digestion pattern is useful in revealing the nucleosome rotational context around TF binding sites.

Finally, we present a state-space model (SSM) for jointly modeling different kinds of genomic data to provide an accurate view of the protein-DNA interaction landscape. We also provide an efficient expectation-maximization algorithm to learn model parameters from data. We first show in simulation studies that the SSM can effectively recover the underlying true protein binding configurations. We then apply the SSM to model real genomic data (both DNase-seq and MNase-seq data). By incrementally increasing the types of genomic data in the SSM, we show that different data types can contribute complementary information for the inference of the protein binding landscape and that the most accurate inference comes from modeling all available datasets.

This dissertation provides a foundation for future research by taking a step toward genome-wide inference of the protein-DNA interaction landscape through data integration.

Abstract:

BACKGROUND: MicroRNAs (miRNAs) are small non-coding RNAs that post-transcriptionally regulate gene expression in a variety of organisms, including insects, vertebrates, and plants. miRNAs play important roles in cell development and differentiation as well as in the cellular response to stress and infection. To date, there are limited reports of miRNA identification in mosquitoes, insects that act as essential vectors for the transmission of many human pathogens, including flaviviruses. West Nile virus (WNV) and dengue virus, members of the Flaviviridae family, are primarily transmitted by Aedes and Culex mosquitoes. Using high-throughput deep sequencing, we examined the miRNA repertoire in Ae. albopictus cells and Cx. quinquefasciatus mosquitoes.

RESULTS: We identified a total of 65 miRNAs in the Ae. albopictus C7/10 cell line and 77 miRNAs in Cx. quinquefasciatus mosquitoes, the majority of which are conserved in other insects such as Drosophila melanogaster and Anopheles gambiae. The most highly expressed miRNA in both mosquito species was miR-184, a miRNA conserved from insects to vertebrates. Several previously reported Anopheles miRNAs, including miR-1890 and miR-1891, were also found in Culex and Aedes, and appear to be restricted to mosquitoes. We identified seven novel miRNAs, arising from nine different precursors, in C7/10 cells and Cx. quinquefasciatus mosquitoes, two of which have predicted orthologs in An. gambiae. Several of these novel miRNAs reside within a ~350 nt long cluster present in both Aedes and Culex. miRNA expression was confirmed by primer extension analysis. To determine whether flavivirus infection affects miRNA expression, we infected female Culex mosquitoes with WNV. Two miRNAs, miR-92 and miR-989, showed significant changes in expression levels following WNV infection.

CONCLUSIONS: Aedes and Culex mosquitoes are important flavivirus vectors. Recent advances in both mosquito genomics and high-throughput sequencing technologies enabled us to interrogate the miRNA profile in these two species. Here, we provide evidence for over 60 conserved and seven novel mosquito miRNAs, expanding upon our current understanding of insect miRNAs. Undoubtedly, some of the miRNAs identified will have roles not only in mosquito development, but also in mediating viral infection in the mosquito host.

Abstract:

All organisms live in complex habitats that shape the course of their evolution by altering the phenotype expressed by a given genotype (a phenomenon known as phenotypic plasticity) and simultaneously by determining the evolutionary fitness of that phenotype. In some cases, phenotypic evolution may alter the environment experienced by future generations. This dissertation describes how genetic and environmental variation act synergistically to affect the evolution of glucosinolate defensive chemistry and flowering time in Boechera stricta, a wild perennial herb. I focus particularly on plant-associated microbes as a part of the plant’s environment that may alter trait evolution and in turn be affected by the evolution of those traits. In the first chapter I measure glucosinolate production and reproductive fitness of over 1,500 plants grown in common gardens in four diverse natural habitats, to describe how patterns of plasticity and natural selection intersect and may influence glucosinolate evolution. I detected extensive genetic variation for glucosinolate plasticity and determined that plasticity may aid colonization of new habitats by moving phenotypes in the same direction as natural selection. In the second chapter I conduct a greenhouse experiment to test whether naturally occurring soil microbial communities contributed to the differences in phenotype and selection that I observed in the field experiment. I found that soil microbes cause plasticity of flowering time but not glucosinolate production, and that they may contribute to natural selection on both traits; thus, non-pathogenic plant-associated microbes are an environmental feature that could shape plant evolution. In the third chapter, I combine a multi-year, multi-habitat field experiment with high-throughput amplicon sequencing to determine whether B. stricta-associated microbial communities are shaped by plant genetic variation. I found that plant genotype predicts the diversity and composition of leaf-dwelling bacterial communities, but not root-associated bacterial communities. Furthermore, patterns of host genetic control over associated bacteria were largely site-dependent, indicating an important role for genotype-by-environment interactions in microbiome assembly. Together, my results suggest that soil microbes influence the evolution of plant functional traits and, because they are sensitive to plant genetic variation, this trait evolution may alter the microbial neighborhood of future B. stricta generations. Complex patterns of plasticity, selection, and symbiosis in natural habitats may impact the evolution of glucosinolate profiles in Boechera stricta.

Abstract:

DNase I footprinting is an established assay for identifying transcription factor (TF)-DNA interactions with single-base-pair resolution. High-throughput DNase-seq assays have recently been used to detect in vivo DNase footprints across the genome. Multiple computational approaches have been developed to identify DNase-seq footprints as predictors of TF binding. However, recent studies have pointed to a substantial cleavage bias of DNase and its negative impact on the predictive performance of footprinting. To assess the potential for using DNase-seq to identify individual binding sites, we performed DNase-seq on deproteinized genomic DNA and determined sequence cleavage bias. This allowed us to build bias-corrected and TF-specific footprint models. The predictive performance of these models demonstrated that predicted footprints corresponded to high-confidence TF-DNA interactions. DNase-seq footprints were absent under a fraction of ChIP-seq peaks, which we show to be indicative of weaker binding, indirect TF-DNA interactions or possible ChIP artifacts. The modeling approach was also able to detect variation in the consensus motifs that TFs bind to. Finally, cell-type-specific footprints were detected within DNase hypersensitive sites that are present in multiple cell types, further supporting the conclusion that footprints can identify changes in TF binding that are not detectable using other strategies.
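
One common way to express sequence cleavage bias is as the ratio between how often a k-mer surrounds observed cut sites and how often it occurs in the background sequence. The sketch below illustrates that idea on a single strand. The function names, the choice of k, and the centering of the k-mer on the cut position are assumptions for illustration, not the procedure used in the study.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def cleavage_bias(sequence, cut_positions, k=6):
    """Ratio of k-mer frequency around cut sites to its background frequency.

    cut_positions are 0-based indices into sequence; the k-mer window starts
    k // 2 bases upstream of each cut. Cuts too close to either end are skipped.
    """
    half = k // 2
    background = kmer_counts(sequence, k)
    observed = Counter()
    for pos in cut_positions:
        start = pos - half
        if start >= 0 and start + k <= len(sequence):
            observed[sequence[start:start + k]] += 1
    n_obs = sum(observed.values())
    n_bg = sum(background.values())
    return {kmer: (observed[kmer] / n_obs) / (background[kmer] / n_bg)
            for kmer in observed}
```

A ratio above 1 marks a k-mer that is cut more often than its abundance predicts. Dividing observed cut counts in chromatin by such ratios is one simple form a bias correction could take.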

Abstract:

The advent of digital microfluidic lab-on-a-chip (LoC) technology offers a platform for developing diagnostic applications with the advantages of portability, reduction of the volumes of the sample and reagents, faster analysis times, increased automation, low power consumption, compatibility with mass manufacturing, and high throughput. Moreover, digital microfluidics is being applied in other areas such as airborne chemical detection, DNA sequencing by synthesis, and tissue engineering. In most diagnostic and chemical-detection applications, a key challenge is the preparation of the analyte for presentation to the on-chip detection system. Thus, in diagnostics, raw physiological samples must be introduced onto the chip and then further processed by lysing blood cells and extracting DNA. For massively parallel DNA sequencing, sample preparation can be performed off chip, but the synthesis steps must be performed in a sequential on-chip format by automated control of buffers and nucleotides to extend the read lengths of DNA fragments. In airborne particulate-sampling applications, the sample collection from an air stream must be integrated into the LoC analytical component, which requires a collection droplet to scan an exposed impacted surface after its introduction into a closed analytical section. Finally, in tissue-engineering applications, the challenge for LoC technology is to build high-resolution (less than 10 microns) 3D tissue constructs with embedded cells and growth factors by manipulating and maintaining live cells in the chip platform. This article discusses these applications and their implementation in digital-microfluidic LoC platforms. © 2007 IEEE.

Abstract:

Lymphomas comprise a diverse group of malignancies derived from immune cells. High-throughput sequencing has recently emerged as a powerful and versatile method for analysis of the cancer genome and transcriptome. As these data continue to emerge, the crucial work lies in sorting through the wealth of information to home in on the critical aspects that will give us a better understanding of biology and new insight into how to treat disease. Finding the important signals within these large data sets is one of the major challenges of next-generation sequencing.

In this dissertation, I have developed several complementary strategies to describe the genetic underpinnings of lymphomas. I begin with developing a better method for RNA sequencing that enables strand-specific total RNA sequencing and alternative splicing profiling in the same analysis. I then combine this RNA sequencing technique with whole exome sequencing to better understand the global landscape of aberrations in these diseases. Finally, I use traditional cell and molecular biology techniques to define the consequences of major genetic alterations in lymphoma.

Through this analysis, I find recurrent silencing mutations in the G alpha binding protein GNA13 and associated focal adhesion proteins. I aim to describe how loss-of-function mutations in GNA13 can be oncogenic in the context of germinal center B cell biology. Using in vitro techniques including liquid chromatography-mass spectrometry and knockdown and overexpression of genes in B cell lymphoma cell lines, I determine protein binding partners and downstream effectors of GNA13. I also develop a transgenic mouse model to study the role of GNA13 in the germinal center in vivo to determine effects of GNA13 deletion on germinal center structure and cell migration.

Thus, I have developed complementary approaches that span the spectrum from discovery to context-dependent gene models that afford a better understanding of the biological function of aberrant events and ultimately result in a better understanding of disease.

Abstract:

A large proportion of the variation in traits between individuals can be attributed to variation in the nucleotide sequence of the genome. The most commonly studied traits in human genetics are related to disease and disease susceptibility. Although scientists have identified genetic causes for over 4,000 monogenic diseases, the underlying mechanisms of many highly prevalent multifactorial inheritance disorders such as diabetes, obesity, and cardiovascular disease remain largely unknown. Identifying genetic mechanisms for complex traits has been challenging because most of the variants are located outside of protein-coding regions, and determining the effects of such non-coding variants remains difficult. In this dissertation, I evaluate the hypothesis that such non-coding variants contribute to human traits and diseases by altering the regulation of genes rather than the sequence of those genes. I focus specifically on studies to determine the functional impacts of genetic variation associated with two related complex traits: gestational hyperglycemia and fetal adiposity. At the genomic locus associated with maternal hyperglycemia, we found that genetic variation in regulatory elements altered the expression of the HKDC1 gene. Furthermore, we demonstrated that HKDC1 phosphorylates glucose in vitro and in vivo, thus demonstrating that HKDC1 is a fifth human hexokinase gene. At the fetal-adiposity-associated locus, we identified variants that likely alter VEPH1 expression in preadipocytes during differentiation. To make such studies of regulatory variation high-throughput and routine, we developed POP-STARR, a novel high-throughput reporter assay that can empirically measure the effects of regulatory variants directly from patient DNA. By combining targeted genome capture technologies with STARR-seq, we assayed thousands of haplotypes from 760 individuals in a single experiment. We subsequently used POP-STARR to identify three key features of regulatory variants: that regulatory variants typically have weak effects on gene expression; that the effects of regulatory variants are often coordinated with respect to disease risk, suggesting a general mechanism by which the weak effects can together have phenotypic impact; and that nucleotide transversions have larger impacts on enhancer activity than transitions. Together, the findings presented here demonstrate successful strategies for determining the regulatory mechanisms underlying genetic associations with human traits and diseases, and the value of doing so for driving novel biological discovery.
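
For readers unfamiliar with the distinction drawn in this abstract, a transition swaps a purine for a purine (A and G) or a pyrimidine for a pyrimidine (C and T), while a transversion swaps one class for the other. The following minimal sketch classifies a single-nucleotide substitution accordingly. The function name and error handling are illustrative assumptions, not part of the POP-STARR analysis.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref, alt):
    """Classify a single-nucleotide substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

Grouping measured variant effects by this label is one simple way to compare the two substitution classes.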