918 results for High Throughput
Abstract:
Dune ecosystems perform several essential ecological functions, such as protecting the coastline through their capacity to buffer storm winds and waves. Dunes also play a role in water filtration, groundwater recharge, and the maintenance of biodiversity, and they hold cultural, recreational, and tourist appeal. Dune environments are highly dynamic and include several stages of plant succession, from the bare sand beach to the foredune stabilized by American beachgrass (Ammophila breviligulata), which in turn allows other herbaceous plants, shrubs and, eventually, trees to establish. The survival of these plants is closely tied to soil microorganisms. Soil fungi interact closely with plant roots, modify soil structure, and contribute to the decomposition of organic matter and to nutrient availability. They are therefore key players in soil ecology and contribute to dune stabilization. Despite this, the diversity and structure of fungal communities, as well as the mechanisms influencing their ecological dynamics, remain relatively poorly understood. The work presented in this thesis explores the diversity of fungal communities across the gradient of succession and edaphic conditions of a coastal dune ecosystem in order to improve understanding of soil dynamics in dune environments. Extensive field data were collected on a relict dune plain in the Îles de la Madeleine, Quebec. I sampled more than 80 sites distributed across the entire dune system and characterized the soil fungi using high-throughput sequencing. First, I produced an overall portrait of the soil fungal communities across the different dune zones. 
In addition to a taxonomic description, fungal lifestyles were predicted to better understand how variation in soil fungal communities may translate into functional changes. I observed a high level of fungal diversity (more than 3,400 operational taxonomic units in total) and taxonomically and functionally distinct communities across a gradient of succession and edaphic conditions. These results also indicated that all dune zones, including the pioneer zone, support diverse fungal communities. Next, the link between plant and fungal communities was studied across the entire dune sequence. These results showed a clear increase in plant species richness, as well as an increase in the diversity of nutrient-acquisition strategies (belowground traits related to plant nutrition: arbuscular mycorrhizal, ectomycorrhizal, ericoid mycorrhizal, nitrogen-fixing, or non-specialized). I was also able to establish a strong correlation between soil fungi and vegetation, both of which appear to respond similarly to the physicochemical conditions of the soil. Soil pH strongly influenced plant and fungal communities. The observed link between plant and fungal communities highlights the importance of positive biotic interactions during succession in nutrient-poor environments. Finally, I compared the ectomycorrhizal fungal communities associated with the main tree species in the dune forests. I observed substantial richness, with a total of 200 ectomycorrhizal operational taxonomic units, belonging mainly to the Agaricomycetes. 
A network analysis did not detect any modules (that is, subgroups of interacting species), which indicates a low level of specificity in ectomycorrhizal associations. Moreover, I did not observe differences in richness or community structure among the four host species. In conclusion, across the dune succession I observed diverse communities and distinct structures depending on dune zone, in both fungi and plants. Succession nevertheless appears less pronounced in fungal communities than in the patterns observed in plants. These results prompted a reflection on the potential and prospects, but also the limitations, of approaches based on high-throughput sequencing in microbial ecology.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer.
Abstract:
The Arctic has warmed rapidly, and there is an urgent need to anticipate the effects this could have on the protists at the base of the food chain. Arctic Ocean phytoplankton include pico- and nano-eukaryotes (0.45-10 μm cell diameter), several of which are ecotypes found only in the Arctic, whereas others are introduced from more southerly oceans. As temperate ocean waters penetrate the Arctic, it becomes relevant to ask whether these microbial communities could be altered. The Svalbard archipelago is an ideal region in which to observe the biogeography of microbial communities under the influence of polar and temperate processes. Although geographically close to one another, the coastal regions surrounding Svalbard are subject to alternating intrusions of Arctic and Atlantic water masses, in addition to local conditions. Eight sites were sampled in July 2013 to identify protists along a gradient of depth and water masses around the archipelago. In addition to standard oceanographic variables, water was sampled to build amplicon libraries targeting 18S SSU rRNA and its gene, which were then sequenced at high throughput. Five of the study sites had low chlorophyll concentrations, with post-bloom community compositions dominated by dinoflagellates, ciliates, putative parasitic alveolates, chlorophyceans and prymnesiophytes. Water-mass intrusion and local environmental conditions were correlated with community structure, with water-mass origin contributing most to the phylogenetic distance among microbial communities. In three fjords, high chlorophyll concentrations suggested bloom activity. 
One fjord was dominated by Phaeocystis, a second by an Arctic clade identified as a member of the Pelagophyceae, and a third by a mixed assemblage. Overall, a strong signal of Arctic-linked ecotypes predominated around Svalbard.
Abstract:
Objective: The study was designed to validate use of electronic health records (EHRs) for diagnosing bipolar disorder and classifying control subjects. Method: EHR data were obtained from a health care system of more than 4.6 million patients spanning more than 20 years. Experienced clinicians reviewed charts to identify text features and coded data consistent or inconsistent with a diagnosis of bipolar disorder. Natural language processing was used to train a diagnostic algorithm with 95% specificity for classifying bipolar disorder. Filtered coded data were used to derive three additional classification rules for case subjects and one for control subjects. The positive predictive value (PPV) of EHR-based bipolar disorder and subphenotype diagnoses was calculated against diagnoses from direct semi-structured interviews of 190 patients by trained clinicians blind to EHR diagnosis. Results: The PPV of bipolar disorder defined by natural language processing was 0.85. Coded classification based on strict filtering achieved a value of 0.79, but classifications based on less stringent criteria performed less well. No EHR-classified control subject received a diagnosis of bipolar disorder on the basis of direct interview (PPV=1.0). For most subphenotypes, values exceeded 0.80. The EHR-based classifications were used to accrue 4,500 bipolar disorder cases and 5,000 controls for genetic analyses. Conclusions: Semiautomated mining of EHRs can be used to ascertain bipolar disorder patients and control subjects with high specificity and predictive value compared with diagnostic interviews. EHRs provide a powerful resource for high-throughput phenotyping for genetic and clinical research.
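The positive predictive value reported in this abstract is the fraction of EHR-based classifications that the direct interview confirmed. A minimal sketch of that calculation in Python follows. The function name and the counts are hypothetical and chosen only for illustration; the abstract reports the resulting values, not the underlying counts.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return TP / (TP + FP), the share of positive classifications that are correct."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("PPV is undefined when nothing was classified as positive")
    return true_positives / total


# Hypothetical counts for illustration only; they are not taken from the study.
confirmed_by_interview = 85
not_confirmed = 15
ppv = positive_predictive_value(confirmed_by_interview, not_confirmed)
print(f"PPV = {ppv:.2f}")
```

A PPV of 1.0, as reported for the control classification, corresponds to zero false positives among the subjects interviewed.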
Microbial diversity associated with the methane cycle in subarctic permafrost thaw ponds
Abstract:
The thawing and collapse of ice-rich permafrost in subarctic Quebec have led to the formation of small lakes (thermokarst ponds) that emit greenhouse gases such as carbon dioxide and methane into the atmosphere. Yet the composition of the microbial community that underlies biogeochemical processes in thaw ponds has been little studied, particularly with regard to the diversity and activity of the microorganisms involved in the methane cycle. The objective of this thesis is therefore to study the phylogenetic and functional diversity of microorganisms in subarctic thaw ponds in relation to environmental characteristics and methane emissions. To this end, about ten ponds were sampled in four valleys located across a permafrost thaw gradient and differing in their physicochemical properties. Depending on the valley, the ponds originate from the thawing of palsas (organic-dominated peat mounds) or lithalsas (mineral-dominated soil mounds), which influences the nature of the organic carbon available for microbial remineralization. During summer, the ponds were strongly stratified: there was a steep physicochemical gradient within the water column, with an oxic upper layer and an oxygen-poor or anoxic deep layer. To identify the factors that influence the microbial communities, high-throughput sequencing techniques were used, targeting transcripts of the 16S rRNA genes and of genes involved in the methane cycle: mcrA for methanogenesis and pmoA for methanotrophy. To assess microbial activity, the concentration of functional gene transcripts was also measured by quantitative PCR (qPCR). 
The results show a strong dominance of microorganisms involved in the methane cycle, namely methanogenic archaea and methanotrophic bacteria. Analysis of the pmoA gene indicates that methanotrophic bacteria were active not only at the surface but also at the bottom of the pond, where oxygen concentrations were minimal; this is unexpected given their need for oxygen to consume methane. In general, the composition of the microbial communities was influenced mainly by pond origin (palsa or lithalsa) and less by the permafrost degradation gradient. Key environmental variables such as pH, phosphorus, and dissolved organic carbon contribute to the distinction between microbial communities in ponds derived from palsas and those derived from lithalsas. As the effects of climate warming intensify, these microbial communities will face changing conditions that are likely to alter their taxonomic composition, and their responses will probably differ by pond type. Furthermore, oxygenation conditions within the ponds will in future undergo major changes associated with shifts in the duration of the ice-melt and stratification periods. Such changes will affect the balance between methanogenesis and methanotrophy and will thus affect methane emission rates. However, the results obtained in this thesis indicate that methanogenic archaea and methanotrophic bacteria can develop strategies to survive and remain active beyond the limits of their usual oxygen conditions.
Abstract:
Through recent advances in high-throughput mass spectrometry it has become evident that post-translational N-(epsilon)-lysine-acetylation is a modification found on thousands of proteins in all cellular compartments and involved in all essential physiological processes. Many aspects of the biology of lysine-acetylation are poorly understood, including its regulation by lysine-acetyltransferases and lysine-deacetylases (KDACs). Here, the role of this modification was investigated for the small GTP-binding protein Ran, which, inter alia, is essential for the regulation of nucleocytoplasmic transport. To this end, site-specifically acetylated Ran was produced in E. coli by genetic code expansion. For five previously identified sites, Ran acetylation was tested regarding its impact on the intrinsic GTP hydrolysis rate, the assembly of export complexes (modeled in vitro with the export receptor CRM1 and the export substrate Spn1) and the interaction of Ran with its GTPase-activating protein RanGAP and with RanBP1. Overall, mild effects of Ran acetylation were observed for intrinsic and RanGAP-stimulated GTP hydrolysis rates. The interaction of active Ran with RanBP1 was negatively influenced by Ran acetylation at K159. Moreover, CRM1 bound to Ran acetylated at K37, K99 or K159 interacted more strongly with Spn1. Thus, lysine-acetylation interferes with essential aspects of Ran function. An in vitro screen was performed to identify potential Ran KDACs. The NAD+-dependent KDACs of the Sirtuin class showed activity towards two acetylation sites of Ran, K37 and K71. The specificity of Sirtuins was further analyzed based on an additional Ran acetylation site, K38. Since deacetylation of RanAcK38 was much slower than that of RanAcK37, di-acetylated RanAcK37/38 was tested next. The deacetylation rate of di-acetylated Ran was comparable to that of RanAcK37. 
Deacetylation experiments under single turnover conditions revealed that deacetylation occurs first at the K38 site in the di-acetylated RanAcK37/38 background. The ability of Sirtuins to deacetylate two adjacent AcKs was further investigated based on two proteins that had previously been found to be di-acetylated and targeted by Sirtuins, namely the tumor suppressor protein p53 and phosphoenolpyruvate carboxykinase 1 (PEPCK1). p53 was readily deacetylated at two di-acetylation sites (K372/373 and K381/382), whereas PEPCK1 was not deacetylated in vitro. Taken together, these results have important implications for the substrate specificity of Sirtuins.
Abstract:
The human brain stores, integrates, and transmits information by means of millions of neurons, interconnected by countless synapses. Though neurons communicate through chemical signaling, information is coded and conducted in the form of electrical signals. Neuroelectrophysiology focuses on the study of this type of signaling. Both intra- and extracellular approaches are used in research, but none holds as much potential for high-throughput screening and drug discovery as extracellular recordings using multielectrode arrays (MEAs). MEAs measure neuronal activity, both in vitro and in vivo. Their key advantage is the capability to record electrical activity at multiple sites simultaneously. Alzheimer’s disease (AD) is the most common neurodegenerative disease and one of the leading causes of death worldwide. It is characterized by neurofibrillary tangles and aggregates of amyloid-β (Aβ) peptides, which lead to the loss of synapses and ultimately neuronal death. Currently, there is no cure and the drugs available can only delay its progression. In vitro MEA assays enable rapid screening of neuroprotective and neurotoxic compounds. Therefore, MEA recordings are of great use in both basic and clinical AD research. The main aim of this thesis was to optimize the formation of SH-SY5Y neuronal networks on MEAs. These networks can be extremely useful for facilities that do not have access to primary neuronal cultures, and can also save resources and yield high-throughput results faster for those that do. Adhesion-mediating compounds proved to affect cell morphology, viability and the exhibition of spontaneous electrical activity. Moreover, SH-SY5Y cells were successfully differentiated and showed acute changes in neuronal function after Aβ addition. This effect on electrical signaling depended on Aβ oligomer concentration. 
The results presented here allow us to conclude that the SH-SY5Y cell line can be successfully differentiated on properly coated MEAs and used for assessing acute Aβ effects on neuronal signaling.
Abstract:
The protein folding problem has been one of the most challenging subjects in biological physics due to its complexity. Energy landscape theory based on statistical mechanics provides a thermodynamic interpretation of the protein folding process. We have been working to answer fundamental questions about protein-protein and protein-water interactions, which are very important for describing the energy landscape surface of proteins correctly. First, we present a new method for computing protein-protein interaction potentials of solvated proteins directly from small-angle X-ray scattering (SAXS) data. An ensemble of proteins was modeled by Metropolis Monte Carlo and Molecular Dynamics simulations, and the global X-ray scattering of the whole model ensemble was computed at each snapshot of the simulation. The interaction potential model was optimized iteratively with a Levenberg-Marquardt algorithm. Second, we report that terahertz spectroscopy directly probes hydration dynamics around proteins and determines the size of the dynamical hydration shell. We also present the sequence and pH dependence of the hydration shell and the effect of hydrophobicity. In addition, kinetic terahertz absorption (KITA) spectroscopy is introduced to study the refolding kinetics of ubiquitin and its mutants. KITA results are compared to small-angle X-ray scattering, tryptophan fluorescence, and circular dichroism results. We propose that KITA monitors the rearrangement of hydrogen bonding during secondary structure formation. Finally, we present the development of the automated single molecule operating system (ASMOS) for a high-throughput single molecule detector, which levitates a single protein molecule in a 10 µm diameter droplet by laser guidance. I have also performed supporting calculations and simulations with my own program codes.
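Metropolis Monte Carlo, named in this abstract, accepts or rejects each trial move according to the energy change it causes. A minimal sketch of the standard acceptance rule follows. The function name, the `kT` parameter, and the choice of Python are assumptions made for illustration; this is not the author's program code, and the abstract does not describe the move set or energy function used.

```python
import math
import random


def metropolis_accept(delta_energy: float, kT: float, rng: random.Random) -> bool:
    """Standard Metropolis criterion: accept with probability min(1, exp(-dE / kT))."""
    if delta_energy <= 0.0:
        return True  # downhill or neutral moves are always accepted
    return rng.random() < math.exp(-delta_energy / kT)


rng = random.Random(0)  # fixed seed so the run is repeatable
accepted = sum(metropolis_accept(1.0, 1.0, rng) for _ in range(10_000))
print(f"accepted {accepted} of 10000 uphill moves (expected fraction about exp(-1))")
```

Always accepting downhill moves while accepting uphill moves with a Boltzmann-weighted probability is what lets the sampled ensemble follow the Boltzmann distribution at the chosen temperature.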
Abstract:
In today's fast-paced and interconnected digital world, the data generated by an increasing number of applications is being modeled as dynamic graphs. The graph structure encodes relationships among data items, while the structural changes to the graphs as well as the continuous stream of information produced by the entities in these graphs make them dynamic in nature. Examples include social networks where users post status updates, images, videos, etc.; phone call networks where nodes may send text messages or place phone calls; road traffic networks where the traffic behavior of the road segments changes constantly, and so on. There is tremendous value in storing, managing, and analyzing such dynamic graphs and deriving meaningful insights in real time. However, a majority of the work in graph analytics assumes a static setting, and there is a lack of systematic study of the various dynamic scenarios, the complexity they impose on the analysis tasks, and the challenges in building efficient systems that can support such tasks at a large scale. In this dissertation, I design a unified streaming graph data management framework, and develop prototype systems to support increasingly complex tasks on dynamic graphs. In the first part, I focus on the management and querying of distributed graph data. I develop a hybrid replication policy that monitors the read-write frequencies of the nodes to decide dynamically what data to replicate, and whether to do eager or lazy replication in order to minimize network communication and support low-latency querying. In the second part, I study parallel execution of continuous neighborhood-driven aggregates, where each node aggregates the information generated in its neighborhood. 
I build my system around the notion of an aggregation overlay graph, a pre-compiled data structure that enables sharing of partial aggregates across different queries, and also allows partial pre-computation of the aggregates to minimize the query latencies and increase throughput. Finally, I extend the framework to support continuous detection and analysis of activity-based subgraphs, where subgraphs could be specified using both graph structure as well as activity conditions on the nodes. The query specification tasks in my system are expressed using a set of active structural primitives, which allows the query evaluator to use a set of novel optimization techniques, thereby achieving high throughput. Overall, in this dissertation, I define and investigate a set of novel tasks on dynamic graphs, design scalable optimization techniques, build prototype systems, and show the effectiveness of the proposed techniques through extensive evaluation using large-scale real and synthetic datasets.
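The idea of sharing partial aggregates across queries can be illustrated with a small sketch. The class below is a hypothetical toy written for this listing, not the aggregation overlay graph from the dissertation: it assumes a sum aggregate, assumes each neighborhood is given as a list of disjoint node groups, and caches the sum of each group so that overlapping neighborhoods reuse it.

```python
from typing import Dict, FrozenSet, Iterable, List


class SharedPartialSums:
    """Toy cache of per-group sums, reused by queries whose neighborhoods overlap."""

    def __init__(self, values: Dict[str, float]) -> None:
        self.values = dict(values)
        self.cache: Dict[FrozenSet[str], float] = {}
        self.computed = 0  # how many partial sums were actually computed

    def partial(self, group: Iterable[str]) -> float:
        key = frozenset(group)
        if key not in self.cache:
            self.cache[key] = sum(self.values[node] for node in key)
            self.computed += 1
        return self.cache[key]

    def aggregate(self, groups: List[List[str]]) -> float:
        # A neighborhood is described as a list of disjoint groups of nodes.
        return sum(self.partial(group) for group in groups)


# Hypothetical node values; nodes b and c are neighbors of both x and y.
store = SharedPartialSums({"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0})
sum_for_x = store.aggregate([["a"], ["b", "c"]])
sum_for_y = store.aggregate([["b", "c"], ["d"]])
print(sum_for_x, sum_for_y, store.computed)
```

In this toy the sum over the shared group of b and c is computed once and reused for the second query. A streaming system would additionally have to update or invalidate cached partials as node values change, which the sketch does not attempt.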
Abstract:
Puccinia psidii (Myrtle rust) is an emerging pathogen that has a wide host range in the Myrtaceae family; it continues to show an increase in geographic range and is considered to be a significant threat to Myrtaceae plants worldwide. In this study, we describe the development and validation of three novel real-time polymerase chain reaction (qPCR) assays using ribosomal DNA and β-tubulin gene sequences to detect P. psidii. All qPCR assays were able to detect P. psidii DNA extracted from urediniospores and from infected plants, including asymptomatic leaf tissues. Depending on the gene target, qPCR was able to detect down to 0.011 pg of P. psidii DNA. The best-performing qPCR assay was shown to be highly specific, repeatable, and reproducible following testing using different qPCR reagents and real-time PCR platforms in different laboratories. In addition, a duplex qPCR assay was developed to allow coamplification of the cytochrome oxidase gene from host plants for use as an internal PCR control. The best-performing qPCR assay proved to be faster and more sensitive than the previously published nested PCR assay and will be particularly useful for high-throughput testing and for detecting P. psidii at the early stages of infection, before the development of sporulating rust pustules.
Abstract:
Metabolism in an environment containing 21% oxygen carries a high risk of oxidative damage due to the formation of reactive oxygen species (ROS). Therefore, plants have evolved an antioxidant system consisting of metabolites and enzymes that either directly scavenge ROS or recycle the antioxidant metabolites. Ozone is a temporally dynamic molecule that is both naturally occurring and an environmental pollutant, and it is predicted to increase in concentration in the future as anthropogenic precursor emissions rise. It has been hypothesized that any elevation in ozone concentration will cause increased oxidative stress in plants and therefore enhanced subsequent antioxidant metabolism, but evidence for this response is variable. Along with increasing atmospheric ozone concentrations, atmospheric carbon dioxide concentration is also rising and is predicted to continue rising in the future. The effect of elevated carbon dioxide concentrations on antioxidant metabolism varies among different studies in the literature. Therefore, the question of how antioxidant metabolism will be affected in the most realistic future atmosphere, with increased carbon dioxide concentration and increased ozone concentration, has yet to be answered, and is the subject of my thesis research. First, in order to capture as much of the variability in the antioxidant system as possible, I developed a suite of high-throughput quantitative assays for a variety of antioxidant metabolites and enzymes. I optimized these assays for Glycine max (soybean), one of the most important food crops in the world. These assays provide accurate, rapid and high-throughput measures of both the general and specific antioxidant action of plant tissue extracts. Second, in a controlled environment study, I investigated how growth at either elevated carbon dioxide concentration or chronic elevated ozone concentration altered antioxidant metabolism and the ability of soybean to respond to an acute oxidative stress. 
I found that growth at chronic elevated ozone concentration increased the antioxidant capacity of leaves, but that this capacity was unchanged or only slightly increased following an acute oxidative stress, suggesting that growth at chronic elevated ozone concentration primed the antioxidant system. Growth at high carbon dioxide concentration decreased the antioxidant capacity of leaves, increased the response of the existing antioxidant enzymes to an acute oxidative stress, but dampened and delayed the transcriptional response, suggesting an entirely different regulation of the antioxidant system. Third, I tested the findings from the controlled environment study in a field setting by investigating the response of the soybean antioxidant system to growth at elevated carbon dioxide concentration, chronic elevated ozone concentration and the combination of elevated carbon dioxide concentration and elevated ozone concentration. In this study, I confirmed that growth at elevated carbon dioxide concentration decreased specific components of antioxidant metabolism in the field. I also verified that increasing ozone concentration is highly correlated with increases in the metabolic and genomic components of antioxidant metabolism, regardless of carbon dioxide concentration, but that the response to increasing ozone concentration was dampened at elevated carbon dioxide concentration. In addition, I found evidence suggesting an upregulation of respiratory metabolism at higher ozone concentration, which would supply energy and carbon for detoxification and repair of cellular damage. These results consistently support the conclusion that growth at elevated carbon dioxide concentration decreases antioxidant metabolism while growth at elevated ozone concentration increases antioxidant metabolism.
Abstract:
Cancer and cardiovascular diseases are the leading causes of death worldwide. Caused by systemic genetic and molecular disruptions in cells, these disorders are the manifestation of a profound disturbance of normal cellular homeostasis. People suffering from or at high risk for these disorders need early diagnosis and personalized therapeutic intervention. Successful implementation of such clinical measures can significantly improve global health. However, development of effective therapies is hindered by the challenges in identifying genetic and molecular determinants of the onset of diseases; and in cases where therapies already exist, the main challenge is to identify molecular determinants that drive resistance to the therapies. Due to the progress in sequencing technologies, access to large genome-wide biological datasets now extends far beyond a few experimental labs to the global research community. The unprecedented availability of the data has revolutionized the capabilities of computational researchers, enabling them to collaboratively address long-standing problems from many different perspectives. Likewise, this thesis tackles these two main public health challenges using data-driven approaches. Numerous association studies have been proposed to identify genomic variants that determine disease. However, their clinical utility remains limited due to their inability to distinguish causal variants from associated variants. In this thesis, we first propose a simple scheme that improves association studies in a supervised fashion and show its applicability in identifying genomic regulatory variants associated with hypertension. Next, we propose a coupled Bayesian regression approach, eQTeL, which leverages epigenetic data to estimate regulatory and gene interaction potential, and identifies combinations of regulatory genomic variants that explain the gene expression variance. 
On human heart data, eQTeL not only explains a significantly greater proportion of expression variance in samples, but also predicts gene expression more accurately than other methods. We demonstrate by simulation that eQTeL accurately detects causal regulatory SNPs, particularly those with small effect sizes. Using various functional data, we show that SNPs detected by eQTeL are enriched for allele-specific protein binding and histone modifications, which potentially disrupt binding of core cardiac transcription factors and are spatially proximal to their target. eQTeL SNPs capture a substantial proportion of genetic determinants of expression variance and we estimate that 58% of these SNPs are putatively causal. The challenge of identifying molecular determinants of cancer resistance could so far be addressed only through labor-intensive and costly experimental studies, and in the case of experimental drugs such studies are infeasible. Here we take a fundamentally different, data-driven approach to understand the evolving landscape of emerging resistance. We introduce a novel class of genetic interactions termed synthetic rescues (SR) in cancer, which denotes a functional interaction between two genes where a change in the activity of one vulnerable gene (which may be a target of a cancer drug) is lethal, but subsequently altered activity of its partner rescuer gene restores cell viability. Next we describe a comprehensive computational framework, termed INCISOR, for identifying SRs underlying cancer resistance. Applying INCISOR to mine The Cancer Genome Atlas (TCGA), a large collection of cancer patient data, we identified the first pan-cancer SR networks, composed of interactions common to many cancer types. We experimentally test and validate a subset of these interactions involving the master regulator gene mTOR. We find that rescuer genes become increasingly activated as breast cancer progresses, testifying to pervasive ongoing rescue processes. 
We show that SRs can be utilized to successfully predict patients' survival and response to the majority of current cancer drugs and, importantly, to predict the emergence of drug resistance from the initial tumor biopsy. Our analysis suggests a potential new strategy for enhancing the effectiveness of existing cancer therapies by targeting their rescuer genes to counteract resistance. The thesis provides statistical frameworks that can harness ever-increasing high-throughput genomic data to address challenges in determining the molecular underpinnings of hypertension, cardiovascular disease and cancer resistance. We discover novel molecular mechanistic insights that will advance progress in early disease prevention and personalized therapeutics. Our analyses shed light on the fundamental biological understanding of gene regulation and interaction, and open up exciting avenues of translational applications in risk prediction and therapeutics.
Abstract:
Dengue fever is one of the most important mosquito-borne diseases worldwide and is caused by infection with dengue virus (DENV). The disease is endemic in tropical and sub-tropical regions and its incidence has increased remarkably in the last few decades. At present, there is no antiviral or approved vaccine against the virus. Treatment of dengue patients is usually supportive, through oral or intravenous rehydration, or by blood transfusion for more severe dengue cases. DENV infection of humans and mosquitoes involves a complex interplay between the virus and host factors. This results in the regulation of numerous intracellular processes, such as signal transduction and gene transcription, which leads to progression of the disease. To understand the mechanisms underlying the disease, the study of virus and host factors is therefore essential and could lead to the identification of human proteins that modulate an essential step in the virus life cycle. Knowledge of these human proteins could lead to the discovery of potential new drug targets and disease control strategies in the future. Recent advances in high throughput screening technologies have provided researchers with molecular tools to carry out investigations on a large scale. Several studies have focused on determining the host factors involved during DENV infection in human and mosquito cells. For instance, a genome-wide RNA interference (RNAi) screen identified host factors that potentially play an important role in both DENV and West Nile virus replication (Krishnan et al. 2008). In the present study, a high-throughput yeast two-hybrid (Y2H) screen was used to identify human factors interacting with DENV non-structural proteins. From the screen, 94 potential human interactors were identified. These include proteins involved in immune signalling regulation, potassium voltage-gated channels, transcriptional regulators, protein transporters and endoplasmic reticulum-associated proteins. 
Validation of fifteen of these human interactors revealed that twelve of them interacted strongly with DENV proteins. Two proteins of particular interest were selected for further functional investigation at the molecular level. These proteins, the nuclear-associated protein BANP and the voltage-gated potassium channel Kv1.3, were both identified through their interaction with DENV NS2A. BANP is known to be involved in the NF-κB immune signalling pathway, whereas Kv1.3 is known to play an important role in regulating the passive flow of potassium ions upon changes in the cell transmembrane potential. This study also initiated the construction of an Aedes aegypti cDNA library for use with DENV proteins in a Y2H screen. However, several issues encountered during the study made the library unsuitable for protein interaction analysis. In parallel, innate immune signalling was also optimised for downstream analysis. Overall, the work presented in this thesis, in particular the Y2H screen, provides a number of human factors potentially targeted by DENV during infection. Nonetheless, more work is required to validate these proteins and determine their functional properties, and to test them with infectious DENV to establish their biological significance. In the long term, data from this study will be useful for investigating potential human factors for the development of antiviral strategies against dengue.
Abstract:
Membrane proteins, which reside in the membranes of cells, play a critical role in many important biological processes including cellular signaling, immune response, and material and energy transduction. Because of their key role in maintaining the environment within cells and facilitating intercellular interactions, understanding the function of these proteins is of tremendous medical and biochemical significance. Indeed, the malfunction of membrane proteins has been linked to numerous diseases including diabetes, cirrhosis of the liver, cystic fibrosis, cancer, Alzheimer's disease, hypertension, epilepsy, cataracts, tubulopathy, leukodystrophy, Leigh syndrome, anemia, sensorineural deafness, and hypertrophic cardiomyopathy [1-3]. However, the structure of many of these proteins and the changes in their structure that lead to disease-related malfunctions are not well understood. Additionally, at least 60% of the pharmaceuticals currently available are thought to target membrane proteins, despite the fact that their exact mode of operation is not known [4-6]. Developing a detailed understanding of the function of a protein is achieved by coupling biochemical experiments with knowledge of the structure of the protein. Currently the most common method for obtaining three-dimensional structure information is X-ray crystallography. However, no a priori methods are currently available to predict crystallization conditions for a given protein [7-14]. This limitation is currently overcome by screening a large number of possible combinations of precipitants, buffer, salt, and pH conditions to identify conditions that are conducive to crystal nucleation and growth [7,9,11,15-24]. Unfortunately, these screening efforts are often limited by difficulties associated with the quantity and purity of available protein samples. 
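The combinatorial screening described above can be sketched as a full-factorial grid, which also shows why sample quantity becomes limiting. This is a minimal illustration only: the reagent names, pH values and per-trial volume are assumptions chosen for the example, not conditions taken from the thesis, and `build_screen` and `sample_needed_ul` are hypothetical helper names.

```python
from itertools import product

# Hypothetical screen axes; none of these values come from the thesis.
precipitants = ["PEG 400", "PEG 3350", "ammonium sulfate"]
buffers = ["HEPES", "Tris", "sodium citrate"]
salts = ["NaCl", "MgCl2"]
ph_values = [5.5, 6.5, 7.5, 8.5]


def build_screen(*axes):
    """Return every combination of one value from each axis (full-factorial grid)."""
    return list(product(*axes))


def sample_needed_ul(n_conditions, volume_per_trial_nl):
    """Total protein sample consumed by the screen, in microliters."""
    return n_conditions * volume_per_trial_nl / 1000.0


screen = build_screen(precipitants, buffers, salts, ph_values)

if __name__ == "__main__":
    # 3 precipitants x 3 buffers x 2 salts x 4 pH values
    print(len(screen))  # 72
    # Assumed 100 nL of protein per trial, purely for illustration.
    print(sample_needed_ul(len(screen), 100.0))
```

Adding a single extra value to any axis multiplies the number of trials, so the sample consumed grows with the product of the axis sizes rather than their sum.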
While the two most significant bottlenecks for protein structure determination in general are (i) obtaining sufficient quantities of high quality protein samples and (ii) growing high quality protein crystals that are suitable for X-ray structure determination [7,20,21,23,25-47], membrane proteins present additional challenges. For crystallization it is necessary to extract the membrane proteins from the cellular membrane. However, this process often leads to denaturation. In fact, membrane proteins have proven to be so difficult to crystallize that of the more than 66,000 structures deposited in the Protein Data Bank [48], fewer than 1% are of membrane proteins, with even fewer determined at high resolution (< 2 Å) [4,6,49], and only a handful are of human membrane proteins [49]. A variety of strategies, including detergent solubilization [50-53] and the use of artificial membrane-like environments, have been developed to circumvent this challenge [43,53-55]. In recent years, the use of a lipidic mesophase as a medium for crystallizing membrane proteins has been demonstrated to increase success for a wide range of membrane proteins, including human receptor proteins [54,56-62]. This in meso method for membrane protein crystallization, however, is still by no means routine due to challenges related to sample preparation at sub-microliter volumes and to crystal harvesting and X-ray data collection. This dissertation presents various aspects of the development of a microfluidic platform to enable high throughput in meso membrane protein crystallization at a level beyond the capabilities of current technologies. Microfluidic platforms for protein crystallization and other lab-on-a-chip applications have been well demonstrated [9,63-66]. These integrated chips provide fine control over transport phenomena and the ability to perform high throughput analyses via highly integrated fluid networks. 
However, the development of microfluidic platforms for in meso protein crystallization required strategies to cope with extremely viscous and non-Newtonian fluids. A theoretical treatment of highly viscous fluids in microfluidic devices is presented in Chapter 3, followed in Chapter 4 by the application of these strategies to the development of a microfluidic mixer capable of preparing a mesophase sample for in meso crystallization at a scale of less than 20 nL. This approach was validated with the successful on-chip in meso crystallization of the membrane protein bacteriorhodopsin. In summary, this is the first report of a microfluidic platform capable of performing in meso crystallization on-chip, representing a 1000x reduction in the scale at which mesophase trials can be prepared. Once protein crystals have formed, they are typically harvested from the droplet in which they were grown and mounted for crystallographic analysis. Despite the high throughput automation present in nearly all other aspects of protein structure determination, the harvesting and mounting of crystals is still largely a manual process. Furthermore, during mounting the fragile protein crystals can be damaged by both physical and environmental shock. To circumvent these challenges, an X-ray transparent microfluidic device architecture was developed to couple the benefits of scale, integration, and precise fluid control with the ability to perform in situ X-ray analysis (Chapter 5). This approach was validated by crystallization and subsequent on-chip analysis of the soluble proteins lysozyme, thaumatin, and ribonuclease A, and will be extended to microfluidic platforms for in meso membrane protein crystallization. In situ X-ray analysis was shown to provide extremely high quality diffraction data, in part because the crystals are not damaged by physical handling. 
As part of the work described in this thesis, a variety of data collection strategies for in situ data analysis were also tested, including merging of small slices of data from a large number of crystals grown on a single chip, to allow for diffraction analysis at biologically relevant temperatures. While such strategies have been applied previously [57,59,61,67], they are potentially challenging when applied via traditional methods due to the need to grow and then mount a large number of crystals with minimal crystal-to-crystal variability. The integrated nature of microfluidic platforms easily enables the generation of a large number of reproducible crystallization trials. Coupled with in situ analysis capabilities, this has the potential to yield high resolution structural data at biologically relevant conditions for proteins for which only small crystals, or crystals that are adversely affected by standard cryocooling techniques, can be obtained (Chapters 5 and 6). While the main focus of protein crystallography is to obtain three-dimensional protein structures, the results of typical experiments provide only a static picture of the protein. The use of polychromatic or Laue X-ray diffraction methods enables the collection of time resolved structural information. These experiments are very sensitive to crystal quality, however, and often suffer from severe radiation damage due to the intense polychromatic X-ray beams. Here, as before, the ability to perform in situ X-ray analysis on many small protein crystals within a microfluidic crystallization platform has the potential to overcome these challenges. An automated method for collecting a "single-shot" of data from a large number of crystals was developed in collaboration with the BioCARS team at the Advanced Photon Source at Argonne National Laboratory (Chapter 6). 
The work described in this thesis shows that, even more so than for traditional structure determination efforts, the ability to grow and analyze a large number of high quality crystals is critical to enable time resolved structural studies of novel proteins. In addition to enabling X-ray crystallography experiments, the development of X-ray transparent microfluidic platforms also has tremendous potential to answer other scientific questions, such as unraveling the mechanism of in meso crystallization. For instance, the lipidic mesophases utilized during in meso membrane protein crystallization can be characterized by small angle X-ray diffraction analysis. Coupling in situ analysis with microfluidic platforms capable of preparing these difficult mesophase samples at very small volumes has tremendous potential to enable the high throughput analysis of these systems on a scale that is not reasonably achievable using conventional sample preparation strategies (Chapter 7). In collaboration with the LS-CAT team at the Advanced Photon Source, an experimental station for small angle X-ray analysis coupled with the high quality visualization capabilities needed to target specific microfluidic samples on a highly integrated chip is under development. Characterizing the phase behavior of these mesophase systems and the effects of various additives present in crystallization trials is key for developing an understanding of how in meso crystallization occurs. A long term goal of these studies is to enable the rational design of in meso crystallization experiments so as to avoid or limit the need for high throughput screening efforts. In summary, this thesis describes the development of microfluidic platforms for protein crystallization with in situ analysis capabilities. 
Coupling the ability to perform in situ analysis with the small scale, fine control, and high throughput nature of microfluidic platforms has tremendous potential to enable a new generation of crystallographic studies and facilitate the structure determination of important biological targets. The development of platforms for in meso membrane protein crystallization is particularly significant because they enable the preparation of highly viscous mixtures at a previously unachievable scale. Work in these areas is ongoing and has tremendous potential not only to improve current methods of protein crystallization and crystallography, but also to enhance our knowledge of the structure and function of proteins, which could have a significant scientific and medical impact on society as a whole. The microfluidic technology described in this thesis has the potential to significantly advance our understanding of the structure and function of membrane proteins, thereby aiding the elucidation of human biology and the development of pharmaceuticals with fewer side effects for a wide range of diseases.
References
(1) Quick, M.; Javitch, J. A. P Natl Acad Sci USA 2007, 104, 3603.
(2) Trubetskoy, V. S.; Burke, T. J. Am Lab 2005, 37, 19.
(3) Pecina, P.; Houstkova, H.; Hansikova, H.; Zeman, J.; Houstek, J. Physiol Res 2004, 53, S213.
(4) Arinaminpathy, Y.; Khurana, E.; Engelman, D. M.; Gerstein, M. B. Drug Discovery Today 2009, 14, 1130.
(5) Overington, J. P.; Al-Lazikani, B.; Hopkins, A. L. Nat Rev Drug Discov 2006, 5, 993.
(6) Dauter, Z.; Lamzin, V. S.; Wilson, K. S. Current Opinion in Structural Biology 1997, 7, 681.
(7) Hansen, C.; Quake, S. R. Current Opinion in Structural Biology 2003, 13, 538.
(8) Govada, L.; Carpenter, L.; da Fonseca, P. C. A.; Helliwell, J. R.; Rizkallah, P.; Flashman, E.; Chayen, N. E.; Redwood, C.; Squire, J. M. J Mol Biol 2008, 378, 387.
(9) Hansen, C. L.; Skordalakes, E.; Berger, J. M.; Quake, S. R. P Natl Acad Sci USA 2002, 99, 16531.
(10) Leng, J.; Salmon, J.-B. Lab Chip 2009, 9, 24.
(11) Zheng, B.; Gerdts, C. J.; Ismagilov, R. F. Current Opinion in Structural Biology 2005, 15, 548.
(12) Lorber, B.; Delucas, L. J.; Bishop, J. B. J Cryst Growth 1991, 110, 103.
(13) Talreja, S.; Perry, S. L.; Guha, S.; Bhamidi, V.; Zukoski, C. F.; Kenis, P. J. A. The Journal of Physical Chemistry B 2010, 114, 4432.
(14) Chayen, N. E. Current Opinion in Structural Biology 2004, 14, 577.
(15) He, G. W.; Bhamidi, V.; Tan, R. B. H.; Kenis, P. J. A.; Zukoski, C. F. Cryst Growth Des 2006, 6, 1175.
(16) Zheng, B.; Tice, J. D.; Roach, L. S.; Ismagilov, R. F. Angew Chem Int Edit 2004, 43, 2508.
(17) Li, L.; Mustafi, D.; Fu, Q.; Tereshko, V.; Chen, D. L. L.; Tice, J. D.; Ismagilov, R. F. P Natl Acad Sci USA 2006, 103, 19243.
(18) Song, H.; Chen, D. L.; Ismagilov, R. F. Angew Chem Int Edit 2006, 45, 7336.
(19) van der Woerd, M.; Ferree, D.; Pusey, M. Journal of Structural Biology 2003, 142, 180.
(20) Ng, J. D.; Gavira, J. A.; Garcia-Ruiz, J. M. Journal of Structural Biology 2003, 142, 218.
(21) Talreja, S.; Kenis, P. J. A.; Zukoski, C. F. Langmuir 2007, 23, 4516.
(22) Hansen, C. L.; Quake, S. R.; Berger, J. M. US, 2007.
(23) Newman, J.; Fazio, V. J.; Lawson, B.; Peat, T. S. Cryst Growth Des 2010, 10, 2785.
(24) Newman, J.; Xu, J.; Willis, M. C. Acta Crystallographica Section D 2007, 63, 826.
(25) Collingsworth, P. D.; Bray, T. L.; Christopher, G. K. J Cryst Growth 2000, 219, 283.
(26) Durbin, S. D.; Feher, G. Annu Rev Phys Chem 1996, 47, 171.
(27) Talreja, S.; Kim, D. Y.; Mirarefi, A. Y.; Zukoski, C. F.; Kenis, P. J. A. J Appl Crystallogr 2005, 38, 988.
(28) Yoshizaki, I.; Nakamura, H.; Sato, T.; Igarashi, N.; Komatsu, H.; Yoda, S. J Cryst Growth 2002, 237, 295.
(29) Anderson, M. J.; Hansen, C. L.; Quake, S. R. P Natl Acad Sci USA 2006, 103, 16746.
(30) Hansen, C. L.; Sommer, M. O. A.; Quake, S. R. P Natl Acad Sci USA 2004, 101, 14431.
(31) Lounaci, M.; Rigolet, P.; Abraham, C.; Le Berre, M.; Chen, Y. Microelectron Eng 2007, 84, 1758.
(32) Zheng, B.; Roach, L. S.; Ismagilov, R. F. J Am Chem Soc 2003, 125, 11170.
(33) Zhou, X.; Lau, L.; Lam, W. W. L.; Au, S. W. N.; Zheng, B. Anal Chem 2007.
(34) Cherezov, V.; Caffrey, M. J Appl Crystallogr 2003, 36, 1372.
(35) Qutub, Y.; Reviakine, I.; Maxwell, C.; Navarro, J.; Landau, E. M.; Vekilov, P. G. J Mol Biol 2004, 343, 1243.
(36) Rummel, G.; Hardmeyer, A.; Widmer, C.; Chiu, M. L.; Nollert, P.; Locher, K. P.; Pedruzzi, I.; Landau, E. M.; Rosenbusch, J. P. Journal of Structural Biology 1998, 121, 82.
(37) Gavira, J. A.; Toh, D.; Lopez-Jaramillo, J.; Garcia-Ruiz, J. M.; Ng, J. D. Acta Crystallogr D 2002, 58, 1147.
(38) Stevens, R. C. Current Opinion in Structural Biology 2000, 10, 558.
(39) Baker, M. Nat Methods 2010, 7, 429.
(40) McPherson, A. In Current Topics in Membranes, Volume 63; DeLucas, L., Ed.; Academic Press: 2009, p 5.
(41) Gabrielsen, M.; Gardiner, A. T.; Fromme, P.; Cogdell, R. J. In Current Topics in Membranes, Volume 63; DeLucas, L., Ed.; Academic Press: 2009, p 127.
(42) Page, R. In Methods in Molecular Biology: Structural Proteomics - High Throughput Methods; Kobe, B., Guss, M., Huber, T., Eds.; Humana Press: Totowa, NJ, 2008; Vol. 426, p 345.
(43) Caffrey, M. Ann Rev Biophys 2009, 38, 29.
(44) Doerr, A. Nat Methods 2006, 3, 244.
(45) Brostromer, E.; Nan, J.; Li, L.-F.; Su, X.-D. Biochemical and Biophysical Research Communications 2009, 386, 634.
(46) Li, G.; Chen, Q.; Li, J.; Hu, X.; Zhao, J. Anal Chem 2010, 82, 4362.
(47) Jia, Y.; Liu, X.-Y. The Journal of Physical Chemistry B 2006, 110, 6949.
(48) RCSB Protein Data Bank. http://www.rcsb.org/ (July 11, 2010).
(49) Membrane Proteins of Known 3D Structure. http://blanco.biomol.uci.edu/Membrane_Proteins_xtal.html (July 11, 2010).
(50) Michel, H. Trends Biochem Sci 1983, 8, 56.
(51) Rosenbusch, J. P. Journal of Structural Biology 1990, 104, 134.
(52) Garavito, R. M.; Picot, D. Methods 1990, 1, 57.
(53) Kulkarni, C. V. 2010; Vol. 12, p 237.
(54) Landau, E. M.; Rosenbusch, J. P. P Natl Acad Sci USA 1996, 93, 14532.
(55) Pebay-Peyroula, E.; Rummel, G.; Rosenbusch, J. P.; Landau, E. M. Science 1997, 277, 1676.
(56) Cherezov, V.; Liu, W.; Derrick, J. P.; Luan, B.; Aksimentiev, A.; Katritch, V.; Caffrey, M. Proteins: Structure, Function, and Bioinformatics 2008, 71, 24.
(57) Cherezov, V.; Rosenbaum, D. M.; Hanson, M. A.; Rasmussen, S. G. F.; Thian, F. S.; Kobilka, T. S.; Choi, H. J.; Kuhn, P.; Weis, W. I.; Kobilka, B. K.; Stevens, R. C. Science 2007, 318, 1258.
(58) Cherezov, V.; Yamashita, E.; Liu, W.; Zhalnina, M.; Cramer, W. A.; Caffrey, M. J Mol Biol 2006, 364, 716.
(59) Jaakola, V. P.; Griffith, M. T.; Hanson, M. A.; Cherezov, V.; Chien, E. Y. T.; Lane, J. R.; IJzerman, A. P.; Stevens, R. C. Science 2008, 322, 1211.
(60) Rosenbaum, D. M.; Cherezov, V.; Hanson, M. A.; Rasmussen, S. G. F.; Thian, F. S.; Kobilka, T. S.; Choi, H. J.; Yao, X. J.; Weis, W. I.; Stevens, R. C.; Kobilka, B. K. Science 2007, 318, 1266.
(61) Wacker, D.; Fenalti, G.; Brown, M. A.; Katritch, V.; Abagyan, R.; Cherezov, V.; Stevens, R. C. J Am Chem Soc 2010, 132, 11443.
(62) Höfer, N.; Aragão, D.; Caffrey, M. Biophys J 2010, 99, L23.
(63) Li, L.; Ismagilov, R. F. Ann Rev Biophys 2010.
(64) Pal, R.; Yang, M.; Lin, R.; Johnson, B. N.; Srivastava, N.; Razzacki, S. Z.; Chomistek, K. J.; Heldsinger, D. C.; Haque, R. M.; Ugaz, V. M.; Thwar, P. K.; Chen, Z.; Alfano, K.; Yim, M. B.; Krishnan, M.; Fuller, A. O.; Larson, R. G.; Burke, D. T.; Burns, M. A. Lab Chip 2005, 5, 1024.
(65) Jayashree, R. S.; Gancs, L.; Choban, E. R.; Primak, A.; Natarajan, D.; Markoski, L. J.; Kenis, P. J. A. J Am Chem Soc 2005, 127, 16758.
(66) Wootton, R. C. R.; deMello, A. J. Chem Commun 2004, 266.
(67) McPherson, A. J Appl Crystallogr 2000, 33, 397.