5 results for Information Gene

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

30.00%

Abstract:

The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable of producing massive amounts of biomedical data in a single experiment. As the amount of data grows rapidly, there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially data obtained from gene expression microarray experiments. First, we studied ways to improve the quality of microarray data by replacing (imputing) missing data entries with estimated values. Missing value imputation is commonly used to make incomplete data complete, and thus easier to analyze with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing value imputation. Second, we studied the effect of missing value imputation on downstream data analysis methods such as clustering. We compared multiple recent imputation algorithms on 8 publicly available microarray data sets. Missing value imputation was observed to be a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but some data sets called for more advanced imputation methods, such as the Bayesian Principal Component Algorithm (BPCA). Finally, we studied the visualization of biological network data. Biological interaction networks are examples of the outcome of multiple biological experiments, such as those using gene microarray techniques. Such networks are typically very large and highly connected, so there is a need for fast algorithms that produce visually pleasing layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the regular force-directed graph layout algorithm.
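To make the k-NN imputation mentioned in this abstract concrete, here is a minimal sketch of the general idea, not the thesis's implementation. It assumes a genes-by-samples matrix with missing entries marked as NaN, a root-mean-square distance over the columns two genes share, and an unweighted average over the k nearest genes; the function name `knn_impute` and these choices are illustrative assumptions.

```python
import numpy as np

def knn_impute(X, k=3):
    """Fill NaN entries of X (rows = genes, columns = samples).

    Illustrative sketch: each missing entry is replaced by the mean of
    that column over the k rows closest to the incomplete row.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    for i in range(X.shape[0]):
        miss = np.isnan(X[i])
        if not miss.any():
            continue  # row is already complete
        candidates = []
        for j in range(X.shape[0]):
            if j == i:
                continue
            # columns observed in both rows, used to measure distance
            shared = ~miss & ~np.isnan(X[j])
            # a neighbour must overlap with row i and have the values row i lacks
            if shared.sum() == 0 or np.isnan(X[j][miss]).any():
                continue
            dist = np.sqrt(np.mean((X[i, shared] - X[j, shared]) ** 2))
            candidates.append((dist, j))
        if not candidates:
            continue  # no usable neighbour; leave the entries missing
        candidates.sort()
        nearest = [j for _, j in candidates[:k]]
        # average the neighbours' values in the missing columns
        filled[i, miss] = X[nearest][:, miss].mean(axis=0)
    return filled

# Toy data (made up for illustration): the second gene lacks its third sample.
toy = [[1.0, 2.0, 3.0],
       [1.0, 2.0, np.nan],
       [1.1, 2.1, 3.2],
       [10.0, 20.0, 30.0]]
completed = knn_impute(toy, k=2)
```

With k=2, the two nearest genes to the incomplete row are the first and third, so the missing value is filled with the average of their third-sample values. Practical implementations often weight neighbours by distance; that refinement is omitted here.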

Relevance:

30.00%

Abstract:

The human immune system is constantly interacting with surrounding stimuli and microorganisms. However, when directed against self or harmless antigens, these vital defense mechanisms can cause great damage. In addition, the understanding of the underlying mechanisms of several human diseases caused by aberrant immune cell functions, for instance type 1 diabetes and allergies, remains far from complete. In this Ph.D. study these questions were addressed using genome-wide transcriptomic analyses. Asthma and allergies are characterized by a hyperactive response of the T helper 2 (Th2) immune cells. In this study, the target genes of the STAT6 transcription factor in naïve human T cells were identified with RNAi for the first time. STAT6 was shown to act as a central activator of gene expression upon IL-4 signaling, with both direct and indirect effects on the Th2 cell transcriptome. The core transcription factor network induced by IL-4 was identified from a kinetic analysis of the transcriptome. Type 1 diabetes is an autoimmune disease influenced by both the genetic susceptibility of an individual and disease-triggering environmental factors. To improve understanding of the autoimmune processes driving pathogenesis in the prediabetic phase in humans, a unique series of prospective whole-blood RNA samples collected from HLA-susceptible children in the Finnish Type 1 Diabetes Prediction and Prevention (DIPP) study was analyzed. Changes in different time windows of the pathogenesis process were identified; in particular, the type 1 interferon response was activated early and throughout preclinical T1D. The hygiene hypothesis states that allergic diseases, and lately also autoimmune diseases, could be prevented by infections and other microbial contacts acquired in early childhood, or even prenatally. To study the effects of the standard of hygiene on the development of the neonatal immune system, cord blood samples from children born in Finland (high standard of living), Estonia (rapid economic growth) and Russian Karelia (low standard of living) were compared. Children born in Russian Karelia deviated from Finnish and Estonian children in many aspects of the neonatal immune system, which was developmentally more mature in Karelia, resembling that of older infants. The results of this thesis offer significant new information on the regulatory networks associated with immune-mediated diseases in humans. The results will facilitate understanding of, and further research on, the role of the identified target genes and the mechanisms driving allergic inflammation and type 1 diabetes, hopefully leading to a new era of drug development.

Relevance:

30.00%

Abstract:

The advancement of science and technology makes it clear that no single perspective is any longer sufficient to describe the true nature of any phenomenon. That is why interdisciplinary research is gaining more attention over time. An excellent example of this type of research is natural computing, which stands on the borderline between biology and computer science. The contribution of research done in natural computing is twofold: on the one hand, it sheds light on how nature works and how it processes information and, on the other hand, it provides some guidelines on how to design bio-inspired technologies. The first direction in this thesis focuses on a nature-inspired process called gene assembly in ciliates. The second one studies reaction systems, a modeling framework whose rationale is built upon the biochemical interactions happening within a cell. The process of gene assembly in ciliates has attracted a lot of attention as a research topic in the past 15 years. Two main modeling frameworks were initially proposed at the end of the 1990s to capture the ciliates’ gene assembly process, namely the intermolecular model and the intramolecular model. They were followed by other model proposals such as template-based assembly and DNA rearrangement pathways recombination models. In this thesis we are interested in a variation of the intramolecular model called the simple gene assembly model, which focuses on the simplest possible folds in the assembly process. We propose a new framework called directed overlap-inclusion (DOI) graphs to overcome the limitations that previously introduced models faced in capturing all the combinatorial details of the simple gene assembly process. We investigate a number of combinatorial properties of these graphs, including a necessary property in terms of forbidden induced subgraphs. We also introduce DOI graph-based rewriting rules that capture all the operations of the simple gene assembly model and prove that they are equivalent to the string-based formalization of the model. Reaction systems (RS) is another nature-inspired modeling framework studied in this thesis. The rationale of reaction systems is based upon two main regulation mechanisms, facilitation and inhibition, which control the interactions between biochemical reactions. Reaction systems is a complementary modeling framework to traditional quantitative frameworks, focusing on explicit cause-effect relationships between reactions. The explicit formulation of the facilitation and inhibition mechanisms behind reactions, as well as the focus on interactions between reactions (rather than on the dynamics of concentrations), makes their applicability potentially wide and useful beyond biological case studies. In this thesis, we construct a reaction system model corresponding to the heat shock response mechanism based on a novel concept of dominance graph that captures the competition for resources in the ODE model. We also introduce for RS various concepts inspired by biology, e.g., mass conservation, steady state and periodicity, in order to do model checking of reaction-system-based models. We prove that the complexity of the decision problems related to these properties varies from P to NP- and coNP-complete to PSPACE-complete. We further focus on the mass conservation relation in an RS, introduce the conservation dependency graph to capture the relation between the species, and propose an algorithm to list the conserved sets of a given reaction system.
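The facilitation and inhibition mechanisms described in this abstract can be illustrated with a small sketch. It follows the standard set-based definition of reaction systems, in which a reaction is a triple of reactants, inhibitors and products, and is enabled in a state when all reactants are present and no inhibitor is. The abstract does not spell out this formalization, so treat it as an assumption; the species `a` and `b` form a made-up toy system, not the heat shock response model of the thesis.

```python
from typing import FrozenSet, Iterable, NamedTuple

class Reaction(NamedTuple):
    reactants: FrozenSet[str]   # must all be present (facilitation)
    inhibitors: FrozenSet[str]  # must all be absent (inhibition)
    products: FrozenSet[str]

def enabled(reaction: Reaction, state: FrozenSet[str]) -> bool:
    """A reaction fires when every reactant and no inhibitor is in the state."""
    return reaction.reactants <= state and not (reaction.inhibitors & state)

def result(reactions: Iterable[Reaction], state: FrozenSet[str]) -> FrozenSet[str]:
    """Next state: the union of the products of all enabled reactions.

    Species that no enabled reaction produces vanish (no permanency).
    """
    produced = set()
    for reaction in reactions:
        if enabled(reaction, state):
            produced |= reaction.products
    return frozenset(produced)

# Hypothetical two-species system: a yields b unless b is present, and vice versa.
toy_system = [
    Reaction(frozenset({"a"}), frozenset({"b"}), frozenset({"b"})),
    Reaction(frozenset({"b"}), frozenset({"a"}), frozenset({"a"})),
]

state_0 = frozenset({"a"})
state_1 = result(toy_system, state_0)  # only the first reaction is enabled
state_2 = result(toy_system, state_1)  # only the second reaction is enabled
```

In this toy system the state alternates between {a} and {b}, a simple instance of the periodicity property the abstract mentions, while the state {a, b} disables both reactions and yields the empty state.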

Relevance:

30.00%

Abstract:

Lichens are symbiotic organisms that consist of a fungal partner and a photosynthetic partner, which can be either an alga or a cyanobacterium. In some lichen species the symbiosis is tripartite, where the relationship includes both an alga and a cyanobacterium alongside the primary symbiont, the fungus. The lichen symbiosis is an evolutionarily old adaptation to life on land, and many extant fungal species have evolved from lichenised ancestors. Lichens inhabit a wide range of habitats and are capable of living in harsh environments and on nutrient-poor substrates, such as bare rocks, often enduring frequent cycles of drying and wetting. Most lichen species are desiccation tolerant: they can survive long periods of dehydration and rapidly resume photosynthesis upon rehydration. The molecular mechanisms behind lichen desiccation tolerance are still largely uncharacterised, and little information is available for any lichen species at the genomic or transcriptomic level. The emergence of high-throughput next-generation sequencing (NGS) technologies and the subsequent decrease in the cost of sequencing new genomes and transcriptomes has enabled non-model organism research at the whole-genome level. In this doctoral work the transcriptome and genome of the grey reindeer lichen, Cladonia rangiferina, were sequenced, de novo assembled and characterised using NGS and traditional expressed sequence tag (EST) technologies. RNA extraction methods were optimised to improve the yield and quality of RNA extracted from lichen tissue. The effects of rehydration and desiccation on C. rangiferina gene expression at the whole-transcriptome level were studied and the most differentially expressed genes were identified. The secondary metabolites present in C. rangiferina decreased the quality (integrity, optical characteristics and utility for sensitive molecular biological applications) of the extracted RNA, requiring an optimised RNA extraction method for isolating sufficient quantities of high-quality RNA from lichen tissue in a time- and cost-efficient manner. The de novo assembly of the transcriptome of C. rangiferina was used to produce a set of contiguous unigene sequences that were used to investigate the biological functions and pathways active in a hydrated lichen thallus. The de novo assembly of the genome yielded an assembly containing mostly genes derived from the fungal partner. The assembly was of sufficient quality, similar in size to other lichen-forming fungal genomes, and included most of the core eukaryotic genes. Differences in gene expression were detected in all studied stages of desiccation and rehydration, but the largest changes occurred during the early stages of rehydration. The most differentially expressed genes did not have any annotations, making them potentially lichen-specific genes, but several genes known to participate in environmental stress tolerance in other organisms were also identified as differentially expressed.

Relevance:

30.00%

Abstract:

There is an increasing demand for individualized, genotype-based health advice. General population-based dietary recommendations do not always motivate people to change their lifestyle, and partly as a result, cardiovascular diseases (CVD) are a major cause of death worldwide. Using genotype-based nutrition and health information (e.g. nutrigenetics) in health education is a relatively new approach, although genetic variation is known to cause individual differences in response to dietary factors. Response to changes in dietary fat quality varies, for example, among different APOE genotypes. Research in this field is challenging, because several non-modifiable (genetics, age, sex) and modifiable (e.g. lifestyle, diet, physical activity) factors, together and in interaction, affect the risk of lifestyle-related diseases (e.g. CVD). Another challenge is the psychological factors (e.g. anxiety, threat, stress, motivation, attitude), which also affect health behavior. Genotype-based information is always a very sensitive topic, because it can also cause negative consequences and feelings (e.g. depression, increased anxiety). The aim of this series of studies was firstly to study how individual, genotype-based health information affects an individual’s health from three aspects, and secondly whether this could be one method in the future to prevent lifestyle-related diseases, such as CVD. The first study concentrated on the psychological effects, the second on health behavior effects, and the third on clinical effects. The fourth study focused on all three aspects and their associations with each other. The genetic risk and health information concerned the APOE gene and its effects on CVD. To study the effect of APOE genotype-based health information in the prevention of CVD, a total of 151 volunteers attended the baseline assessments (T0), of whom 122 healthy adults (aged 20-67 years) met the inclusion criteria and started the one-year intervention. The participants (n = 122) were randomized into a control group (n = 61) and an intervention group (n = 61). There were 21 participants in the intervention ε4+ group (including APOE genotypes 3/4 and 4/4) and 40 participants in the intervention ε4- group (including APOE genotypes 2/3 and 3/3). The control group included 61 participants (including APOE genotypes 3/4, 4/4, 2/3, 3/3 and 2/2). The baseline (T0) and follow-up assessments (T1, T2, T3) included detailed measurements of psychological factors (threat and anxiety experience, stage of change), behavioral factors (dietary fat quality; consumption of vegetables, high-fat/sugar foods and alcohol; physical activity; and health and taste attitudes) and clinical factors (total, LDL and HDL cholesterol, triglycerides, blood pressure, blood glucose (0 h and 2 h), body mass index, waist circumference and body fat percentage). During the intervention six different communication sessions (lectures on healthy lifestyle and nutrigenomics, health messages by mail, and personal discussion with the doctor) were arranged. The intervention groups (ε4+ and ε4-) received their APOE genotype information and health message at the beginning of the intervention. The control group received their APOE genotype information after the intervention. For the analyses in this dissertation, the results for 106 (studies III-IV) or 107 (studies I-II) participants were analyzed. In the intervention, there were 16 participants in the high-risk (ε4+) group and 35 in the low-risk (ε4-) group. The control group had 55 participants in studies III-IV and 56 participants in studies I-II. The intervention had both short-term (≤ 6 months) and long-term (12 months) effects on health behavior and clinical factors. The short-term effects were found in dietary fat quality and waist circumference. After the personal, genotype-based health information was given, dietary fat quality improved more in the ε4+ group than in the ε4- and control groups, and waist circumference decreased more in the ε4+ group than in the control group. Both these changes differed significantly between the ε4+ and control groups (p<0.05). A long-term effect was found in triglyceride values (p<0.05), which decreased more in the ε4+ group than in the control group during the intervention. Short-term effects were also found in the threat experience, which increased mostly in the ε4+ group after the genetic feedback (p<0.05); it decreased after 12 months, although it remained at a higher level than at baseline (T0). In addition, Study IV found that changes in the psychological factors (anxiety and threat experience, motivation), health and taste attitudes, and health behaviors (diet, alcohol consumption, and physical activity) did not directly explain the changes in triglyceride values and waist circumference. However, change caused by a threat experience may have affected the change in triglycerides through total and HDL cholesterol. In conclusion, this dissertation study has given some indications that individual, genotype-based health information could be one potential option in the future to prevent lifestyle-related diseases in public health care. The results of this study imply that personal genetic information, based on APOE, may have positive effects on dietary fat quality and some cardiovascular risk markers (e.g., improvement in triglyceride values and waist circumference). This study also suggests that psychological factors (e.g. anxiety and threat experience) may not be an obstacle for healthy people to use genotype-based health information to promote healthy lifestyles. However, even in the case of very personal health information, in order to achieve a permanent health behavior change, it is important to include attitudes and other psychological factors (e.g. motivation), as well as intensive repetition and a longer intervention duration. This research will serve as a basis for future studies, and its information can be used to develop targeted interventions, including health information based on genotyping, that would aim at preventing lifestyle diseases. People’s interest in personalized health advice has increased, while the costs of genetic screening have decreased. Therefore, generally speaking, it can be assumed that genetic screening as a part of the prevention of lifestyle-related diseases may become more common in the future. Consequently, more research is required on how to make genetic screening a practical tool in public health care, and on how to efficiently achieve long-term changes.