36 results for Simulation tools
in Helda - Digital Repository of University of Helsinki
Abstract:
Gene mapping is a systematic search for genes that affect observable characteristics of an organism. In this thesis we offer computational tools to improve the efficiency of (disease) gene-mapping efforts. In the first part of the thesis we propose an efficient simulation procedure for generating realistic genetical data from isolated populations. Simulated data is useful for evaluating hypothesised gene-mapping study designs and computational analysis tools. As an example of such evaluation, we demonstrate how a population-based study design can be a powerful alternative to traditional family-based designs in association-based gene-mapping projects. In the second part of the thesis we consider a prioritisation of a (typically large) set of putative disease-associated genes acquired from an initial gene-mapping analysis. Prioritisation is necessary to be able to focus on the most promising candidates. We show how to harness the current biomedical knowledge for the prioritisation task by integrating various publicly available biological databases into a weighted biological graph. We then demonstrate how to find and evaluate connections between entities, such as genes and diseases, from this unified schema by graph mining techniques. Finally, in the last part of the thesis, we define the concept of reliable subgraph and the corresponding subgraph extraction problem. Reliable subgraphs concisely describe strong and independent connections between two given vertices in a random graph, and hence they are especially useful for visualising such connections. We propose novel algorithms for extracting reliable subgraphs from large random graphs. The efficiency and scalability of the proposed graph mining methods are backed by extensive experiments on real data. While our application focus is in genetics, the concepts and algorithms can be applied to other domains as well. 
We demonstrate this generality by considering coauthor graphs in addition to biological graphs in the experiments.
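The idea of connection strength between two vertices in a random graph can be illustrated with a minimal sketch. It assumes an undirected graph in which each edge exists independently with a given probability, and it estimates by Monte Carlo sampling how often two vertices end up connected. The function name, the toy graph, and its probabilities are hypothetical, and this is not the thesis's subgraph extraction algorithm, only an illustration of the quantity such a subgraph is meant to preserve.

```python
import random

def connection_probability(edges, source, target, trials=10000, seed=0):
    """Estimate the probability that source and target are connected
    when each edge (u, v, p) exists independently with probability p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Sample one realisation of the random graph.
        adjacency = {}
        for u, v, p in edges:
            if rng.random() < p:
                adjacency.setdefault(u, []).append(v)
                adjacency.setdefault(v, []).append(u)
        # Depth-first search from source in the sampled graph.
        stack, seen = [source], {source}
        while stack:
            node = stack.pop()
            if node == target:
                hits += 1
                break
            for neighbour in adjacency.get(node, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return hits / trials

# Hypothetical toy graph: two independent two-edge paths from "gene" to "disease".
edges = [("gene", "a", 0.9), ("a", "disease", 0.9),
         ("gene", "b", 0.5), ("b", "disease", 0.5)]
estimate = connection_probability(edges, "gene", "disease")
```

For this toy graph the exact value is 1 - (1 - 0.81)(1 - 0.25) = 0.8575, so the estimate should land close to that. A subgraph that keeps only the stronger path (gene, a, disease) would retain a connection probability of 0.81, which is the kind of trade-off between size and reliability the abstract describes.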
Abstract:
XVIII IUFRO World Congress, Ljubljana 1986.
Abstract:
This thesis presents an interdisciplinary analysis of how models and simulations function in the production of scientific knowledge. The work is informed by three scholarly traditions: studies on models and simulations in philosophy of science, so-called micro-sociological laboratory studies within science and technology studies, and cultural-historical activity theory. Methodologically, I adopt a naturalist epistemology and combine philosophical analysis with a qualitative, empirical case study of infectious-disease modelling. This study has a dual perspective throughout the analysis: it specifies the modelling practices and examines the models as objects of research. The research questions addressed in this study are: 1) How are models constructed and what functions do they have in the production of scientific knowledge? 2) What is interdisciplinarity in model construction? 3) How do models become a general research tool and why is this process problematic? The core argument is that the mediating models as investigative instruments (cf. Morgan and Morrison 1999) take questions as a starting point, and hence their construction is intentionally guided. This argument applies the interrogative model of inquiry (e.g., Sintonen 2005; Hintikka 1981), which conceives of all knowledge acquisition as a process of seeking answers to questions. The first question addresses simulation models as Artificial Nature, which is manipulated in order to answer the questions that initiated the model building. This account further develops the "epistemology of simulation" (cf. Winsberg 2003) by showing the interrelatedness of researchers and their objects in the process of modelling. The second question clarifies why interdisciplinary research collaboration is demanding and difficult to maintain.
The nature of the impediments to disciplinary interaction is examined by introducing the idea of object-oriented interdisciplinarity, which provides an analytical framework to study the changes in the degree of interdisciplinarity, the tools and research practices developed to support the collaboration, and the mode of collaboration in relation to the historically mutable object of research. As my interest is in the models as interdisciplinary objects, the third research problem seeks to answer my question of how we might characterise these objects, what is typical for them, and what kind of changes happen in the process of modelling. Here I examine the tension between specified, question-oriented models and more general models, and suggest that the specified models form a group of their own. I call these Tailor-made models, in opposition to the process of building a simulation platform that aims at generalisability and utility for health policy. This tension also underlines the challenge of applying research results (or methods and tools) to discuss and solve problems in decision-making processes.
Abstract:
Forest management is facing new challenges under climate change. By adjusting thinning regimes, conventional forest management can be adapted to various objectives of utilization of forest resources, such as wood quality, forest bioenergy, and carbon sequestration. This thesis aims to develop and apply a simulation-optimization system as a tool for an interdisciplinary understanding of the interactions between wood science, forest ecology, and forest economics. In this thesis, the OptiFor software was developed for forest resources management. The OptiFor simulation-optimization system integrated the process-based growth model PipeQual, wood quality models, biomass production and carbon emission models, as well as energy wood and commercial logging models into a single optimization model. Osyczka's direct and random search algorithm was employed to identify optimal values for a set of decision variables. The numerical studies in this thesis broadened our current knowledge and understanding of the relationships between wood science, forest ecology, and forest economics. The results for timber production show that optimal thinning regimes depend on site quality and initial stand characteristics. Taking wood properties into account, our results show that increasing the intensity of thinning resulted in lower wood density and shorter fibers. The addition of nutrients accelerated volume growth, but lowered wood quality for Norway spruce. Integrating energy wood harvesting into conventional forest management showed that conventional forest management without energy wood harvesting was still superior in sparse stands of Scots pine. Energy wood from pre-commercial thinning turned out to be optimal for dense stands. When carbon balance is taken into account, our results show that changing carbon assessment methods leads to very different optimal thinning regimes and average carbon stocks.
Raising the carbon price resulted in longer rotations and a higher mean annual increment, as well as a significantly higher average carbon stock over the rotation.
Abstract:
This thesis studies human gene expression space using high-throughput gene expression data from DNA microarrays. In molecular biology, high-throughput techniques allow numerical measurements of the expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and an analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting the previously unusable and missing data and by improving access to its data. It also contributed to the creation of several new tools for microarray data manipulation and the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required the creation of a new large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text-mining and decision-tree-based method for automatic conversion of human-readable free-text microarray data annotations into a categorised format. Data comparability and the minimisation of the systematic measurement errors that are characteristic of each laboratory in this large cross-laboratory integrated dataset were ensured by computing a range of microarray data quality metrics and excluding incomparable data. The structure of a global map of human gene expression was then explored by principal component analysis and hierarchical clustering, using heuristics and help from another purpose-built sample ontology.
A preface and motivation to the construction and analysis of a global map of human gene expression is given by analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates an indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.
Abstract:
Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. The computationalization and informationalization of everyday activities increase not only our reach, efficiency and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on the one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other hand, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems as regards data protection is studied. The tools and methods in this thesis consist of three distinct contributions. An algorithm for locationing in cellular networks is proposed that does not require the location information to be revealed beyond the user's terminal. A prototyping platform for the creation of context-aware ubiquitous applications called ContextPhone is described and released as open source. Finally, a set of methodological findings for the use of smartphones in social scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allow other researchers to carry out experiments. The evaluation of the ubiquitous social awareness application ContextContacts covers both the usage of the system in general and an analysis of privacy implications. The usage of the system is analyzed in the light of how users make inferences of others based on real-time contextual cues mediated by the system, based on several long-term field studies.
The analysis of privacy implications draws together the social psychological theory of self-presentation and research in privacy for ubiquitous computing, deriving a set of design guidelines for such systems. The main findings from these studies can be summarized as follows: The fact that ubiquitous computing systems gather more data about users can be used not only to study the use of such systems in an effort to create better systems but also, more generally, to study phenomena previously unstudied, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for the users, but the self-presentation requires several thoughtful design decisions that allow the manipulation of the image mediated by the system. Finally, the growing amount of computational resources available to the users can be used to allow them to use the data themselves, rather than just being passive subjects of data gathering.