6 results for 7140-329
in Helda - Digital Repository of University of Helsinki
Abstract:
The focus of this study is on the statistical analysis of categorical responses whose values are dependent on each other. The most typical example of this kind of dependence arises when repeated responses have been obtained from the same study unit. For example, in Paper I, the response of interest is pneumococcal nasopharyngeal carriage (yes/no) in 329 children. For each child, carriage is measured nine times during the first 18 months of life, and thus the repeated responses on each child cannot be assumed independent of each other. In this example, the interest typically lies in the carriage prevalence and in whether different risk factors affect it. Regression analysis is the established method for studying the effects of risk factors. In order to make correct inferences from the regression model, the associations between repeated responses need to be taken into account. The analysis of repeated categorical responses typically focuses on regression modelling. However, further insights can also be gained by investigating the structure of the association. The central theme of this study is the development of joint regression and association models. The analysis of repeated, or otherwise clustered, categorical responses is computationally difficult. Likelihood-based inference is often feasible only when the number of repeated responses for each study unit is small. In Paper IV, an algorithm is presented that substantially facilitates maximum likelihood fitting, especially when the number of repeated responses increases. In addition, a notable result arising from this work is the freely available software for likelihood-based estimation of clustered categorical responses.
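As a rough illustration of why the repeated responses cannot be treated as independent, the sketch below applies the standard design-effect formula, 1 + (m - 1) * rho, for clusters of equal size m with a common within-cluster correlation rho. It is not the method of the thesis: the correlation value 0.3 is purely hypothetical, and the sketch assumes all nine measurements are available for every child.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for a mean over equally sized, exchangeable clusters."""
    return 1.0 + (cluster_size - 1) * icc


children = 329   # from the abstract
visits = 9       # measurements per child, from the abstract
icc = 0.3        # HYPOTHETICAL within-child correlation, chosen only for illustration

total = children * visits            # nominal number of responses
deff = design_effect(visits, icc)    # how much a naive variance is understated
effective_n = total / deff           # roughly equivalent number of independent responses

print(f"nominal n = {total}, design effect = {deff:.1f}, effective n = {effective_n:.0f}")
```

Under these assumptions, a naive analysis that treats all responses as independent would overstate the amount of information in the data by the design-effect factor, which is why the association between repeated responses has to enter the model.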
Abstract:
The aim of this thesis was to unravel the functional-structural characteristics of root systems of Betula pendula Roth., Picea abies (L.) Karst., and Pinus sylvestris L. in mixed boreal forest stands differing in their developmental stage and site fertility. The root systems of these species had similar structural regularities: horizontally oriented shallow roots defined the horizontal area of influence, and within this area, each species placed fine roots in the uppermost soil layers, while sinker roots defined the maximum rooting depth. The large radial spread and high ramification of coarse roots, and the high specific root length (SRL) and root length density (RLD) of fine roots, indicated the high belowground competitiveness and root plasticity of B. pendula. The smaller radial root spread and sparser branching of coarse roots, and the low SRL and RLD of fine roots, of the conifers could indicate their more conservative resource use and their high association with and dependence on ectomycorrhiza-forming fungi. The vertical fine root distributions of the species were mostly overlapping, implying the possibility of intense belowground competition for nutrients. In each species, conduits tapered and their frequency increased from distal roots to the stem, from the stem to the branches, and to leaf petioles in B. pendula. Conduit tapering was organ-specific in each species, violating the assumptions of the general vascular scaling model (WBE). This reflects the hierarchical organization of a tree and differences between organs in the relative importance of transport, safety, and mechanical demands. The applied root model was capable of depicting the mass, length and spread of the coarse roots of B. pendula and P. abies, and to a lesser extent those of P. sylvestris. The roots did not follow self-similar fractal branching, because the parameter values varied within the root systems. Model parameters indicate differences in rooting behavior, and therefore different ecophysiological adaptations, between species.
Abstract:
Gene mapping is a systematic search for genes that affect observable characteristics of an organism. In this thesis we offer computational tools to improve the efficiency of (disease) gene-mapping efforts. In the first part of the thesis we propose an efficient simulation procedure for generating realistic genetic data from isolated populations. Simulated data is useful for evaluating hypothesised gene-mapping study designs and computational analysis tools. As an example of such evaluation, we demonstrate how a population-based study design can be a powerful alternative to traditional family-based designs in association-based gene-mapping projects. In the second part of the thesis we consider the prioritisation of a (typically large) set of putative disease-associated genes acquired from an initial gene-mapping analysis. Prioritisation is necessary to be able to focus on the most promising candidates. We show how to harness current biomedical knowledge for the prioritisation task by integrating various publicly available biological databases into a weighted biological graph. We then demonstrate how to find and evaluate connections between entities, such as genes and diseases, in this unified schema using graph mining techniques. Finally, in the last part of the thesis, we define the concept of a reliable subgraph and the corresponding subgraph extraction problem. Reliable subgraphs concisely describe strong and independent connections between two given vertices in a random graph, and hence they are especially useful for visualising such connections. We propose novel algorithms for extracting reliable subgraphs from large random graphs. The efficiency and scalability of the proposed graph mining methods are backed by extensive experiments on real data. While our application focus is on genetics, the concepts and algorithms can be applied to other domains as well. We demonstrate this generality by considering coauthor graphs in addition to biological graphs in the experiments.
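The abstract does not spell out how the strength of a connection in a random graph is measured. One common choice, assumed here for illustration only, is two-terminal reliability: the probability that two vertices are connected when each edge is present independently with its own probability. The following is a minimal Monte Carlo sketch of that quantity, not the extraction algorithms proposed in the thesis; the function names and the toy gene/protein/disease graph are hypothetical.

```python
import random
from collections import deque


def connected(adj, source, target):
    """Breadth-first search: is target reachable from source in adj?"""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def estimate_reliability(edges, source, target, samples=10000, seed=0):
    """Monte Carlo estimate of the probability that source and target are connected.

    edges: iterable of (u, v, p); each undirected edge is assumed to be present
    with probability p, independently of all other edges.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Draw one realisation of the random graph.
        adj = {}
        for u, v, p in edges:
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
                adj.setdefault(v, []).append(u)
        if connected(adj, source, target):
            hits += 1
    return hits / samples


# Hypothetical toy graph: a direct link plus an indirect link via a protein.
toy_edges = [
    ("gene", "protein", 0.9),
    ("protein", "disease", 0.8),
    ("gene", "disease", 0.5),
]
estimate = estimate_reliability(toy_edges, "gene", "disease")
print(f"estimated reliability: {estimate:.3f}")
```

For this toy graph the two routes share no edges, so the exact value is 1 - (1 - 0.5)(1 - 0.9 * 0.8) = 0.86, and the estimate should land close to it. Exact computation is generally intractable on large graphs, which is why a sampling estimate is used in this sketch.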