22 resultados para Approximate Bayesian computation

em Helda - Digital Repository of University of Helsinki


Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this thesis the use of the Bayesian approach to statistical inference in fisheries stock assessment is studied. The work was conducted in collaboration of the Finnish Game and Fisheries Research Institute by using the problem of monitoring and prediction of the juvenile salmon population in the River Tornionjoki as an example application. The River Tornionjoki is the largest salmon river flowing into the Baltic Sea. This thesis tackles the issues of model formulation and model checking as well as computational problems related to Bayesian modelling in the context of fisheries stock assessment. Each article of the thesis provides a novel method either for extracting information from data obtained via a particular type of sampling system or for integrating the information about the fish stock from multiple sources in terms of a population dynamics model. Mark-recapture and removal sampling schemes and a random catch sampling method are covered for the estimation of the population size. In addition, a method for estimating the stock composition of a salmon catch based on DNA samples is also presented. For most of the articles, Markov chain Monte Carlo (MCMC) simulation has been used as a tool to approximate the posterior distribution. Problems arising from the sampling method are also briefly discussed and potential solutions for these problems are proposed. Special emphasis in the discussion is given to the philosophical foundation of the Bayesian approach in the context of fisheries stock assessment. It is argued that the role of subjective prior knowledge needed in practically all parts of a Bayesian model should be recognized and consequently fully utilised in the process of model formulation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

There exists various suggestions for building a functional and a fault-tolerant large-scale quantum computer. Topological quantum computation is a more exotic suggestion, which makes use of the properties of quasiparticles manifest only in certain two-dimensional systems. These so called anyons exhibit topological degrees of freedom, which, in principle, can be used to execute quantum computation with intrinsic fault-tolerance. This feature is the main incentive to study topological quantum computation. The objective of this thesis is to provide an accessible introduction to the theory. In this thesis one has considered the theory of anyons arising in two-dimensional quantum mechanical systems, which are described by gauge theories based on so called quantum double symmetries. The quasiparticles are shown to exhibit interactions and carry quantum numbers, which are both of topological nature. Particularly, it is found that the addition of the quantum numbers is not unique, but that the fusion of the quasiparticles is described by a non-trivial fusion algebra. It is discussed how this property can be used to encode quantum information in a manner which is intrinsically protected from decoherence and how one could, in principle, perform quantum computation by braiding the quasiparticles. As an example of the presented general discussion, the particle spectrum and the fusion algebra of an anyon model based on the gauge group S_3 are explicitly derived. The fusion algebra is found to branch into multiple proper subalgebras and the simplest one of them is chosen as a model for an illustrative demonstration. The different steps of a topological quantum computation are outlined and the computational power of the model is assessed. It turns out that the chosen model is not universal for quantum computation. However, because the objective was a demonstration of the theory with explicit calculations, none of the other more complicated fusion subalgebras were considered. Studying their applicability for quantum computation could be a topic of further research.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Advancements in the analysis techniques have led to a rapid accumulation of biological data in databases. Such data often are in the form of sequences of observations, examples including DNA sequences and amino acid sequences of proteins. The scale and quality of the data give promises of answering various biologically relevant questions in more detail than what has been possible before. For example, one may wish to identify areas in an amino acid sequence, which are important for the function of the corresponding protein, or investigate how characteristics on the level of DNA sequence affect the adaptation of a bacterial species to its environment. Many of the interesting questions are intimately associated with the understanding of the evolutionary relationships among the items under consideration. The aim of this work is to develop novel statistical models and computational techniques to meet with the challenge of deriving meaning from the increasing amounts of data. Our main concern is on modeling the evolutionary relationships based on the observed molecular data. We operate within a Bayesian statistical framework, which allows a probabilistic quantification of the uncertainties related to a particular solution. As the basis of our modeling approach we utilize a partition model, which is used to describe the structure of data by appropriately dividing the data items into clusters of related items. Generalizations and modifications of the partition model are developed and applied to various problems. Large-scale data sets provide also a computational challenge. The models used to describe the data must be realistic enough to capture the essential features of the current modeling task but, at the same time, simple enough to make it possible to carry out the inference in practice. The partition model fulfills these two requirements. The problem-specific features can be taken into account by modifying the prior probability distributions of the model parameters. The computational efficiency stems from the ability to integrate out the parameters of the partition model analytically, which enables the use of efficient stochastic search algorithms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Genetics, the science of heredity and variation in living organisms, has a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of biological sciences such as evolution and cell functioning. Currently the field of genetics is under a rapid development because of the recent advances in technologies by which molecular data can be obtained from living organisms. In order that most information from such data can be extracted, the analyses need to be carried out using statistical models that are tailored to take account of the particular genetic processes. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on the modeling of the unobserved recent ancestry of the sampled individuals (say, for tens of generations or so), which is carried out by using explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. For such a recent history, the recombination process is the major genetic force that shapes the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between the adjacent markers are known. The posterior distribution of the unobserved history of the individuals is studied conditionally on the observed marker data by using a Markov chain Monte Carlo algorithm (MCMC). The example analyses consider estimation of the population structure, relatedness structure (both at the level of whole genomes as well as at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm to create an initial state for the MCMC algorithm is given. Furthermore, the thesis includes an extension of the model for the recent genetic history to situations where also a quantitative phenotype has been measured from the contemporary individuals. In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditionally on the observed genetic and phenotypic data. In addition, the thesis contains an extension of a widely-used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together, and the allele frequencies of each pool are determined in a single genotyping.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We solve the Dynamic Ehrenfeucht-Fra\"iss\'e Game on linear orders for both players, yielding a normal form for quantifier-rank equivalence classes of linear orders in first-order logic, infinitary logic, and generalized-infinitary logics with linearly ordered clocks. We show that Scott Sentences can be manipulated quickly, classified into local information, and consistency can be decided effectively in the length of the Scott Sentence. We describe a finite set of linked automata moving continuously on a linear order. Running them on ordinals, we compute the ordinal truth predicate and compute truth in the constructible universe of set-theory. Among the corollaries are a study of semi-models as efficient database of both model-theoretic and formulaic information, and a new proof of the atomicity of the Boolean algebra of sentences consistent with the theory of linear order -- i.e., that the finitely axiomatized theories of linear order are dense.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Elucidating the mechanisms responsible for the patterns of species abundance, diversity, and distribution within and across ecological systems is a fundamental research focus in ecology. Species abundance patterns are shaped in a convoluted way by interplays between inter-/intra-specific interactions, environmental forcing, demographic stochasticity, and dispersal. Comprehensive models and suitable inferential and computational tools for teasing out these different factors are quite limited, even though such tools are critically needed to guide the implementation of management and conservation strategies, the efficacy of which rests on a realistic evaluation of the underlying mechanisms. This is even more so in the prevailing context of concerns over climate change progress and its potential impacts on ecosystems. This thesis utilized the flexible hierarchical Bayesian modelling framework in combination with the computer intensive methods known as Markov chain Monte Carlo, to develop methodologies for identifying and evaluating the factors that control the structure and dynamics of ecological communities. These methodologies were used to analyze data from a range of taxa: macro-moths (Lepidoptera), fish, crustaceans, birds, and rodents. Environmental stochasticity emerged as the most important driver of community dynamics, followed by density dependent regulation; the influence of inter-specific interactions on community-level variances was broadly minor. This thesis contributes to the understanding of the mechanisms underlying the structure and dynamics of ecological communities, by showing directly that environmental fluctuations rather than inter-specific competition dominate the dynamics of several systems. This finding emphasizes the need to better understand how species are affected by the environment and acknowledge species differences in their responses to environmental heterogeneity, if we are to effectively model and predict their dynamics (e.g. for management and conservation purposes). The thesis also proposes a model-based approach to integrating the niche and neutral perspectives on community structure and dynamics, making it possible for the relative importance of each category of factors to be evaluated in light of field data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilous sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework where the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis, and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step for understanding the overall influences of environmental and ecological perturbations on bacterial diversity. A common feature of T-RFLP and FAME data is zero-inflation, which indicates that the observation of a zero value is much more frequent than would be expected, for example, from a Poisson distribution in the discrete case, or a Gaussian distribution in the continuous case. We provided two strategies for modeling zero-inflation in the clustering framework, which were validated by both synthetic and empirical complex data sets. We show in the thesis that our model that takes into account dependencies between loci in MLST data can produce better clustering results than those methods which assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large scale data were adopted for meeting the increasing computational need. Our method that detects homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data include T-RFLP and FAME provides an initial effort for discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The module of a quadrilateral is a positive real number which divides quadrilaterals into conformal equivalence classes. This is an introductory text to the module of a quadrilateral with some historical background and some numerical aspects. This work discusses the following topics: 1. Preliminaries 2. The module of a quadrilateral 3. The Schwarz-Christoffel Mapping 4. Symmetry properties of the module 5. Computational results 6. Other numerical methods Appendices include: Numerical evaluation of the elliptic integrals of the first kind. Matlab programs and scripts and possible topics for future research. Numerical results section covers additive quadrilaterals and the module of a quadrilateral under the movement of one of its vertex.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This doctoral dissertation introduces an algorithm for constructing the most probable Bayesian network from data for small domains. The algorithm is used to show that a popular goodness criterion for the Bayesian networks has a severe sensitivity problem. The dissertation then proposes an information theoretic criterion that avoids the problem.

Relevância:

20.00% 20.00%

Publicador: