948 resultados para BAYESIAN NETWORKS


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We use Bayesian model selection techniques to test extensions of the standard flat LambdaCDM paradigm. Dark-energy and curvature scenarios, and primordial perturbation models are considered. To that end, we calculate the Bayesian evidence in favour of each model using Population Monte Carlo (PMC), a new adaptive sampling technique which was recently applied in a cosmological context. The Bayesian evidence is immediately available from the PMC sample used for parameter estimation without further computational effort, and it comes with an associated error evaluation. Besides, it provides an unbiased estimator of the evidence after any fixed number of iterations and it is naturally parallelizable, in contrast with MCMC and nested sampling methods. By comparison with analytical predictions for simulated data, we show that our results obtained with PMC are reliable and robust. The variability in the evidence evaluation and the stability for various cases are estimated both from simulations and from data. For the cases we consider, the log-evidence is calculated with a precision of better than 0.08. Using a combined set of recent CMB, SNIa and BAO data, we find inconclusive evidence between flat LambdaCDM and simple dark-energy models. A curved Universe is moderately to strongly disfavoured with respect to a flat cosmology. Using physically well-motivated priors within the slow-roll approximation of inflation, we find a weak preference for a running spectral index. A Harrison-Zel'dovich spectrum is weakly disfavoured. With the current data, tensor modes are not detected; the large prior volume on the tensor-to-scalar ratio r results in moderate evidence in favour of r=0.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this note, we shortly survey some recent approaches on the approximation of the Bayes factor used in Bayesian hypothesis testing and in Bayesian model choice. In particular, we reassess importance sampling, harmonic mean sampling, and nested sampling from a unified perspective.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We review the literature on the combined association between lung cancer and two environmental exposures, asbestos exposure and smoking, and explore a Bayesian approach to assess evidence of interaction between the exposures. The meta-analysis combines separate indices of additive and multiplicative relationships and multivariate relative risk estimates. By making inferences on posterior probabilities we can explore both the form and strength of interaction. This analysis may be more informative than providing evidence to support one relation over another on the basis of statistical significance. Overall, we find evidence for a more than additive and less than multiplicative relation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Pathogens and pests of stored grains move through complex dynamic networks linking fields, farms, and bulk storage facilities. Human transport and other forms of dispersal link the components of this network. A network model for pathogen and pest movement through stored grain systems is a first step toward new sampling and mitigation strategies that utilize information about the network structure. An understanding of network structure can be applied to identifying the key network components for pathogen or pest movement through the system. For example, it may be useful to identify a network node, such as a local grain storage facility, through which grain from a large number of fields will be accumulated and move through the network. This node may be particularly important for sampling and mitigation. In some cases more detailed information about network structure can identify key nodes that link two large sections of the network, such that management at the key nodes will greatly reduce the risk of spread between the two sections. In addition to the spread of particular species of pathogens and pests, we also evaluate the spread of problematic subpopulations, such as subpopulations with pesticide resistance. We present an analysis of stored grain pathogen and pest networks for Australia and the United States.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The current state of the practice in Blackspot Identification (BSI) utilizes safety performance functions based on total crash counts to identify transport system sites with potentially high crash risk. This paper postulates that total crash count variation over a transport network is a result of multiple distinct crash generating processes including geometric characteristics of the road, spatial features of the surrounding environment, and driver behaviour factors. However, these multiple sources are ignored in current modelling methodologies in both trying to explain or predict crash frequencies across sites. Instead, current practice employs models that imply that a single underlying crash generating process exists. The model mis-specification may lead to correlating crashes with the incorrect sources of contributing factors (e.g. concluding a crash is predominately caused by a geometric feature when it is a behavioural issue), which may ultimately lead to inefficient use of public funds and misidentification of true blackspots. This study aims to propose a latent class model consistent with a multiple crash process theory, and to investigate the influence this model has on correctly identifying crash blackspots. We first present the theoretical and corresponding methodological approach in which a Bayesian Latent Class (BLC) model is estimated assuming that crashes arise from two distinct risk generating processes including engineering and unobserved spatial factors. The Bayesian model is used to incorporate prior information about the contribution of each underlying process to the total crash count. The methodology is applied to the state-controlled roads in Queensland, Australia and the results are compared to an Empirical Bayesian Negative Binomial (EB-NB) model. A comparison of goodness of fit measures illustrates significantly improved performance of the proposed model compared to the NB model. The detection of blackspots was also improved when compared to the EB-NB model. In addition, modelling crashes as the result of two fundamentally separate underlying processes reveals more detailed information about unobserved crash causes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Having the ability to work with complex models can be highly beneficial, but the computational cost of doing so is often large. Complex models often have intractable likelihoods, so methods that directly use the likelihood function are infeasible. In these situations, the benefits of working with likelihood-free methods become apparent. Likelihood-free methods, such as parametric Bayesian indirect likelihood that uses the likelihood of an alternative parametric auxiliary model, have been explored throughout the literature as a good alternative when the model of interest is complex. One of these methods is called the synthetic likelihood (SL), which assumes a multivariate normal approximation to the likelihood of a summary statistic of interest. This paper explores the accuracy and computational efficiency of the Bayesian version of the synthetic likelihood (BSL) approach in comparison to a competitor known as approximate Bayesian computation (ABC) and its sensitivity to its tuning parameters and assumptions. We relate BSL to pseudo-marginal methods and propose to use an alternative SL that uses an unbiased estimator of the exact working normal likelihood when the summary statistic has a multivariate normal distribution. Several applications of varying complexity are considered to illustrate the findings of this paper.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this thesis the use of the Bayesian approach to statistical inference in fisheries stock assessment is studied. The work was conducted in collaboration of the Finnish Game and Fisheries Research Institute by using the problem of monitoring and prediction of the juvenile salmon population in the River Tornionjoki as an example application. The River Tornionjoki is the largest salmon river flowing into the Baltic Sea. This thesis tackles the issues of model formulation and model checking as well as computational problems related to Bayesian modelling in the context of fisheries stock assessment. Each article of the thesis provides a novel method either for extracting information from data obtained via a particular type of sampling system or for integrating the information about the fish stock from multiple sources in terms of a population dynamics model. Mark-recapture and removal sampling schemes and a random catch sampling method are covered for the estimation of the population size. In addition, a method for estimating the stock composition of a salmon catch based on DNA samples is also presented. For most of the articles, Markov chain Monte Carlo (MCMC) simulation has been used as a tool to approximate the posterior distribution. Problems arising from the sampling method are also briefly discussed and potential solutions for these problems are proposed. Special emphasis in the discussion is given to the philosophical foundation of the Bayesian approach in the context of fisheries stock assessment. It is argued that the role of subjective prior knowledge needed in practically all parts of a Bayesian model should be recognized and consequently fully utilised in the process of model formulation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Advancements in the analysis techniques have led to a rapid accumulation of biological data in databases. Such data often are in the form of sequences of observations, examples including DNA sequences and amino acid sequences of proteins. The scale and quality of the data give promises of answering various biologically relevant questions in more detail than what has been possible before. For example, one may wish to identify areas in an amino acid sequence, which are important for the function of the corresponding protein, or investigate how characteristics on the level of DNA sequence affect the adaptation of a bacterial species to its environment. Many of the interesting questions are intimately associated with the understanding of the evolutionary relationships among the items under consideration. The aim of this work is to develop novel statistical models and computational techniques to meet with the challenge of deriving meaning from the increasing amounts of data. Our main concern is on modeling the evolutionary relationships based on the observed molecular data. We operate within a Bayesian statistical framework, which allows a probabilistic quantification of the uncertainties related to a particular solution. As the basis of our modeling approach we utilize a partition model, which is used to describe the structure of data by appropriately dividing the data items into clusters of related items. Generalizations and modifications of the partition model are developed and applied to various problems. Large-scale data sets provide also a computational challenge. The models used to describe the data must be realistic enough to capture the essential features of the current modeling task but, at the same time, simple enough to make it possible to carry out the inference in practice. The partition model fulfills these two requirements. The problem-specific features can be taken into account by modifying the prior probability distributions of the model parameters. The computational efficiency stems from the ability to integrate out the parameters of the partition model analytically, which enables the use of efficient stochastic search algorithms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Genetics, the science of heredity and variation in living organisms, has a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of biological sciences such as evolution and cell functioning. Currently the field of genetics is under a rapid development because of the recent advances in technologies by which molecular data can be obtained from living organisms. In order that most information from such data can be extracted, the analyses need to be carried out using statistical models that are tailored to take account of the particular genetic processes. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on the modeling of the unobserved recent ancestry of the sampled individuals (say, for tens of generations or so), which is carried out by using explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. For such a recent history, the recombination process is the major genetic force that shapes the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between the adjacent markers are known. The posterior distribution of the unobserved history of the individuals is studied conditionally on the observed marker data by using a Markov chain Monte Carlo algorithm (MCMC). The example analyses consider estimation of the population structure, relatedness structure (both at the level of whole genomes as well as at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm to create an initial state for the MCMC algorithm is given. Furthermore, the thesis includes an extension of the model for the recent genetic history to situations where also a quantitative phenotype has been measured from the contemporary individuals. In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditionally on the observed genetic and phenotypic data. In addition, the thesis contains an extension of a widely-used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together, and the allele frequencies of each pool are determined in a single genotyping.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Elucidating the mechanisms responsible for the patterns of species abundance, diversity, and distribution within and across ecological systems is a fundamental research focus in ecology. Species abundance patterns are shaped in a convoluted way by interplays between inter-/intra-specific interactions, environmental forcing, demographic stochasticity, and dispersal. Comprehensive models and suitable inferential and computational tools for teasing out these different factors are quite limited, even though such tools are critically needed to guide the implementation of management and conservation strategies, the efficacy of which rests on a realistic evaluation of the underlying mechanisms. This is even more so in the prevailing context of concerns over climate change progress and its potential impacts on ecosystems. This thesis utilized the flexible hierarchical Bayesian modelling framework in combination with the computer intensive methods known as Markov chain Monte Carlo, to develop methodologies for identifying and evaluating the factors that control the structure and dynamics of ecological communities. These methodologies were used to analyze data from a range of taxa: macro-moths (Lepidoptera), fish, crustaceans, birds, and rodents. Environmental stochasticity emerged as the most important driver of community dynamics, followed by density dependent regulation; the influence of inter-specific interactions on community-level variances was broadly minor. This thesis contributes to the understanding of the mechanisms underlying the structure and dynamics of ecological communities, by showing directly that environmental fluctuations rather than inter-specific competition dominate the dynamics of several systems. This finding emphasizes the need to better understand how species are affected by the environment and acknowledge species differences in their responses to environmental heterogeneity, if we are to effectively model and predict their dynamics (e.g. for management and conservation purposes). The thesis also proposes a model-based approach to integrating the niche and neutral perspectives on community structure and dynamics, making it possible for the relative importance of each category of factors to be evaluated in light of field data.