355 results for Bayesian probability
Abstract:
A fundamental problem in the analysis of structured relational data like graphs, networks, databases, and matrices is to extract a summary of the common structure underlying relations between individual entities. Relational data are typically encoded in the form of arrays; invariance to the ordering of rows and columns corresponds to exchangeable arrays. Results in probability theory due to Aldous, Hoover and Kallenberg show that exchangeable arrays can be represented in terms of a random measurable function which constitutes the natural model parameter in a Bayesian model. We obtain a flexible yet simple Bayesian nonparametric model by placing a Gaussian process prior on the parameter function. Efficient inference utilises elliptical slice sampling combined with a random sparse approximation to the Gaussian process. We demonstrate applications of the model to network data and clarify its relation to models in the literature, several of which emerge as special cases.
Abstract:
We live in an era of abundant data. This has necessitated the development of new and innovative statistical algorithms to get the most from experimental data. For example, faster algorithms make practical the analysis of larger genomic data sets, allowing us to extend the utility of cutting-edge statistical methods. We present a randomised algorithm that accelerates the clustering of time series data using the Bayesian Hierarchical Clustering (BHC) statistical method. BHC is a general method for clustering any discretely sampled time series data. In this paper we focus on a particular application to microarray gene expression data. We define and analyse the randomised algorithm, before presenting results on both synthetic and real biological data sets. We show that the randomised algorithm leads to substantial gains in speed with minimal loss in clustering quality. The randomised time series BHC algorithm is available as part of the R package BHC, which can be downloaded from Bioconductor (version 2.10 and above) via http://bioconductor.org/packages/2.10/bioc/html/BHC.html. We have also made available a set of R scripts that can be used to reproduce the analyses carried out in this paper; these are available from https://sites.google.com/site/randomisedbhc/.
Abstract:
We consider a method for approximate inference in hidden Markov models (HMMs). The method circumvents the need to evaluate conditional densities of observations given the hidden states. It may be considered an instance of Approximate Bayesian Computation (ABC) and it involves the introduction of auxiliary variables valued in the same space as the observations. The quality of the approximation may be controlled to arbitrary precision through a parameter ε > 0. We provide theoretical results which quantify, in terms of ε, the ABC error in approximation of expectations of additive functionals with respect to the smoothing distributions. Under regularity assumptions, this error is, where n is the number of time steps over which smoothing is performed. For numerical implementation, we adopt the forward-only sequential Monte Carlo (SMC) scheme of [14] and quantify the combined error from the ABC and SMC approximations. This forms some of the first quantitative results for ABC methods which jointly treat the ABC and simulation errors, with a finite number of data and simulated samples. © Taylor & Francis Group, LLC.
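The idea of replacing the observation density with simulated auxiliary observations can be illustrated with a minimal sketch. It is not the forward-only SMC smoothing scheme the abstract refers to; it is a plain ABC particle filter, and the model (a linear Gaussian HMM with coefficients 0.9 and noise scale 0.5), the function name `abc_particle_filter`, and the fallback when no particle is accepted are all assumptions made for illustration.

```python
import random


def abc_particle_filter(ys, n_particles, eps, seed=0):
    """ABC particle filter for a toy linear Gaussian HMM (hypothetical model).

    The observation density is never evaluated. Each particle simulates an
    auxiliary observation, and the particle is kept only if that simulated
    value lies within eps of the real observation.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    filter_means = []
    for y in ys:
        # Propagate hidden states through the (assumed) transition model.
        particles = [0.9 * x + rng.gauss(0.0, 0.5) for x in particles]
        # Simulate auxiliary variables in the observation space.
        pseudo_obs = [x + rng.gauss(0.0, 0.5) for x in particles]
        # ABC weight: indicator that the auxiliary variable is eps-close to y.
        weights = [1.0 if abs(u - y) <= eps else 0.0 for u in pseudo_obs]
        if sum(weights) == 0.0:
            # Pragmatic fallback (an assumption): keep all particles.
            weights = [1.0] * n_particles
        total = sum(weights)
        filter_means.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Multinomial resampling according to the ABC weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filter_means


means = abc_particle_filter([0.1, 0.3, -0.2], n_particles=500, eps=0.5)
```

Shrinking `eps` reduces the ABC bias but leaves fewer accepted particles, which is the trade-off between ABC error and simulation error that the abstract quantifies.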
Abstract:
Numerical integration is a key component of many problems in scientific computing, statistical modelling, and machine learning. Bayesian Quadrature is a model-based method for numerical integration which, relative to standard Monte Carlo methods, offers increased sample efficiency and a more robust estimate of the uncertainty in the estimated integral. We propose a novel Bayesian Quadrature approach for numerical integration when the integrand is non-negative, such as the case of computing the marginal likelihood, predictive distribution, or normalising constant of a probabilistic model. Our approach approximately marginalises the quadrature model's hyperparameters in closed form, and introduces an active learning scheme to optimally select function evaluations, as opposed to using Monte Carlo samples. We demonstrate our method on both a number of synthetic benchmarks and a real scientific problem from astronomy.
Abstract:
We report an empirical study of n-gram posterior probability confidence measures for statistical machine translation (SMT). We first describe an efficient and practical algorithm for rapidly computing n-gram posterior probabilities from large translation word lattices. These probabilities are shown to be a good predictor of whether or not the n-gram is found in human reference translations, motivating their use as a confidence measure for SMT. Comprehensive n-gram precision and word coverage measurements are presented for a variety of different language pairs, domains and conditions. We analyze the effect on reference precision of using single or multiple references, and compare the precision of posteriors computed from k-best lists to those computed over the full evidence space of the lattice. We also demonstrate improved confidence by combining multiple lattices in a multi-source translation framework. © 2012 The Author(s).
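As a rough illustration of the k-best variant mentioned above, the posterior of an n-gram can be approximated by summing the normalised scores of all hypotheses that contain it. The sketch below assumes each hypothesis comes with a log score; the function name `ngram_posteriors` and the toy hypotheses are hypothetical, and the lattice-based algorithm described in the abstract is not reproduced here.

```python
import math
from collections import defaultdict


def ngram_posteriors(kbest, n):
    """Approximate n-gram posteriors from a k-best list.

    kbest is a list of (tokens, log_score) pairs. Each hypothesis gets a
    posterior by normalising exp(log_score) over the list, and an n-gram's
    posterior is the total posterior of the hypotheses that contain it.
    """
    max_log = max(score for _, score in kbest)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(score - max_log) for _, score in kbest]
    total = sum(weights)
    posteriors = defaultdict(float)
    for (tokens, _), weight in zip(kbest, weights):
        # A set, so an n-gram repeated within one hypothesis counts once.
        ngrams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        for ngram in ngrams:
            posteriors[ngram] += weight / total
    return dict(posteriors)


# Toy 3-best list with made-up scores.
kbest = [
    ("the cat sat".split(), math.log(0.5)),
    ("the cat sits".split(), math.log(0.3)),
    ("a cat sat".split(), math.log(0.2)),
]
bigram_post = ngram_posteriors(kbest, 2)
```

In this toy list the bigram `("the", "cat")` appears in the two highest-scoring hypotheses, so it receives a higher posterior than `("a", "cat")`, which appears only in the weakest one.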
Abstract:
The ground movements induced by the construction of supported excavation systems are generally predicted by empirical/semi-empirical methods in the design stage. However, these methods cannot account for the site-specific conditions and for information that becomes available as an excavation proceeds. A Bayesian updating methodology is proposed to update the predictions of ground movements in the later stages of excavation based on recorded deformation measurements. As an application, the proposed framework is used to predict the three-dimensional deformation shapes at four incremental excavation stages of an actual supported excavation project. © 2011 Taylor & Francis Group, London.
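The general principle of updating a design-stage prediction with field measurements can be sketched with a conjugate normal model. This is only an illustration of Bayesian updating, not the methodology of the paper: the function name `update_normal`, the assumption of Gaussian measurement noise with known variance, and all numerical values are hypothetical.

```python
def update_normal(prior_mean, prior_var, measurements, noise_var):
    """Conjugate normal update of an uncertain quantity.

    The prior N(prior_mean, prior_var) stands in for a design-stage
    prediction. Each measurement is assumed to equal the true value plus
    independent Gaussian noise with known variance noise_var.
    """
    n = len(measurements)
    # Precisions (inverse variances) add under the normal-normal model.
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = post_var * (prior_mean / prior_var + sum(measurements) / noise_var)
    return post_mean, post_var


# Hypothetical numbers: predicted settlement of 30 mm, then three readings.
post_mean, post_var = update_normal(
    prior_mean=30.0, prior_var=100.0,
    measurements=[22.0, 25.0, 24.0], noise_var=16.0,
)
```

The posterior mean moves from the prior prediction toward the average of the measurements, and the posterior variance is smaller than the prior variance, which is the behaviour exploited when predictions are revised at later excavation stages.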
Abstract:
The design and construction of deep excavations in urban environments is often governed by the serviceability limit state related to the risk of damage to adjacent buildings. In current practice, the assessment of excavation-induced building damage has focused on a deterministic approach. This paper presents a component/system reliability analysis framework to assess the probability that specified threshold design criteria for multiple serviceability limit states are exceeded. A recently developed Bayesian probabilistic framework is used to update the predictions of ground movements in the later stages of excavation based on the recorded deformation measurements. An example is presented to show how the serviceability performance for excavation problems can be assessed based on the component/system reliability analysis. © 2011 ASCE.
Abstract:
The ground movements induced by the construction of supported excavation systems are generally predicted in the design stage by empirical/semi-empirical methods. However, these methods cannot account for the site-specific conditions and for information that becomes available as an excavation proceeds. A Bayesian updating methodology is proposed to update the predictions of ground movements in the later stages of excavation based on recorded deformation measurements. As an application, the proposed framework is used to predict the three-dimensional deformation shapes at four incremental excavation stages of an actual supported excavation project. Copyright © ASCE 2011.