289 resultados para Approximate Bayesian Computation
Resumo:
We estimate the parameters of a stochastic process model for a macroparasite population within a host using approximate Bayesian computation (ABC). The immunity of the host is an unobserved model variable and only mature macroparasites at sacrifice of the host are counted. With very limited data, process rates are inferred reasonably precisely. Modeling involves a three variable Markov process for which the observed data likelihood is computationally intractable. ABC methods are particularly useful when the likelihood is analytically or computationally intractable. The ABC algorithm we present is based on sequential Monte Carlo, is adaptive in nature, and overcomes some drawbacks of previous approaches to ABC. The algorithm is validated on a test example involving simulated data from an autologistic model before being used to infer parameters of the Markov process model for experimental data. The fitted model explains the observed extra-binomial variation in terms of a zero-one immunity variable, which has a short-lived presence in the host.
Resumo:
We present a novel approach for developing summary statistics for use in approximate Bayesian computation (ABC) algorithms by using indirect inference. ABC methods are useful for posterior inference in the presence of an intractable likelihood function. In the indirect inference approach to ABC the parameters of an auxiliary model fitted to the data become the summary statistics. Although applicable to any ABC technique, we embed this approach within a sequential Monte Carlo algorithm that is completely adaptive and requires very little tuning. This methodological development was motivated by an application involving data on macroparasite population evolution modelled by a trivariate stochastic process for which there is no tractable likelihood function. The auxiliary model here is based on a beta–binomial distribution. The main objective of the analysis is to determine which parameters of the stochastic model are estimable from the observed data on mature parasite worms.
Resumo:
Modelling an environmental process involves creating a model structure and parameterising the model with appropriate values to accurately represent the process. Determining accurate parameter values for environmental systems can be challenging. Existing methods for parameter estimation typically make assumptions regarding the form of the Likelihood, and will often ignore any uncertainty around estimated values. This can be problematic, however, particularly in complex problems where Likelihoods may be intractable. In this paper we demonstrate an Approximate Bayesian Computational method for the estimation of parameters of a stochastic CA. We use as an example a CA constructed to simulate a range expansion such as might occur after a biological invasion, making parameter estimates using only count data such as could be gathered from field observations. We demonstrate ABC is a highly useful method for parameter estimation, with accurate estimates of parameters that are important for the management of invasive species such as the intrinsic rate of increase and the point in a landscape where a species has invaded. We also show that the method is capable of estimating the probability of long distance dispersal, a characteristic of biological invasions that is very influential in determining spread rates but has until now proved difficult to estimate accurately.
Resumo:
In this paper, we apply a simulation based approach for estimating transmission rates of nosocomial pathogens. In particular, the objective is to infer the transmission rate between colonised health-care practitioners and uncolonised patients (and vice versa) solely from routinely collected incidence data. The method, using approximate Bayesian computation, is substantially less computer intensive and easier to implement than likelihood-based approaches we refer to here. We find through replacing the likelihood with a comparison of an efficient summary statistic between observed and simulated data that little is lost in the precision of estimated transmission rates. Furthermore, we investigate the impact of incorporating uncertainty in previously fixed parameters on the precision of the estimated transmission rates.
Resumo:
This is a discussion of the journal article: "Construcing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation". The article and discussion have appeared in the Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Resumo:
We present a novel approach for developing summary statistics for use in approximate Bayesian computation (ABC) algorithms using indirect infer- ence. We embed this approach within a sequential Monte Carlo algorithm that is completely adaptive. This methodological development was motivated by an application involving data on macroparasite population evolution modelled with a trivariate Markov process. The main objective of the analysis is to compare inferences on the Markov process when considering two di®erent indirect mod- els. The two indirect models are based on a Beta-Binomial model and a three component mixture of Binomials, with the former providing a better ¯t to the observed data.
Resumo:
The spatiotemporal dynamics of an alien species invasion across a real landscape are typically complex. While surveillance is an essential part of a management response, planning surveillance in space and time present a difficult challenge due to this complexity. We show here a method for determining the highest probability sites for occupancy across a landscape at an arbitrary point in the future, based on occupancy data from a single slice in time. We apply to the method to the invasion of Giant Hogweed, a serious weed in the Czech republic and throughout Europe.
Resumo:
Quantifying the impact of biochemical compounds on collective cell spreading is an essential element of drug design, with various applications including developing treatments for chronic wounds and cancer. Scratch assays are a technically simple and inexpensive method used to study collective cell spreading; however, most previous interpretations of scratch assays are qualitative and do not provide estimates of the cell diffusivity, D, or the cell proliferation rate,l. Estimating D and l is important for investigating the efficacy of a potential treatment and provides insight into the mechanism through which the potential treatment acts. While a few methods for estimating D and l have been proposed, these previous methods lead to point estimates of D and l, and provide no insight into the uncertainty in these estimates. Here, we compare various types of information that can be extracted from images of a scratch assay, and quantify D and l using discrete computational simulations and approximate Bayesian computation. We show that it is possible to robustly recover estimates of D and l from synthetic data, as well as a new set of experimental data. For the first time, our approach also provides a method to estimate the uncertainty in our estimates of D and l. We anticipate that our approach can be generalized to deal with more realistic experimental scenarios in which we are interested in estimating D and l, as well as additional relevant parameters such as the strength of cell-to-cell adhesion or the strength of cell-to-substrate adhesion.
Resumo:
Approximate Bayesian Computation’ (ABC) represents a powerful methodology for the analysis of complex stochastic systems for which the likelihood of the observed data under an arbitrary set of input parameters may be entirely intractable – the latter condition rendering useless the standard machinery of tractable likelihood-based, Bayesian statistical inference [e.g. conventional Markov chain Monte Carlo (MCMC) simulation]. In this paper, we demonstrate the potential of ABC for astronomical model analysis by application to a case study in the morphological transformation of high-redshift galaxies. To this end, we develop, first, a stochastic model for the competing processes of merging and secular evolution in the early Universe, and secondly, through an ABC-based comparison against the observed demographics of massive (Mgal > 1011 M⊙) galaxies (at 1.5 < z < 3) in the Cosmic Assembly Near-IR Deep Extragalatic Legacy Survey (CANDELS)/Extended Groth Strip (EGS) data set we derive posterior probability densities for the key parameters of this model. The ‘Sequential Monte Carlo’ implementation of ABC exhibited herein, featuring both a self-generating target sequence and self-refining MCMC kernel, is amongst the most efficient of contemporary approaches to this important statistical algorithm. We highlight as well through our chosen case study the value of careful summary statistic selection, and demonstrate two modern strategies for assessment and optimization in this regard. Ultimately, our ABC analysis of the high-redshift morphological mix returns tight constraints on the evolving merger rate in the early Universe and favours major merging (with disc survival or rapid reformation) over secular evolution as the mechanism most responsible for building up the first generation of bulges in early-type discs.
Resumo:
Analytically or computationally intractable likelihood functions can arise in complex statistical inferential problems making them inaccessible to standard Bayesian inferential methods. Approximate Bayesian computation (ABC) methods address such inferential problems by replacing direct likelihood evaluations with repeated sampling from the model. ABC methods have been predominantly applied to parameter estimation problems and less to model choice problems due to the added difficulty of handling multiple model spaces. The ABC algorithm proposed here addresses model choice problems by extending Fearnhead and Prangle (2012, Journal of the Royal Statistical Society, Series B 74, 1–28) where the posterior mean of the model parameters estimated through regression formed the summary statistics used in the discrepancy measure. An additional stepwise multinomial logistic regression is performed on the model indicator variable in the regression step and the estimated model probabilities are incorporated into the set of summary statistics for model choice purposes. A reversible jump Markov chain Monte Carlo step is also included in the algorithm to increase model diversity for thorough exploration of the model space. This algorithm was applied to a validating example to demonstrate the robustness of the algorithm across a wide range of true model probabilities. Its subsequent use in three pathogen transmission examples of varying complexity illustrates the utility of the algorithm in inferring preference of particular transmission models for the pathogens.
Resumo:
Most of the existing algorithms for approximate Bayesian computation (ABC) assume that it is feasible to simulate pseudo-data from the model at each iteration. However, the computational cost of these simulations can be prohibitive for high dimensional data. An important example is the Potts model, which is commonly used in image analysis. Images encountered in real world applications can have millions of pixels, therefore scalability is a major concern. We apply ABC with a synthetic likelihood to the hidden Potts model with additive Gaussian noise. Using a pre-processing step, we fit a binding function to model the relationship between the model parameters and the synthetic likelihood parameters. Our numerical experiments demonstrate that the precomputed binding function dramatically improves the scalability of ABC, reducing the average runtime required for model fitting from 71 hours to only 7 minutes. We also illustrate the method by estimating the smoothing parameter for remotely sensed satellite imagery. Without precomputation, Bayesian inference is impractical for datasets of that scale.
Resumo:
Wound healing and tumour growth involve collective cell spreading, which is driven by individual motility and proliferation events within a population of cells. Mathematical models are often used to interpret experimental data and to estimate the parameters so that predictions can be made. Existing methods for parameter estimation typically assume that these parameters are constants and often ignore any uncertainty in the estimated values. We use approximate Bayesian computation (ABC) to estimate the cell diffusivity, D, and the cell proliferation rate, λ, from a discrete model of collective cell spreading, and we quantify the uncertainty associated with these estimates using Bayesian inference. We use a detailed experimental data set describing the collective cell spreading of 3T3 fibroblast cells. The ABC analysis is conducted for different combinations of initial cell densities and experimental times in two separate scenarios: (i) where collective cell spreading is driven by cell motility alone, and (ii) where collective cell spreading is driven by combined cell motility and cell proliferation. We find that D can be estimated precisely, with a small coefficient of variation (CV) of 2–6%. Our results indicate that D appears to depend on the experimental time, which is a feature that has been previously overlooked. Assuming that the values of D are the same in both experimental scenarios, we use the information about D from the first experimental scenario to obtain reasonably precise estimates of λ, with a CV between 4 and 12%. Our estimates of D and λ are consistent with previously reported values; however, our method is based on a straightforward measurement of the position of the leading edge whereas previous approaches have involved expensive cell counting techniques. Additional insights gained using a fully Bayesian approach justify the computational cost, especially since it allows us to accommodate information from different experiments in a principled way.
Resumo:
In vitro studies and mathematical models are now being widely used to study the underlying mechanisms driving the expansion of cell colonies. This can improve our understanding of cancer formation and progression. Although much progress has been made in terms of developing and analysing mathematical models, far less progress has been made in terms of understanding how to estimate model parameters using experimental in vitro image-based data. To address this issue, a new approximate Bayesian computation (ABC) algorithm is proposed to estimate key parameters governing the expansion of melanoma cell (MM127) colonies, including cell diffusivity, D, cell proliferation rate, λ, and cell-to-cell adhesion, q, in two experimental scenarios, namely with and without a chemical treatment to suppress cell proliferation. Even when little prior biological knowledge about the parameters is assumed, all parameters are precisely inferred with a small posterior coefficient of variation, approximately 2–12%. The ABC analyses reveal that the posterior distributions of D and q depend on the experimental elapsed time, whereas the posterior distribution of λ does not. The posterior mean values of D and q are in the ranges 226–268 µm2h−1, 311–351 µm2h−1 and 0.23–0.39, 0.32–0.61 for the experimental periods of 0–24 h and 24–48 h, respectively. Furthermore, we found that the posterior distribution of q also depends on the initial cell density, whereas the posterior distributions of D and λ do not. The ABC approach also enables information from the two experiments to be combined, resulting in greater precision for all estimates of D and λ.
Resumo:
Approximate Bayesian computation has become an essential tool for the analysis of complex stochastic models when the likelihood function is numerically unavailable. However, the well-established statistical method of empirical likelihood provides another route to such settings that bypasses simulations from the model and the choices of the approximate Bayesian computation parameters (summary statistics, distance, tolerance), while being convergent in the number of observations. Furthermore, bypassing model simulations may lead to significant time savings in complex models, for instance those found in population genetics. The Bayesian computation with empirical likelihood algorithm we develop in this paper also provides an evaluation of its own performance through an associated effective sample size. The method is illustrated using several examples, including estimation of standard distributions, time series, and population genetics models.
Resumo:
In this paper we present a new simulation methodology in order to obtain exact or approximate Bayesian inference for models for low-valued count time series data that have computationally demanding likelihood functions. The algorithm fits within the framework of particle Markov chain Monte Carlo (PMCMC) methods. The particle filter requires only model simulations and, in this regard, our approach has connections with approximate Bayesian computation (ABC). However, an advantage of using the PMCMC approach in this setting is that simulated data can be matched with data observed one-at-a-time, rather than attempting to match on the full dataset simultaneously or on a low-dimensional non-sufficient summary statistic, which is common practice in ABC. For low-valued count time series data we find that it is often computationally feasible to match simulated data with observed data exactly. Our particle filter maintains $N$ particles by repeating the simulation until $N+1$ exact matches are obtained. Our algorithm creates an unbiased estimate of the likelihood, resulting in exact posterior inferences when included in an MCMC algorithm. In cases where exact matching is computationally prohibitive, a tolerance is introduced as per ABC. A novel aspect of our approach is that we introduce auxiliary variables into our particle filter so that partially observed and/or non-Markovian models can be accommodated. We demonstrate that Bayesian model choice problems can be easily handled in this framework.