46 resultados para Hierarchical Bayesian Methods
Resumo:
Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for simple and standard situations typically involving a small number of samples. We propose here a computer program (DIY ABC) for inference based on approximate Bayesian computation (ABC), in which scenarios can be customized by the user to fit many complex situations involving any number of populations and samples. Such scenarios involve any combination of population divergences, admixtures and population size changes. DIY ABC can be used to compare competing scenarios, estimate parameters for one or more scenarios and compute bias and precision measures for a given scenario and known values of parameters (the current version applies to unlinked microsatellite data). This article describes key methods used in the program and provides its main features. The analysis of one simulated and one real dataset, both with complex evolutionary scenarios, illustrates the main possibilities of DIY ABC.
Resumo:
There is great interest in using amplified fragment length polymorphism (AFLP) markers because they are inexpensive and easy to produce. It is, therefore, possible to generate a large number of markers that have a wide coverage of species genotnes. Several statistical methods have been proposed to study the genetic structure using AFLP's but they assume Hardy-Weinberg equilibrium and do not estimate the inbreeding coefficient, F-IS. A Bayesian method has been proposed by Holsinger and colleagues that relaxes these simplifying assumptions but we have identified two sources of bias that can influence estimates based on these markers: (i) the use of a uniform prior on ancestral allele frequencies and (ii) the ascertainment bias of AFLP markers. We present a new Bayesian method that avoids these biases by using an implementation based on the approximate Bayesian computation (ABC) algorithm. This new method estimates population-specific F-IS and F-ST values and offers users the possibility of taking into account the criteria for selecting the markers that are used in the analyses. The software is available at our web site (http://www-leca.uif-grenoble.fi-/logiciels.htm). Finally, we provide advice on how to avoid the effects of ascertainment bias.
Resumo:
In this paper, Bayesian decision procedures are developed for dose-escalation studies based on binary measures of undesirable events and continuous measures of therapeutic benefit. The methods generalize earlier approaches where undesirable events and therapeutic benefit are both binary. A logistic regression model is used to model the binary responses, while a linear regression model is used to model the continuous responses. Prior distributions for the unknown model parameters are suggested. A gain function is discussed and an optional safety constraint is included. Copyright (C) 2006 John Wiley & Sons, Ltd.
Resumo:
Approximate Bayesian computation (ABC) is a highly flexible technique that allows the estimation of parameters under demographic models that are too complex to be handled by full-likelihood methods. We assess the utility of this method to estimate the parameters of range expansion in a two-dimensional stepping-stone model, using samples from either a single deme or multiple demes. A minor modification to the ABC procedure is introduced, which leads to an improvement in the accuracy of estimation. The method is then used to estimate the expansion time and migration rates for five natural common vole populations in Switzerland typed for a sex-linked marker and a nuclear marker. Estimates based on both markers suggest that expansion occurred < 10,000 years ago, after the most recent glaciation, and that migration rates are strongly male biased.
Resumo:
Biologists frequently attempt to infer the character states at ancestral nodes of a phylogeny from the distribution of traits observed in contemporary organisms. Because phylogenies are normally inferences from data, it is desirable to account for the uncertainty in estimates of the tree and its branch lengths when making inferences about ancestral states or other comparative parameters. Here we present a general Bayesian approach for testing comparative hypotheses across statistically justified samples of phylogenies, focusing on the specific issue of reconstructing ancestral states. The method uses Markov chain Monte Carlo techniques for sampling phylogenetic trees and for investigating the parameters of a statistical model of trait evolution. We describe how to combine information about the uncertainty of the phylogeny with uncertainty in the estimate of the ancestral state. Our approach does not constrain the sample of trees only to those that contain the ancestral node or nodes of interest, and we show how to reconstruct ancestral states of uncertain nodes using a most-recent-common-ancestor approach. We illustrate the methods with data on ribonuclease evolution in the Artiodactyla. Software implementing the methods ( BayesMultiState) is available from the authors.
Resumo:
Purpose: Acquiring details of kinetic parameters of enzymes is crucial to biochemical understanding, drug development, and clinical diagnosis in ocular diseases. The correct design of an experiment is critical to collecting data suitable for analysis, modelling and deriving the correct information. As classical design methods are not targeted to the more complex kinetics being frequently studied, attention is needed to estimate parameters of such models with low variance. Methods: We have developed Bayesian utility functions to minimise kinetic parameter variance involving differentiation of model expressions and matrix inversion. These have been applied to the simple kinetics of the enzymes in the glyoxalase pathway (of importance in posttranslational modification of proteins in cataract), and the complex kinetics of lens aldehyde dehydrogenase (also of relevance to cataract). Results: Our successful application of Bayesian statistics has allowed us to identify a set of rules for designing optimum kinetic experiments iteratively. Most importantly, the distribution of points in the range is critical; it is not simply a matter of even or multiple increases. At least 60 % must be below the KM (or plural if more than one dissociation constant) and 40% above. This choice halves the variance found using a simple even spread across the range.With both the glyoxalase system and lens aldehyde dehydrogenase we have significantly improved the variance of kinetic parameter estimation while reducing the number and costs of experiments. Conclusions: We have developed an optimal and iterative method for selecting features of design such as substrate range, number of measurements and choice of intermediate points. Our novel approach minimises parameter error and costs, and maximises experimental efficiency. It is applicable to many areas of ocular drug design, including receptor-ligand binding and immunoglobulin binding, and should be an important tool in ocular drug discovery.
Resumo:
This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.
Resumo:
In areas such as drug development, clinical diagnosis and biotechnology research, acquiring details about the kinetic parameters of enzymes is crucial. The correct design of an experiment is critical to collecting data suitable for analysis, modelling and deriving the correct information. As classical design methods are not targeted to the more complex kinetics being frequently studied, attention is needed to estimate parameters of such models with low variance. We demonstrate that a Bayesian approach (the use of prior knowledge) can produce major gains quantifiable in terms of information, productivity and accuracy of each experiment. Developing the use of Bayesian Utility functions, we have used a systematic method to identify the optimum experimental designs for a number of kinetic model data sets. This has enabled the identification of trends between kinetic model types, sets of design rules and the key conclusion that such designs should be based on some prior knowledge of K-M and/or the kinetic model. We suggest an optimal and iterative method for selecting features of the design such as the substrate range, number of measurements and choice of intermediate points. The final design collects data suitable for accurate modelling and analysis and minimises the error in the parameters estimated. (C) 2003 Elsevier Science B.V. All rights reserved.
Resumo:
We have developed a new Bayesian approach to retrieve oceanic rain rate from the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI), with an emphasis on typhoon cases in the West Pacific. Retrieved rain rates are validated with measurements of rain gauges located on Japanese islands. To demonstrate improvement, retrievals are also compared with those from the TRMM/Precipitation Radar (PR), the Goddard Profiling Algorithm (GPROF), and a multi-channel linear regression statistical method (MLRS). We have found that qualitatively, all methods retrieved similar horizontal distributions in terms of locations of eyes and rain bands of typhoons. Quantitatively, our new Bayesian retrievals have the best linearity and the smallest root mean square (RMS) error against rain gauge data for 16 typhoon overpasses in 2004. The correlation coefficient and RMS of our retrievals are 0.95 and ~2 mm hr-1, respectively. In particular, at heavy rain rates, our Bayesian retrievals outperform those retrieved from GPROF and MLRS. Overall, the new Bayesian approach accurately retrieves surface rain rate for typhoon cases. Accurate rain rate estimates from this method can be assimilated in models to improve forecast and prevent potential damages in Taiwan during typhoon seasons.
Resumo:
This article shows how the solution to the promotion problem—the problem of locating the optimal level of advertising in a downstream market—can be derived simply, empirically, and robustly through the application of some simple calculus and Bayesian econometrics. We derive the complete distribution of the level of promotion that maximizes producer surplus and generate recommendations about patterns as well as levels of expenditure that increase net returns. The theory and methods are applied to quarterly series (1978:2S1988:4) on red meats promotion by the Australian Meat and Live-Stock Corporation. A slightly different pattern of expenditure would have profited lamb producers
Resumo:
Many modern statistical applications involve inference for complex stochastic models, where it is easy to simulate from the models, but impossible to calculate likelihoods. Approximate Bayesian computation (ABC) is a method of inference for such models. It replaces calculation of the likelihood by a step which involves simulating artificial data for different parameter values, and comparing summary statistics of the simulated data with summary statistics of the observed data. Here we show how to construct appropriate summary statistics for ABC in a semi-automatic manner. We aim for summary statistics which will enable inference about certain parameters of interest to be as accurate as possible. Theoretical results show that optimal summary statistics are the posterior means of the parameters. Although these cannot be calculated analytically, we use an extra stage of simulation to estimate how the posterior means vary as a function of the data; and we then use these estimates of our summary statistics within ABC. Empirical results show that our approach is a robust method for choosing summary statistics that can result in substantially more accurate ABC analyses than the ad hoc choices of summary statistics that have been proposed in the literature. We also demonstrate advantages over two alternative methods of simulation-based inference.
Resumo:
Many applications, such as intermittent data assimilation, lead to a recursive application of Bayesian inference within a Monte Carlo context. Popular data assimilation algorithms include sequential Monte Carlo methods and ensemble Kalman filters (EnKFs). These methods differ in the way Bayesian inference is implemented. Sequential Monte Carlo methods rely on importance sampling combined with a resampling step, while EnKFs utilize a linear transformation of Monte Carlo samples based on the classic Kalman filter. While EnKFs have proven to be quite robust even for small ensemble sizes, they are not consistent since their derivation relies on a linear regression ansatz. In this paper, we propose another transform method, which does not rely on any a priori assumptions on the underlying prior and posterior distributions. The new method is based on solving an optimal transportation problem for discrete random variables. © 2013, Society for Industrial and Applied Mathematics
Resumo:
Bayesian analysis is given of an instrumental variable model that allows for heteroscedasticity in both the structural equation and the instrument equation. Specifically, the approach for dealing with heteroscedastic errors in Geweke (1993) is extended to the Bayesian instrumental variable estimator outlined in Rossi et al. (2005). Heteroscedasticity is treated by modelling the variance for each error using a hierarchical prior that is Gamma distributed. The computation is carried out by using a Markov chain Monte Carlo sampling algorithm with an augmented draw for the heteroscedastic case. An example using real data illustrates the approach and shows that ignoring heteroscedasticity in the instrument equation when it exists may lead to biased estimates.
Resumo:
Models for which the likelihood function can be evaluated only up to a parameter-dependent unknown normalizing constant, such as Markov random field models, are used widely in computer science, statistical physics, spatial statistics, and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to the intractability of their likelihood functions. Several methods that permit exact, or close to exact, simulation from the posterior distribution have recently been developed. However, estimating the evidence and Bayes’ factors for these models remains challenging in general. This paper describes new random weight importance sampling and sequential Monte Carlo methods for estimating BFs that use simulation to circumvent the evaluation of the intractable likelihood, and compares them to existing methods. In some cases we observe an advantage in the use of biased weight estimates. An initial investigation into the theoretical and empirical properties of this class of methods is presented. Some support for the use of biased estimates is presented, but we advocate caution in the use of such estimates.
Resumo:
Approximate Bayesian computation (ABC) is a popular family of algorithms which perform approximate parameter inference when numerical evaluation of the likelihood function is not possible but data can be simulated from the model. They return a sample of parameter values which produce simulations close to the observed dataset. A standard approach is to reduce the simulated and observed datasets to vectors of summary statistics and accept when the difference between these is below a specified threshold. ABC can also be adapted to perform model choice. In this article, we present a new software package for R, abctools which provides methods for tuning ABC algorithms. This includes recent dimension reduction algorithms to tune the choice of summary statistics, and coverage methods to tune the choice of threshold. We provide several illustrations of these routines on applications taken from the ABC literature.