76 results for Density Estimation, Gibbs Sampling, Markov Chain Monte Carlo, Markov Random Field, Metropolis-Hastings Algorithm, Posterior Simulation, Perturbation Parameters

in CentAUR: Central Archive University of Reading - UK


Abstract:

This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.

Abstract:

Many well-established statistical methods in genetics were developed in a climate of severe constraints on computational power. Recent advances in simulation methodology now bring modern, flexible statistical methods within the reach of scientists having access to a desktop workstation. We illustrate the potential advantages now available by considering the problem of assessing departures from Hardy-Weinberg (HW) equilibrium. Several hypothesis tests of HW have been established, as well as a variety of point estimation methods for the parameter f, which measures departures from HW under the inbreeding model. We propose a computational, Bayesian method for assessing departures from HW, which has a number of important advantages over existing approaches. The method incorporates the effects of uncertainty about the nuisance parameters (the allele frequencies) as well as the boundary constraints on f (which are functions of the nuisance parameters). Results are naturally presented visually, exploiting the graphics capabilities of modern computer environments to allow straightforward interpretation. Perhaps most importantly, the method is founded on a flexible, likelihood-based modelling framework, which can incorporate the inbreeding model if appropriate, but also allows the assumptions of the model to be investigated and, if necessary, relaxed. Under appropriate conditions, information can be shared across loci and, possibly, across populations, leading to more precise estimation. The advantages of the method are illustrated by application both to simulated data and to data analysed by alternative methods in the recent literature.
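As a rough illustration of the kind of computation this abstract describes, the sketch below samples the allele frequency p and the inbreeding coefficient f for a single biallelic locus with a random-walk Metropolis sampler. It is not the authors' method: the flat prior over the admissible region, the step size, the genotype counts, and all function names are assumptions made for the example. The genotype probabilities and the lower bound on f are the standard ones for the two-allele inbreeding model.

```python
import math
import random


def genotype_probs(p, f):
    """Return P(AA), P(Aa), P(aa) under the two-allele inbreeding model."""
    q = 1.0 - p
    return (p * p + f * p * q, 2.0 * p * q * (1.0 - f), q * q + f * p * q)


def f_lower_bound(p):
    """Smallest f that keeps both homozygote probabilities non-negative."""
    q = 1.0 - p
    return -min(p, q) / max(p, q)


def log_posterior(p, f, counts):
    """Log posterior up to a constant (assumed flat prior on the admissible region)."""
    if not 0.0 < p < 1.0:
        return -math.inf
    if not f_lower_bound(p) <= f <= 1.0:
        return -math.inf
    total = 0.0
    for n, prob in zip(counts, genotype_probs(p, f)):
        if n > 0:
            if prob <= 0.0:
                return -math.inf
            total += n * math.log(prob)
    return total


def metropolis(counts, n_iter=20000, step=0.05, seed=0):
    """Random-walk Metropolis sampler for (p, f); returns the list of draws."""
    rng = random.Random(seed)
    p, f = 0.5, 0.0
    lp = log_posterior(p, f, counts)
    draws = []
    for _ in range(n_iter):
        p_new = p + rng.gauss(0.0, step)
        f_new = f + rng.gauss(0.0, step)
        lp_new = log_posterior(p_new, f_new, counts)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < lp_new - lp:
            p, f, lp = p_new, f_new, lp_new
        draws.append((p, f))
    return draws


# Made-up genotype counts (AA, Aa, aa) showing a heterozygote deficit.
counts = (40, 20, 40)
draws = metropolis(counts)[2000:]  # discard an assumed burn-in of 2000 draws
f_mean = sum(f for _, f in draws) / len(draws)
print(f"posterior mean of f: {f_mean:.2f}")
```

Because the bound on f depends on p, proposals that leave the admissible region are simply rejected, which is one simple way to respect the boundary constraints mentioned in the abstract.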

Abstract:

The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon, known as heterotachy, can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and that it serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
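The per-site sum over branch-length sets described in this abstract can be written as a mixture likelihood. The form below is a sketch in assumed notation: the symbols and the explicit mixture weights are not given in the abstract and are introduced here only for illustration.

```latex
% Assumed notation (not from the abstract): D_i is the data at site i,
% T the tree topology, b_j the j-th branch-length set, w_j its mixture
% weight, J the number of branch-length sets, N the number of sites.
L_i = \sum_{j=1}^{J} w_j \, P(D_i \mid T, b_j),
\qquad \sum_{j=1}^{J} w_j = 1,
\qquad
\log L = \sum_{i=1}^{N} \log L_i .
```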

Abstract:

Varroa destructor is a parasitic mite of the Eastern honeybee Apis cerana. Fifty years ago, two distinct evolutionary lineages (Korean and Japanese) invaded the Western honeybee Apis mellifera. This haplo-diploid parasite species reproduces mainly through brother-sister matings, a system which largely favors the fixation of new mutations. In a worldwide sample of 225 individuals from 21 locations collected on Western honeybees and analyzed at 19 microsatellite loci, a series of de novo mutations was observed. Using historical data concerning the invasion, this original biological system has been exploited to compare three mutation models with allele size constraints for microsatellite markers: stepwise (SMM) and generalized (GSM) mutation models, and a model with mutation rate increasing exponentially with microsatellite length (ESM). Posterior probabilities of the three models have been estimated for each locus individually using reversible jump Markov Chain Monte Carlo. The relative support of each model varies widely among loci, but the GSM is the only model that always receives at least 9% support, whatever the locus. The analysis also provides robust estimates of mutation parameters for each locus and of the divergence time of the two invasive lineages (67,000 generations with a 90% credibility interval of 35,000-174,000). With an average of 10 generations per year, this divergence time fits with the last post-glacial Korea-Japan land separation. (c) 2005 Elsevier Inc. All rights reserved.

Abstract:

We describe a Bayesian method for investigating correlated evolution of discrete binary traits on phylogenetic trees. The method fits a continuous-time Markov model to a pair of traits, seeking the best fitting models that describe their joint evolution on a phylogeny. We employ the methodology of reversible-jump (RJ) Markov chain Monte Carlo to search among the large number of possible models, some of which conform to independent evolution of the two traits, others to correlated evolution. The RJ Markov chain visits these models in proportion to their posterior probabilities, thereby directly estimating the support for the hypothesis of correlated evolution. In addition, the RJ Markov chain simultaneously estimates the posterior distributions of the rate parameters of the model of trait evolution. These posterior distributions can be used to test among alternative evolutionary scenarios to explain the observed data. All results are integrated over a sample of phylogenetic trees to account for phylogenetic uncertainty. We implement the method in a program called RJ Discrete and illustrate it by analyzing the question of whether mating system and advertisement of estrus by females have coevolved in the Old World monkeys and great apes.

Abstract:

In this paper we analyse the applicability and robustness of Markov chain Monte Carlo algorithms for eigenvalue problems. We restrict our consideration to real symmetric matrices. Almost Optimal Monte Carlo (MAO) algorithms for solving eigenvalue problems are formulated. Results on the structure of both the systematic and the probability error are presented. It is shown that the values of both errors can be controlled independently by different algorithmic parameters. The results show how the systematic error depends on the matrix spectrum. The analysis of the probability error is presented. It shows that the closer (in some sense) the matrix under consideration is to a stochastic matrix, the smaller this error is. Sufficient conditions for constructing robust and interpolation Monte Carlo algorithms are obtained. For stochastic matrices an interpolation Monte Carlo algorithm is constructed. A number of numerical tests for large symmetric dense matrices are performed in order to study experimentally the dependence of the systematic error on the structure of the matrix spectrum. We also study how the probability error depends on the balancing of the matrix. (c) 2007 Elsevier Inc. All rights reserved.

Abstract:

Undirected graphical models are widely used in statistics, physics and machine vision. However, Bayesian parameter estimation for undirected models is extremely challenging, since evaluation of the posterior typically involves the calculation of an intractable normalising constant. This problem has received much attention, but very little of this has focussed on the important practical case where the data consists of noisy or incomplete observations of the underlying hidden structure. This paper specifically addresses this problem, comparing two alternative methodologies. In the first of these approaches particle Markov chain Monte Carlo (Andrieu et al., 2010) is used to efficiently explore the parameter space, combined with the exchange algorithm (Murray et al., 2006) for avoiding the calculation of the intractable normalising constant (a proof showing that this combination targets the correct distribution is found in a supplementary appendix online). This approach is compared with approximate Bayesian computation (Pritchard et al., 1999). Applications to estimating the parameters of Ising models and exponential random graphs from noisy data are presented. Each algorithm used in the paper targets an approximation to the true posterior due to the use of MCMC to simulate from the latent graphical model, in lieu of being able to do this exactly in general. The supplementary appendix also describes the nature of the resulting approximation.

Abstract:

The sampling of a given solid angle is a fundamental operation in realistic image synthesis, where the rendering equation describing the light propagation in closed domains is solved. Monte Carlo methods for solving the rendering equation sample the solid angle subtended by a unit hemisphere or unit sphere in order to perform the numerical integration of the rendering equation. In this work we consider the problem of generating uniformly distributed random samples over the hemisphere and sphere. Our aim is to construct and study a parallel sampling scheme for the hemisphere and sphere. First we apply the symmetry property for partitioning the hemisphere and sphere. The domain of the solid angle subtended by a hemisphere is divided into a number of equal sub-domains. Each sub-domain represents the solid angle subtended by an orthogonal spherical triangle with fixed vertices and computable parameters. Then we introduce two new algorithms for sampling orthogonal spherical triangles. Both algorithms are based on a transformation of the unit square. Similarly to Arvo's algorithm for sampling an arbitrary spherical triangle, the suggested algorithms accommodate stratified sampling. We derive the necessary transformations for the algorithms. The first sampling algorithm generates a sample by mapping the unit square onto the orthogonal spherical triangle. The second algorithm directly computes the unit radius vector of a sampling point inside the orthogonal spherical triangle. The sampling of the total hemisphere and sphere is performed in parallel for all sub-domains simultaneously by using the symmetry property of the partitioning. The applicability of the corresponding parallel sampling scheme to Monte Carlo and Quasi-Monte Carlo solution of the rendering equation is discussed.
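For orientation, the sketch below shows a standard way to map the unit square to uniformly distributed points on a hemisphere or sphere, together with stratified (jittered) sampling of the square. It is a common baseline rather than the orthogonal-spherical-triangle algorithms proposed in the abstract, and the function names are hypothetical. The map relies on the fact that the z-coordinate of a uniformly distributed point on the sphere is itself uniformly distributed.

```python
import math
import random


def square_to_hemisphere(u, v):
    """Map (u, v) in the unit square to the upper unit hemisphere (z >= 0)."""
    z = u
    r = math.sqrt(max(0.0, 1.0 - z * z))
    phi = 2.0 * math.pi * v
    return (r * math.cos(phi), r * math.sin(phi), z)


def square_to_sphere(u, v):
    """Map (u, v) in the unit square to the full unit sphere."""
    z = 1.0 - 2.0 * u
    r = math.sqrt(max(0.0, 1.0 - z * z))
    phi = 2.0 * math.pi * v
    return (r * math.cos(phi), r * math.sin(phi), z)


def stratified_hemisphere(n, seed=0):
    """Draw n*n hemisphere samples, one jittered point per cell of an n-by-n grid."""
    rng = random.Random(seed)
    points = []
    for i in range(n):
        for j in range(n):
            u = (i + rng.random()) / n
            v = (j + rng.random()) / n
            points.append(square_to_hemisphere(u, v))
    return points


points = stratified_hemisphere(16)
mean_z = sum(p[2] for p in points) / len(points)
print(f"{len(points)} samples, mean z = {mean_z:.3f}")
```

Because the map is area-preserving, equal-area cells of the square correspond to equal solid angles, which is why stratifying the square also stratifies the hemisphere.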

Abstract:

In this paper we deal with the performance analysis of Monte Carlo algorithms for large linear algebra problems. We consider the applicability and efficiency of Markov chain Monte Carlo for large problems, i.e., problems involving matrices with a number of non-zero elements ranging between one million and one billion. We concentrate on the analysis of the Almost Optimal Monte Carlo (MAO) algorithm for evaluating bilinear forms of matrix powers, since they form the so-called Krylov subspaces. Results are presented comparing the performance of the Robust and Non-robust Monte Carlo algorithms. The algorithms are tested on large dense matrices as well as on large unstructured sparse matrices.
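To make the idea of estimating a bilinear form of a matrix power concrete, here is a minimal sketch of a random-walk estimator for (v, A^k h). It assumes the usual description of the "almost optimal" choice, in which the initial and transition probabilities are proportional to the absolute values of the entries of v and of each row of A. The function name, the dense list-of-lists storage, and the test matrices are assumptions for illustration, not details taken from the paper.

```python
import random


def mao_bilinear_form(v, A, h, k, n_walks, seed=0):
    """Monte Carlo estimate of (v, A^k h) from n_walks random walks of length k."""
    rng = random.Random(seed)
    n = len(v)
    init_w = [abs(x) for x in v]
    init_total = sum(init_w)
    if init_total == 0.0:
        return 0.0
    row_w = [[abs(a) for a in row] for row in A]
    row_total = [sum(r) for r in row_w]
    total = 0.0
    for _ in range(n_walks):
        # Start state i is drawn with probability |v_i| / sum|v|.
        i = rng.choices(range(n), weights=init_w)[0]
        weight = v[i] * init_total / init_w[i]
        for _ in range(k):
            if row_total[i] == 0.0:
                weight = 0.0  # a zero row contributes nothing
                break
            # Next state j is drawn with probability |a_ij| / sum_j |a_ij|.
            j = rng.choices(range(n), weights=row_w[i])[0]
            weight *= A[i][j] * row_total[i] / row_w[i][j]
            i = j
        total += weight * h[i]
    return total / n_walks


A = [[0.5, 0.2], [0.1, 0.4]]
estimate = mao_bilinear_form([1.0, 0.0], A, [1.0, 2.0], k=2, n_walks=20000)
print(f"estimate of (v, A^2 h): {estimate:.3f}  (exact value is 0.63)")
```

Each walk's weight is the product of the sampled entries divided by their sampling probabilities, so the average over walks is an unbiased estimate of the bilinear form. When all rows have the same absolute sum and v and h are suitably uniform, every walk returns the same value and the estimator has zero variance.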

Abstract:

Finding the smallest eigenvalue of a given square matrix A of order n is a computationally very intensive problem. The most popular method for this problem is the Inverse Power Method, which uses LU-decomposition and forward and backward solving of the factored system at every iteration step. An alternative to this method is the Resolvent Monte Carlo method, which uses a representation of the resolvent matrix [I - qA]^{-m} as a series and then performs Monte Carlo iterations (random walks) on the elements of the matrix. This leads to great savings in computations, but the method has many restrictions and a very slow convergence. In this paper we propose a method that includes a fast Monte Carlo procedure for finding the inverse matrix, a refinement procedure to improve the approximation of the inverse if necessary, and Monte Carlo power iterations to compute the smallest eigenvalue. We provide not only theoretical estimates of accuracy and convergence but also results from numerical tests performed on a number of test matrices.
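The series representation of the resolvent mentioned in this abstract is not spelled out there. One standard form, obtained from the binomial series for (1 - x)^{-m}, is sketched below; the convergence condition is stated as an assumption on q and the spectrum of A.

```latex
% Binomial-series expansion of the m-th power of the resolvent.
% Assumption: |q \lambda| < 1 for every eigenvalue \lambda of A.
[I - qA]^{-m} = \sum_{i=0}^{\infty} q^{i} \binom{m+i-1}{i} A^{i}
```

The eigenvalues of [I - qA]^{-1} are 1/(1 - q\lambda), so for q < 0 and a matrix with positive real eigenvalues the dominant eigenvalue of the resolvent corresponds to the smallest eigenvalue of A, which is why power iterations on the resolvent can target it.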

Abstract:

Monte Carlo algorithms often aim to draw from a distribution π by simulating a Markov chain with transition kernel P such that π is invariant under P. However, there are many situations for which it is impractical or impossible to draw from the transition kernel P. For instance, this is the case with massive datasets, where it is prohibitively expensive to calculate the likelihood, and is also the case for intractable likelihood models arising from, for example, Gibbs random fields, such as those found in spatial statistics and network analysis. A natural approach in these cases is to replace P by an approximation P̂. Using theory from the stability of Markov chains we explore a variety of situations where it is possible to quantify how 'close' the chain given by the transition kernel P̂ is to the chain given by P. We apply these results to several examples from spatial statistics and network analysis.

Abstract:

The steadily accumulating literature on technical efficiency in fisheries attests to the importance of efficiency as an indicator of fleet condition and as an object of management concern. In this paper, we extend previous work by presenting a Bayesian hierarchical approach that yields both efficiency estimates and, as a byproduct of the estimation algorithm, probabilistic rankings of the relative technical efficiencies of fishing boats. The estimation algorithm is based on recent advances in Markov Chain Monte Carlo (MCMC) methods—Gibbs sampling, in particular—which have not been widely used in fisheries economics. We apply the method to a sample of 10,865 boat trips in the US Pacific hake (or whiting) fishery during 1987–2003. We uncover systematic differences between efficiency rankings based on sample mean efficiency estimates and those that exploit the full posterior distributions of boat efficiencies to estimate the probability that a given boat has the highest true mean efficiency.
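The difference between ranking by mean efficiency and ranking by the probability of being the most efficient can be illustrated with posterior draws. The sketch below assumes the sampler output is available as a list of draws, each holding one efficiency value per boat. The function names and the toy numbers are made up for illustration and are not taken from the study.

```python
def posterior_means(draws):
    """Mean efficiency of each boat across posterior draws."""
    n_boats = len(draws[0])
    return [sum(d[b] for d in draws) / len(draws) for b in range(n_boats)]


def prob_highest(draws):
    """Share of posterior draws in which each boat has the highest efficiency."""
    n_boats = len(draws[0])
    wins = [0] * n_boats
    for d in draws:
        best = max(range(n_boats), key=lambda b: d[b])
        wins[best] += 1
    return [w / len(draws) for w in wins]


# Toy draws for two hypothetical boats: boat 0 has the higher mean because of
# one extreme draw, but boat 1 is ahead in most draws.
toy_draws = [[0.99, 0.70], [0.60, 0.70], [0.60, 0.70]]
print("means:", posterior_means(toy_draws))
print("P(highest):", prob_highest(toy_draws))
```

In this toy case the two summaries rank the boats in opposite order, which is the kind of systematic difference the abstract reports between mean-based rankings and rankings that use the full posterior.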

Abstract:

Biologists frequently attempt to infer the character states at ancestral nodes of a phylogeny from the distribution of traits observed in contemporary organisms. Because phylogenies are normally inferences from data, it is desirable to account for the uncertainty in estimates of the tree and its branch lengths when making inferences about ancestral states or other comparative parameters. Here we present a general Bayesian approach for testing comparative hypotheses across statistically justified samples of phylogenies, focusing on the specific issue of reconstructing ancestral states. The method uses Markov chain Monte Carlo techniques for sampling phylogenetic trees and for investigating the parameters of a statistical model of trait evolution. We describe how to combine information about the uncertainty of the phylogeny with uncertainty in the estimate of the ancestral state. Our approach does not constrain the sample of trees only to those that contain the ancestral node or nodes of interest, and we show how to reconstruct ancestral states of uncertain nodes using a most-recent-common-ancestor approach. We illustrate the methods with data on ribonuclease evolution in the Artiodactyla. Software implementing the methods (BayesMultiState) is available from the authors.

Abstract:

This article introduces a new general method for genealogical inference that samples independent genealogical histories using importance sampling (IS) and then samples other parameters with Markov chain Monte Carlo (MCMC). It is then possible to more easily utilize the advantages of importance sampling in a fully Bayesian framework. The method is applied to the problem of estimating recent changes in effective population size from temporally spaced gene frequency data. The method gives the posterior distribution of effective population size at the time of the oldest sample and at the time of the most recent sample, assuming a model of exponential growth or decline during the interval. The effect of changes in number of alleles, number of loci, and sample size on the accuracy of the method is described using test simulations, and it is concluded that these have an approximately equivalent effect. The method is used on three example data sets and problems in interpreting the posterior densities are highlighted and discussed.
