36 resultados para Delayed Markov chain


Relevância:

80.00% 80.00%

Publicador:

Resumo:

The identification of signatures of natural selection in genomic surveys has become an area of intense research, stimulated by the increasing ease with which genetic markers can be typed. Loci identified as subject to selection may be functionally important, and hence (weak) candidates for involvement in disease causation. They can also be useful in determining the adaptive differentiation of populations, and exploring hypotheses about speciation. Adaptive differentiation has traditionally been identified from differences in allele frequencies among different populations, summarised by an estimate of F-ST. Low outliers relative to an appropriate neutral population-genetics model indicate loci subject to balancing selection, whereas high outliers suggest adaptive (directional) selection. However, the problem of identifying statistically significant departures from neutrality is complicated by confounding effects on the distribution of F-ST estimates, and current methods have not yet been tested in large-scale simulation experiments. Here, we simulate data from a structured population at many unlinked, diallelic loci that are predominantly neutral but with some loci subject to adaptive or balancing selection. We develop a hierarchical-Bayesian method, implemented via Markov chain Monte Carlo (MCMC), and assess its performance in distinguishing the loci simulated under selection from the neutral loci. We also compare this performance with that of a frequentist method, based on moment-based estimates of F-ST. We find that both methods can identify loci subject to adaptive selection when the selection coefficient is at least five times the migration rate. Neither method could reliably distinguish loci under balancing selection in our simulations, even when the selection coefficient is twenty times the migration rate.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Biologists frequently attempt to infer the character states at ancestral nodes of a phylogeny from the distribution of traits observed in contemporary organisms. Because phylogenies are normally inferences from data, it is desirable to account for the uncertainty in estimates of the tree and its branch lengths when making inferences about ancestral states or other comparative parameters. Here we present a general Bayesian approach for testing comparative hypotheses across statistically justified samples of phylogenies, focusing on the specific issue of reconstructing ancestral states. The method uses Markov chain Monte Carlo techniques for sampling phylogenetic trees and for investigating the parameters of a statistical model of trait evolution. We describe how to combine information about the uncertainty of the phylogeny with uncertainty in the estimate of the ancestral state. Our approach does not constrain the sample of trees only to those that contain the ancestral node or nodes of interest, and we show how to reconstruct ancestral states of uncertain nodes using a most-recent-common-ancestor approach. We illustrate the methods with data on ribonuclease evolution in the Artiodactyla. Software implementing the methods ( BayesMultiState) is available from the authors.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogenous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate- variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This article introduces a new general method for genealogical inference that samples independent genealogical histories using importance sampling (IS) and then samples other parameters with Markov chain Monte Carlo (MCMC). It is then possible to more easily utilize the advantages of importance sampling in a fully Bayesian framework. The method is applied to the problem of estimating recent changes in effective population size from temporally spaced gene frequency data. The method gives the posterior distribution of effective population size at the time of the oldest sample and at the time of the most recent sample, assuming a model of exponential growth or decline during the interval. The effect of changes in number of alleles, number of loci, and sample size on the accuracy of the method is described using test simulations, and it is concluded that these have an approximately equivalent effect. The method is used on three example data sets and problems in interpreting the posterior densities are highlighted and discussed.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Analyses of high-density single-nucleotide polymorphism (SNP) data, such as genetic mapping and linkage disequilibrium (LD) studies, require phase-known haplotypes to allow for the correlation between tightly linked loci. However, current SNP genotyping technology cannot determine phase, which must be inferred statistically. In this paper, we present a new Bayesian Markov chain Monte Carlo (MCMC) algorithm for population haplotype frequency estimation, particulary in the context of LD assessment. The novel feature of the method is the incorporation of a log-linear prior model for population haplotype frequencies. We present simulations to suggest that 1) the log-linear prior model is more appropriate than the standard coalescent process in the presence of recombination (>0.02cM between adjacent loci), and 2) there is substantial inflation in measures of LD obtained by a "two-stage" approach to the analysis by treating the "best" haplotype configuration as correct, without regard to uncertainty in the recombination process. Genet Epidemiol 25:106-114, 2003. (C) 2003 Wiley-Liss, Inc.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Population subdivision complicates analysis of molecular variation. Even if neutrality is assumed, three evolutionary forces need to be considered: migration, mutation, and drift. Simplification can be achieved by assuming that the process of migration among and drift within subpopulations is occurring fast compared to Mutation and drift in the entire population. This allows a two-step approach in the analysis: (i) analysis of population subdivision and (ii) analysis of molecular variation in the migrant pool. We model population subdivision using an infinite island model, where we allow the migration/drift parameter Theta to vary among populations. Thus, central and peripheral populations can be differentiated. For inference of Theta, we use a coalescence approach, implemented via a Markov chain Monte Carlo (MCMC) integration method that allows estimation of allele frequencies in the migrant pool. The second step of this approach (analysis of molecular variation in the migrant pool) uses the estimated allele frequencies in the migrant pool for the study of molecular variation. We apply this method to a Drosophila ananassae sequence data set. We find little indication of isolation by distance, but large differences in the migration parameter among populations. The population as a whole seems to be expanding. A population from Bogor (Java, Indonesia) shows the highest variation and seems closest to the species center.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper we deal with performance analysis of Monte Carlo algorithm for large linear algebra problems. We consider applicability and efficiency of the Markov chain Monte Carlo for large problems, i.e., problems involving matrices with a number of non-zero elements ranging between one million and one billion. We are concentrating on analysis of the almost Optimal Monte Carlo (MAO) algorithm for evaluating bilinear forms of matrix powers since they form the so-called Krylov subspaces. Results are presented comparing the performance of the Robust and Non-robust Monte Carlo algorithms. The algorithms are tested on large dense matrices as well as on large unstructured sparse matrices.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this work we study the computational complexity of a class of grid Monte Carlo algorithms for integral equations. The idea of the algorithms consists in an approximation of the integral equation by a system of algebraic equations. Then the Markov chain iterative Monte Carlo is used to solve the system. The assumption here is that the corresponding Neumann series for the iterative matrix does not necessarily converge or converges slowly. We use a special technique to accelerate the convergence. An estimate of the computational complexity of Monte Carlo algorithm using the considered approach is obtained. The estimate of the complexity is compared with the corresponding quantity for the complexity of the grid-free Monte Carlo algorithm. The conditions under which the class of grid Monte Carlo algorithms is more efficient are given.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We describe a Bayesian approach to analyzing multilocus genotype or haplotype data to assess departures from gametic (linkage) equilibrium. Our approach employs a Markov chain Monte Carlo (MCMC) algorithm to approximate the posterior probability distributions of disequilibrium parameters. The distributions are computed exactly in some simple settings. Among other advantages, posterior distributions can be presented visually, which allows the uncertainties in parameter estimates to be readily assessed. In addition, background knowledge can be incorporated, where available, to improve the precision of inferences. The method is illustrated by application to previously published datasets; implications for multilocus forensic match probabilities and for simple association-based gene mapping are also discussed.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We propose and analyse a class of evolving network models suitable for describing a dynamic topological structure. Applications include telecommunication, on-line social behaviour and information processing in neuroscience. We model the evolving network as a discrete time Markov chain, and study a very general framework where, conditioned on the current state, edges appear or disappear independently at the next timestep. We show how to exploit symmetries in the microscopic, localized rules in order to obtain conjugate classes of random graphs that simplify analysis and calibration of a model. Further, we develop a mean field theory for describing network evolution. For a simple but realistic scenario incorporating the triadic closure effect that has been empirically observed by social scientists (friends of friends tend to become friends), the mean field theory predicts bistable dynamics, and computational results confirm this prediction. We also discuss the calibration issue for a set of real cell phone data, and find support for a stratified model, where individuals are assigned to one of two distinct groups having different within-group and across-group dynamics.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The Stochastic Diffusion Search algorithm -an integral part of Stochastic Search Networks is investigated. Stochastic Diffusion Search is an alternative solution for invariant pattern recognition and focus of attention. It has been shown that the algorithm can be modelled as an ergodic, finite state Markov Chain under some non-restrictive assumptions. Sub-linear time complexity for some settings of parameters has been formulated and proved. Some properties of the algorithm are then characterised and numerical examples illustrating some features of the algorithm are presented.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Statistical methods of inference typically require the likelihood function to be computable in a reasonable amount of time. The class of “likelihood-free” methods termed Approximate Bayesian Computation (ABC) is able to eliminate this requirement, replacing the evaluation of the likelihood with simulation from it. Likelihood-free methods have gained in efficiency and popularity in the past few years, following their integration with Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) in order to better explore the parameter space. They have been applied primarily to estimating the parameters of a given model, but can also be used to compare models. Here we present novel likelihood-free approaches to model comparison, based upon the independent estimation of the evidence of each model under study. Key advantages of these approaches over previous techniques are that they allow the exploitation of MCMC or SMC algorithms for exploring the parameter space, and that they do not require a sampler able to mix between models. We validate the proposed methods using a simple exponential family problem before providing a realistic problem from human population genetics: the comparison of different demographic models based upon genetic data from the Y chromosome.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We present, pedagogically, the Bayesian approach to composed error models under alternative, hierarchical characterizations; demonstrate, briefly, the Bayesian approach to model comparison using recent advances in Markov Chain Monte Carlo (MCMC) methods; and illustrate, empirically, the value of these techniques to natural resource economics and coastal fisheries management, in particular. The Bayesian approach to fisheries efficiency analysis is interesting for at least three reasons. First, it is a robust and highly flexible alternative to commonly applied, frequentist procedures, which dominate the literature. Second,the Bayesian approach is extremely simple to implement, requiring only a modest addition to most natural-resource economist tool-kits. Third, despite its attractions, applications of Bayesian methodology in coastal fisheries management are few.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The steadily accumulating literature on technical efficiency in fisheries attests to the importance of efficiency as an indicator of fleet condition and as an object of management concern. In this paper, we extend previous work by presenting a Bayesian hierarchical approach that yields both efficiency estimates and, as a byproduct of the estimation algorithm, probabilistic rankings of the relative technical efficiencies of fishing boats. The estimation algorithm is based on recent advances in Markov Chain Monte Carlo (MCMC) methods— Gibbs sampling, in particular—which have not been widely used in fisheries economics. We apply the method to a sample of 10,865 boat trips in the US Pacific hake (or whiting) fishery during 1987–2003. We uncover systematic differences between efficiency rankings based on sample mean efficiency estimates and those that exploit the full posterior distributions of boat efficiencies to estimate the probability that a given boat has the highest true mean efficiency.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The Homeric epics are among the greatest masterpieces of literature, but when they were produced is not known with certainty. Here we apply evolutionary-linguistic phylogenetic statistical methods to differences in Homeric, Modern Greek and ancient Hittite vocabulary items to estimate a date of approximately 710–760 BCE for these great works. Our analysis compared a common set of vocabulary items among the three pairs of languages, recording for each item whether the words in the two languages were cognate – derived from a shared ancestral word – or not. We then used a likelihood-based Markov chain Monte Carlo procedure to estimate the most probable times in years separating these languages given the percentage of words they shared, combined with knowledge of the rates at which different words change. Our date for the epics is in close agreement with historians' and classicists' beliefs derived from historical and archaeological sources.