52 resultados para Continuous-time Markov Chain
em CentAUR: Central Archive University of Reading - UK
Resumo:
We describe a Bayesian method for investigating correlated evolution of discrete binary traits on phylogenetic trees. The method fits a continuous-time Markov model to a pair of traits, seeking the best fitting models that describe their joint evolution on a phylogeny. We employ the methodology of reversible-jump ( RJ) Markov chain Monte Carlo to search among the large number of possible models, some of which conform to independent evolution of the two traits, others to correlated evolution. The RJ Markov chain visits these models in proportion to their posterior probabilities, thereby directly estimating the support for the hypothesis of correlated evolution. In addition, the RJ Markov chain simultaneously estimates the posterior distributions of the rate parameters of the model of trait evolution. These posterior distributions can be used to test among alternative evolutionary scenarios to explain the observed data. All results are integrated over a sample of phylogenetic trees to account for phylogenetic uncertainty. We implement the method in a program called RJ Discrete and illustrate it by analyzing the question of whether mating system and advertisement of estrus by females have coevolved in the Old World monkeys and great apes.
Resumo:
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon-known as heterotachy-can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
Resumo:
Varroa destructor is a parasitic mite of the Eastern honeybee Apis cerana. Fifty years ago, two distinct evolutionary lineages (Korean and Japanese) invaded the Western honeybee Apis mellifera. This haplo-diploid parasite species reproduces mainly through brother sister matings, a system which largely favors the fixation of new mutations. In a worldwide sample of 225 individuals from 21 locations collected on Western honeybees and analyzed at 19 microsatellite loci, a series of de novo mutations was observed. Using historical data concerning the invasion, this original biological system has been exploited to compare three mutation models with allele size constraints for microsatellite markers: stepwise (SMM) and generalized (GSM) mutation models, and a model with mutation rate increasing exponentially with microsatellite length (ESM). Posterior probabilities of the three models have been estimated for each locus individually using reversible jump Markov Chain Monte Carlo. The relative support of each model varies widely among loci, but the GSM is the only model that always receives at least 9% support, whatever the locus. The analysis also provides robust estimates of mutation parameters for each locus and of the divergence time of the two invasive lineages (67,000 generations with a 90% credibility interval of 35,000-174,000). With an average of 10 generations per year, this divergence time fits with the last post-glacial Korea Japan land separation. (c) 2005 Elsevier Inc. All rights reserved.
Resumo:
This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.
Resumo:
In this paper we analyse applicability and robustness of Markov chain Monte Carlo algorithms for eigenvalue problems. We restrict our consideration to real symmetric matrices. Almost Optimal Monte Carlo (MAO) algorithms for solving eigenvalue problems are formulated. Results for the structure of both - systematic and probability error are presented. It is shown that the values of both errors can be controlled independently by different algorithmic parameters. The results present how the systematic error depends on the matrix spectrum. The analysis of the probability error is presented. It shows that the close (in some sense) the matrix under consideration is to the stochastic matrix the smaller is this error. Sufficient conditions for constructing robust and interpolation Monte Carlo algorithms are obtained. For stochastic matrices an interpolation Monte Carlo algorithm is constructed. A number of numerical tests for large symmetric dense matrices are performed in order to study experimentally the dependence of the systematic error from the structure of matrix spectrum. We also study how the probability error depends on the balancing of the matrix. (c) 2007 Elsevier Inc. All rights reserved.
Resumo:
Many well-established statistical methods in genetics were developed in a climate of severe constraints on computational power. Recent advances in simulation methodology now bring modern, flexible statistical methods within the reach of scientists having access to a desktop workstation. We illustrate the potential advantages now available by considering the problem of assessing departures from Hardy-Weinberg (HW) equilibrium. Several hypothesis tests of HW have been established, as well as a variety of point estimation methods for the parameter which measures departures from HW under the inbreeding model. We propose a computational, Bayesian method for assessing departures from HW, which has a number of important advantages over existing approaches. The method incorporates the effects-of uncertainty about the nuisance parameters--the allele frequencies--as well as the boundary constraints on f (which are functions of the nuisance parameters). Results are naturally presented visually, exploiting the graphics capabilities of modern computer environments to allow straightforward interpretation. Perhaps most importantly, the method is founded on a flexible, likelihood-based modelling framework, which can incorporate the inbreeding model if appropriate, but also allows the assumptions of the model to he investigated and, if necessary, relaxed. Under appropriate conditions, information can be shared across loci and, possibly, across populations, leading to more precise estimation. The advantages of the method are illustrated by application both to simulated data and to data analysed by alternative methods in the recent literature.
Resumo:
This work provides a framework for the approximation of a dynamic system of the form x˙=f(x)+g(x)u by dynamic recurrent neural network. This extends previous work in which approximate realisation of autonomous dynamic systems was proven. Given certain conditions, the first p output neural units of a dynamic n-dimensional neural model approximate at a desired proximity a p-dimensional dynamic system with n>p. The neural architecture studied is then successfully implemented in a nonlinear multivariable system identification case study.
Resumo:
This paper derives exact discrete time representations for data generated by a continuous time autoregressive moving average (ARMA) system with mixed stock and flow data. The representations for systems comprised entirely of stocks or of flows are also given. In each case the discrete time representations are shown to be of ARMA form, the orders depending on those of the continuous time system. Three examples and applications are also provided, two of which concern the stationary ARMA(2, 1) model with stock variables (with applications to sunspot data and a short-term interest rate) and one concerning the nonstationary ARMA(2, 1) model with a flow variable (with an application to U.S. nondurable consumers’ expenditure). In all three examples the presence of an MA(1) component in the continuous time system has a dramatic impact on eradicating unaccounted-for serial correlation that is present in the discrete time version of the ARMA(2, 0) specification, even though the form of the discrete time model is ARMA(2, 1) for both models.
Resumo:
The persistence of investment performance is a topic of perennial interest to investors. Efficient Markets theory tells us that past performance can not be used to predict future performance yet investors appear to be influenced by the historical performance in making their investment allocation decisions. The problem has been of particular interest to investors in real estate; not least because reported returns from investment in real estate are serially correlated thus implying some persistence in investment performance. This paper applies the established approach of Markov Chain analysis to investigate the relationship between past and present performance of UK real estate over the period 1981 to 1996. The data are analysed by sector, region and size. Furthermore some variations in investment performance classification are reported and the results are shown to be robust.
Resumo:
Variational data assimilation in continuous time is revisited. The central techniques applied in this paper are in part adopted from the theory of optimal nonlinear control. Alternatively, the investigated approach can be considered as a continuous time generalization of what is known as weakly constrained four-dimensional variational assimilation (4D-Var) in the geosciences. The technique allows to assimilate trajectories in the case of partial observations and in the presence of model error. Several mathematical aspects of the approach are studied. Computationally, it amounts to solving a two-point boundary value problem. For imperfect models, the trade-off between small dynamical error (i.e. the trajectory obeys the model dynamics) and small observational error (i.e. the trajectory closely follows the observations) is investigated. This trade-off turns out to be trivial if the model is perfect. However, even in this situation, allowing for minute deviations from the perfect model is shown to have positive effects, namely to regularize the problem. The presented formalism is dynamical in character. No statistical assumptions on dynamical or observational noise are imposed.
Resumo:
Data assimilation refers to the problem of finding trajectories of a prescribed dynamical model in such a way that the output of the model (usually some function of the model states) follows a given time series of observations. Typically though, these two requirements cannot both be met at the same time–tracking the observations is not possible without the trajectory deviating from the proposed model equations, while adherence to the model requires deviations from the observations. Thus, data assimilation faces a trade-off. In this contribution, the sensitivity of the data assimilation with respect to perturbations in the observations is identified as the parameter which controls the trade-off. A relation between the sensitivity and the out-of-sample error is established, which allows the latter to be calculated under operational conditions. A minimum out-of-sample error is proposed as a criterion to set an appropriate sensitivity and to settle the discussed trade-off. Two approaches to data assimilation are considered, namely variational data assimilation and Newtonian nudging, also known as synchronization. Numerical examples demonstrate the feasibility of the approach.
Resumo:
We propose and analyse a class of evolving network models suitable for describing a dynamic topological structure. Applications include telecommunication, on-line social behaviour and information processing in neuroscience. We model the evolving network as a discrete time Markov chain, and study a very general framework where, conditioned on the current state, edges appear or disappear independently at the next timestep. We show how to exploit symmetries in the microscopic, localized rules in order to obtain conjugate classes of random graphs that simplify analysis and calibration of a model. Further, we develop a mean field theory for describing network evolution. For a simple but realistic scenario incorporating the triadic closure effect that has been empirically observed by social scientists (friends of friends tend to become friends), the mean field theory predicts bistable dynamics, and computational results confirm this prediction. We also discuss the calibration issue for a set of real cell phone data, and find support for a stratified model, where individuals are assigned to one of two distinct groups having different within-group and across-group dynamics.
Resumo:
The Stochastic Diffusion Search algorithm -an integral part of Stochastic Search Networks is investigated. Stochastic Diffusion Search is an alternative solution for invariant pattern recognition and focus of attention. It has been shown that the algorithm can be modelled as an ergodic, finite state Markov Chain under some non-restrictive assumptions. Sub-linear time complexity for some settings of parameters has been formulated and proved. Some properties of the algorithm are then characterised and numerical examples illustrating some features of the algorithm are presented.
Resumo:
In this paper, we study jumps in commodity prices. Unlike assumed in existing models of commodity price dynamics, a simple analysis of the data reveals that the probability of tail events is not constant but depends on the time of the year, i.e. exhibits seasonality. We propose a stochastic volatility jump–diffusion model to capture this seasonal variation. Applying the Markov Chain Monte Carlo (MCMC) methodology, we estimate our model using 20 years of futures data from four different commodity markets. We find strong statistical evidence to suggest that our model with seasonal jump intensity outperforms models featuring a constant jump intensity. To demonstrate the practical relevance of our findings, we show that our model typically improves Value-at-Risk (VaR) forecasts.