968 resultados para Méthodes de Monte Carlo par chaînes de Markov
Resumo:
En radiothérapie, la tomodensitométrie (CT) fournit l’information anatomique du patient utile au calcul de dose durant la planification de traitement. Afin de considérer la composition hétérogène des tissus, des techniques de calcul telles que la méthode Monte Carlo sont nécessaires pour calculer la dose de manière exacte. L’importation des images CT dans un tel calcul exige que chaque voxel exprimé en unité Hounsfield (HU) soit converti en une valeur physique telle que la densité électronique (ED). Cette conversion est habituellement effectuée à l’aide d’une courbe d’étalonnage HU-ED. Une anomalie ou artefact qui apparaît dans une image CT avant l’étalonnage est susceptible d’assigner un mauvais tissu à un voxel. Ces erreurs peuvent causer une perte cruciale de fiabilité du calcul de dose. Ce travail vise à attribuer une valeur exacte aux voxels d’images CT afin d’assurer la fiabilité des calculs de dose durant la planification de traitement en radiothérapie. Pour y parvenir, une étude est réalisée sur les artefacts qui sont reproduits par simulation Monte Carlo. Pour réduire le temps de calcul, les simulations sont parallélisées et transposées sur un superordinateur. Une étude de sensibilité des nombres HU en présence d’artefacts est ensuite réalisée par une analyse statistique des histogrammes. À l’origine de nombreux artefacts, le durcissement de faisceau est étudié davantage. Une revue sur l’état de l’art en matière de correction du durcissement de faisceau est présentée suivi d’une démonstration explicite d’une correction empirique.
Resumo:
Les processus Markoviens continus en temps sont largement utilisés pour tenter d’expliquer l’évolution des séquences protéiques et nucléotidiques le long des phylogénies. Des modèles probabilistes reposant sur de telles hypothèses sont conçus pour satisfaire la non-homogénéité spatiale des contraintes fonctionnelles et environnementales agissant sur celles-ci. Récemment, des modèles Markov-modulés ont été introduits pour décrire les changements temporels dans les taux d’évolution site-spécifiques (hétérotachie). Des études ont d’autre part démontré que non seulement la force mais également la nature de la contrainte sélective agissant sur un site peut varier à travers le temps. Ici nous proposons de prendre en charge cette réalité évolutive avec un modèle Markov-modulé pour les protéines sous lequel les sites sont autorisés à modifier leurs préférences en acides aminés au cours du temps. L’estimation a posteriori des différents paramètres modulants du noyau stochastique avec les méthodes de Monte Carlo est un défi de taille que nous avons su relever partiellement grâce à la programmation parallèle. Des réglages computationnels sont par ailleurs envisagés pour accélérer la convergence vers l’optimum global de ce paysage multidimensionnel relativement complexe. Qualitativement, notre modèle semble être capable de saisir des signaux d’hétérogénéité temporelle à partir d’un jeu de données dont l’histoire évolutive est reconnue pour être riche en changements de régimes substitutionnels. Des tests de performance suggèrent de plus qu’il serait mieux ajusté aux données qu’un modèle équivalent homogène en temps. Néanmoins, les histoires substitutionnelles tirées de la distribution postérieure sont bruitées et restent difficilement interprétables du point de vue biologique.
Resumo:
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon-known as heterotachy-can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
Resumo:
Varroa destructor is a parasitic mite of the Eastern honeybee Apis cerana. Fifty years ago, two distinct evolutionary lineages (Korean and Japanese) invaded the Western honeybee Apis mellifera. This haplo-diploid parasite species reproduces mainly through brother sister matings, a system which largely favors the fixation of new mutations. In a worldwide sample of 225 individuals from 21 locations collected on Western honeybees and analyzed at 19 microsatellite loci, a series of de novo mutations was observed. Using historical data concerning the invasion, this original biological system has been exploited to compare three mutation models with allele size constraints for microsatellite markers: stepwise (SMM) and generalized (GSM) mutation models, and a model with mutation rate increasing exponentially with microsatellite length (ESM). Posterior probabilities of the three models have been estimated for each locus individually using reversible jump Markov Chain Monte Carlo. The relative support of each model varies widely among loci, but the GSM is the only model that always receives at least 9% support, whatever the locus. The analysis also provides robust estimates of mutation parameters for each locus and of the divergence time of the two invasive lineages (67,000 generations with a 90% credibility interval of 35,000-174,000). With an average of 10 generations per year, this divergence time fits with the last post-glacial Korea Japan land separation. (c) 2005 Elsevier Inc. All rights reserved.
Resumo:
We describe a Bayesian method for investigating correlated evolution of discrete binary traits on phylogenetic trees. The method fits a continuous-time Markov model to a pair of traits, seeking the best fitting models that describe their joint evolution on a phylogeny. We employ the methodology of reversible-jump ( RJ) Markov chain Monte Carlo to search among the large number of possible models, some of which conform to independent evolution of the two traits, others to correlated evolution. The RJ Markov chain visits these models in proportion to their posterior probabilities, thereby directly estimating the support for the hypothesis of correlated evolution. In addition, the RJ Markov chain simultaneously estimates the posterior distributions of the rate parameters of the model of trait evolution. These posterior distributions can be used to test among alternative evolutionary scenarios to explain the observed data. All results are integrated over a sample of phylogenetic trees to account for phylogenetic uncertainty. We implement the method in a program called RJ Discrete and illustrate it by analyzing the question of whether mating system and advertisement of estrus by females have coevolved in the Old World monkeys and great apes.
Resumo:
This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.
Resumo:
In this paper we analyse applicability and robustness of Markov chain Monte Carlo algorithms for eigenvalue problems. We restrict our consideration to real symmetric matrices. Almost Optimal Monte Carlo (MAO) algorithms for solving eigenvalue problems are formulated. Results for the structure of both - systematic and probability error are presented. It is shown that the values of both errors can be controlled independently by different algorithmic parameters. The results present how the systematic error depends on the matrix spectrum. The analysis of the probability error is presented. It shows that the close (in some sense) the matrix under consideration is to the stochastic matrix the smaller is this error. Sufficient conditions for constructing robust and interpolation Monte Carlo algorithms are obtained. For stochastic matrices an interpolation Monte Carlo algorithm is constructed. A number of numerical tests for large symmetric dense matrices are performed in order to study experimentally the dependence of the systematic error from the structure of matrix spectrum. We also study how the probability error depends on the balancing of the matrix. (c) 2007 Elsevier Inc. All rights reserved.
Resumo:
Many well-established statistical methods in genetics were developed in a climate of severe constraints on computational power. Recent advances in simulation methodology now bring modern, flexible statistical methods within the reach of scientists having access to a desktop workstation. We illustrate the potential advantages now available by considering the problem of assessing departures from Hardy-Weinberg (HW) equilibrium. Several hypothesis tests of HW have been established, as well as a variety of point estimation methods for the parameter which measures departures from HW under the inbreeding model. We propose a computational, Bayesian method for assessing departures from HW, which has a number of important advantages over existing approaches. The method incorporates the effects-of uncertainty about the nuisance parameters--the allele frequencies--as well as the boundary constraints on f (which are functions of the nuisance parameters). Results are naturally presented visually, exploiting the graphics capabilities of modern computer environments to allow straightforward interpretation. Perhaps most importantly, the method is founded on a flexible, likelihood-based modelling framework, which can incorporate the inbreeding model if appropriate, but also allows the assumptions of the model to he investigated and, if necessary, relaxed. Under appropriate conditions, information can be shared across loci and, possibly, across populations, leading to more precise estimation. The advantages of the method are illustrated by application both to simulated data and to data analysed by alternative methods in the recent literature.
Resumo:
Monte Carlo algorithms often aim to draw from a distribution π by simulating a Markov chain with transition kernel P such that π is invariant under P. However, there are many situations for which it is impractical or impossible to draw from the transition kernel P. For instance, this is the case with massive datasets, where is it prohibitively expensive to calculate the likelihood and is also the case for intractable likelihood models arising from, for example, Gibbs random fields, such as those found in spatial statistics and network analysis. A natural approach in these cases is to replace P by an approximation Pˆ. Using theory from the stability of Markov chains we explore a variety of situations where it is possible to quantify how ’close’ the chain given by the transition kernel Pˆ is to the chain given by P . We apply these results to several examples from spatial statistics and network analysis.
Resumo:
Doctorado en Análisis Económico. Programa en Análisis Económico Aplicado
Resumo:
Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models due to the unobserved nature of latent outcome variables. This paper presents graphical diagnostic tools to evaluate whether or not latent class regression models adhere to standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov Chain Monte Carlo estimation procedure. Unlike standard maximum likelihood implementations for latent class regression model estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any functions of parameters. It is this convenience that allows us to provide the diagnostic methods that we introduce. As a motivating example we present an analysis focusing on the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study. We consider a latent class regression analysis investigating the association between depression and socioeconomic status measures, where the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the model parameters are found to be invalid due to the violation of model assumptions. The violation of these assumptions is clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.
Resumo:
Markov chain Monte Carlo is a method of producing a correlated sample in order to estimate features of a complicated target distribution via simple ergodic averages. A fundamental question in MCMC applications is when should the sampling stop? That is, when are the ergodic averages good estimates of the desired quantities? We consider a method that stops the MCMC sampling the first time the width of a confidence interval based on the ergodic averages is less than a user-specified value. Hence calculating Monte Carlo standard errors is a critical step in assessing the output of the simulation. In particular, we consider the regenerative simulation and batch means methods of estimating the variance of the asymptotic normal distribution. We describe sufficient conditions for the strong consistency and asymptotic normality of both methods and investigate their finite sample properties in a variety of examples.
Resumo:
Permutation tests are useful for drawing inferences from imaging data because of their flexibility and ability to capture features of the brain that are difficult to capture parametrically. However, most implementations of permutation tests ignore important confounding covariates. To employ covariate control in a nonparametric setting we have developed a Markov chain Monte Carlo (MCMC) algorithm for conditional permutation testing using propensity scores. We present the first use of this methodology for imaging data. Our MCMC algorithm is an extension of algorithms developed to approximate exact conditional probabilities in contingency tables, logit, and log-linear models. An application of our non-parametric method to remove potential bias due to the observed covariates is presented.