961 results for Markov chain Monte Carlo methods
Abstract:
We define a copula process which describes the dependencies between arbitrarily many random variables independently of their marginal distributions. As an example, we develop a stochastic volatility model, Gaussian Copula Process Volatility (GCPV), to predict the latent standard deviations of a sequence of random variables. To make predictions we use Bayesian inference, with the Laplace approximation, and with Markov chain Monte Carlo as an alternative. We find both methods comparable. We also find our model can outperform GARCH on simulated and financial data. And unlike GARCH, GCPV can easily handle missing data, incorporate covariates other than time, and model a rich class of covariance structures.
Abstract:
This paper gives an introduction to Bayesian methods in signal processing. It starts by considering the important issues of model selection and parameter estimation and derives analytic expressions for the model probabilities of two simple models. The idea of marginal estimation of certain model parameters is then introduced, and expressions are derived for the marginal probability densities for frequencies in white Gaussian noise. A Bayesian approach to general changepoint analysis is also given. Finally, numerical integration methods based on Markov chain Monte Carlo techniques, and the Gibbs sampler in particular, are introduced.
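As a rough illustration of the Gibbs sampler mentioned above, the sketch below draws from a standard bivariate normal with correlation `rho` by alternately sampling each coordinate from its full conditional. The target distribution and the names `gibbs_bivariate_normal` and `sample_correlation` are assumptions chosen for illustration; they are not taken from the paper.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Illustrative target only: the full conditionals are
    x | y ~ N(rho * y, 1 - rho^2) and y | x ~ N(rho * x, 1 - rho^2).
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0  # arbitrary starting point
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, cond_sd)  # update x given the current y
        y = rng.gauss(rho * x, cond_sd)  # update y given the new x
        samples.append((x, y))
    return samples


def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(p[0] for p in pairs) / n
    mean_y = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pairs)
    sxx = sum((p[0] - mean_x) ** 2 for p in pairs)
    syy = sum((p[1] - mean_y) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)


draws = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
est_rho = sample_correlation(draws[1000:])  # discard the first 1000 draws as burn-in
```

After burn-in, `est_rho` should lie close to the target correlation, which is a simple check that the chain is sampling the intended distribution.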
Abstract:
The application of Bayes' Theorem to signal processing provides a consistent framework for proceeding from prior knowledge to a posterior inference conditioned on both the prior knowledge and the observed signal data. The first part of the lecture will illustrate how the Bayesian methodology can be applied to a variety of signal processing problems. The second part of the lecture will introduce the concept of Markov Chain Monte-Carlo (MCMC) methods, which are an effective approach to overcoming many of the analytical and computational problems inherent in statistical inference. Such techniques are at the centre of the rapidly developing area of Bayesian signal processing which, with the continual increase in available computational power, is likely to provide the underlying framework for most signal processing applications.
Abstract:
In this paper we present Poisson sum series representations for α-stable (αS) random variables and α-stable processes, in particular concentrating on continuous-time autoregressive (CAR) models driven by α-stable Lévy processes. Our representations aim to provide a conditionally Gaussian framework, which will allow parameter estimation using Rao-Blackwellised versions of state-of-the-art Bayesian computational methods such as particle filters and Markov chain Monte Carlo (MCMC). To overcome the issues due to truncation of the series, novel residual approximations are developed. Simulations demonstrate the potential of these Poisson sum representations for inference in otherwise intractable α-stable models. © 2011 IEEE.
Abstract:
We present a novel framework for identifying and tracking dominant agents in groups. Our proposed approach relies on a causality detection scheme that is capable of ranking agents with respect to their contribution in shaping the system's collective behaviour based exclusively on the agents' observed trajectories. Further, the reasoning paradigm is made robust to multiple emissions and clutter by employing a class of recently introduced Markov chain Monte Carlo-based group tracking methods. Examples are provided that demonstrate the strong potential of the proposed scheme in identifying actual leaders in swarms of interacting agents and moving crowds. © 2011 IEEE.
Abstract:
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. This step includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
Abstract:
Partial occlusions are commonplace in a variety of real world computer vision applications: surveillance, intelligent environments, assistive robotics, autonomous navigation, etc. While occlusion handling methods have been proposed, most methods tend to break down when confronted with numerous occluders in a scene. In this paper, a layered image-plane representation for tracking people through substantial occlusions is proposed. An image-plane representation of motion around an object is associated with a pre-computed graphical model, which can be instantiated efficiently during online tracking. A global state and observation space is obtained by linking transitions between layers. A Reversible Jump Markov Chain Monte Carlo approach is used to infer the number of people and track them online. The method outperforms two state-of-the-art methods for tracking over extended occlusions, given videos of a parking lot with numerous vehicles and a laboratory with many desks and workstations.
Abstract:
We consider the problem of variable selection in regression modeling in high-dimensional spaces where there is known structure among the covariates. This is an unconventional variable selection problem for two reasons: (1) the dimension of the covariate space is comparable to, and often much larger than, the number of subjects in the study, and (2) the covariate space is highly structured, and in some cases it is desirable to incorporate this structural information into the model building process. We approach this problem through the Bayesian variable selection framework, where we assume that the covariates lie on an undirected graph and formulate an Ising prior on the model space for incorporating structural information. Certain computational and statistical problems arise that are unique to such high-dimensional, structured settings, the most interesting being the phenomenon of phase transitions. We propose theoretical and computational schemes to mitigate these problems. We illustrate our methods on two different graph structures: the linear chain and the regular graph of degree k. Finally, we use our methods to study a specific application in genomics: the modeling of transcription factor binding sites in DNA sequences. © 2010 American Statistical Association.
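To make the idea of an Ising prior on the model space concrete, one common parameterization (assumed here, since the abstract does not give the exact form) is log p(γ) = a Σ_i γ_i + b Σ_{(i,j)∈E} γ_i γ_j + const, where γ_i ∈ {0, 1} indicates whether covariate i is included and E is the edge set of the covariate graph. The sketch below evaluates this unnormalized log prior on a linear chain; the function names and the parameters `a` and `b` are hypothetical.

```python
def chain_edges(p):
    """Edges of a linear chain graph on covariates 0, 1, ..., p - 1."""
    return [(i, i + 1) for i in range(p - 1)]


def ising_log_prior(gamma, edges, a, b):
    """Unnormalized Ising log prior for an inclusion vector gamma.

    Assumed form: a * (number of included covariates)
                + b * (number of edges whose two endpoints are both included).
    A negative a favours sparse models; a positive b favours selecting
    covariates that are neighbours on the graph.
    """
    sparsity_term = a * sum(gamma)
    smoothness_term = b * sum(gamma[i] * gamma[j] for i, j in edges)
    return sparsity_term + smoothness_term


edges = chain_edges(4)
log_prior = ising_log_prior([1, 1, 0, 1], edges, a=-2.0, b=1.0)
```

In this example three covariates are included and only the edge (0, 1) joins two included covariates, so the value is 3a + b.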
Abstract:
Transcriptional regulation has been studied intensively in recent decades. One important aspect of this regulation is the interaction between regulatory proteins, such as transcription factors (TF) and nucleosomes, and the genome. Different high-throughput techniques have been invented to map these interactions genome-wide, including ChIP-based methods (ChIP-chip, ChIP-seq, etc.), nuclease digestion methods (DNase-seq, MNase-seq, etc.), and others. However, a single experimental technique often only provides partial and noisy information about the whole picture of protein-DNA interactions. Therefore, the overarching goal of this dissertation is to provide computational developments for jointly modeling different experimental datasets to achieve a holistic inference on the protein-DNA interaction landscape.
We first present a computational framework that can incorporate the protein binding information in MNase-seq data into a thermodynamic model of protein-DNA interaction. We use a correlation-based objective function to model the MNase-seq data and a Markov chain Monte Carlo method to maximize the function. Our results show that the inferred protein-DNA interaction landscape is concordant with the MNase-seq data and provides a mechanistic explanation for the experimentally collected MNase-seq fragments. Our framework is flexible and can easily incorporate other data sources. To demonstrate this flexibility, we use prior distributions to integrate experimentally measured protein concentrations.
We also study the ability of DNase-seq data to position nucleosomes. Traditionally, DNase-seq has only been widely used to identify DNase hypersensitive sites, which tend to be open chromatin regulatory regions devoid of nucleosomes. We reveal for the first time that DNase-seq datasets also contain substantial information about nucleosome translational positioning, and that existing DNase-seq data can be used to infer nucleosome positions with high accuracy. We develop a Bayes-factor-based nucleosome scoring method to position nucleosomes using DNase-seq data. Our approach utilizes several effective strategies to extract nucleosome positioning signals from the noisy DNase-seq data, including jointly modeling data points across the nucleosome body and explicitly modeling the quadratic and oscillatory DNase I digestion pattern on nucleosomes. We show that our DNase-seq-based nucleosome map is highly consistent with previous high-resolution maps. We also show that the oscillatory DNase I digestion pattern is useful in revealing the nucleosome rotational context around TF binding sites.
Finally, we present a state-space model (SSM) for jointly modeling different kinds of genomic data to provide an accurate view of the protein-DNA interaction landscape. We also provide an efficient expectation-maximization algorithm to learn model parameters from data. We first show in simulation studies that the SSM can effectively recover underlying true protein binding configurations. We then apply the SSM to model real genomic data (both DNase-seq and MNase-seq data). Through incrementally increasing the types of genomic data in the SSM, we show that different data types can contribute complementary information for the inference of protein binding landscape and that the most accurate inference comes from modeling all available datasets.
This dissertation provides a foundation for future research by taking a step toward the genome-wide inference of protein-DNA interaction landscape through data integration.
Abstract:
We propose a novel unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation is to represent the pattern of links between records as a bipartite graph, in which records are directly linked to latent true individuals, and only indirectly linked to other records. This flexible new representation of the linkage structure naturally allows us to estimate the attributes of the unique observable people in the population, calculate $k$-way posterior probabilities of matches across records, and propagate the uncertainty of record linkage into later analyses. Our linkage structure lends itself to an efficient, linear-time, hybrid Markov chain Monte Carlo algorithm, which overcomes many obstacles encountered by previously proposed methods of record linkage, despite the high-dimensional parameter space. We assess our results on real and simulated data.
Neutron quasi-elastic scattering in disordered solids: a Monte Carlo study of metal-hydrogen systems
Abstract:
The dynamic structure factor of neutron quasi-elastic scattering has been calculated by Monte Carlo methods for atoms diffusing on a disordered lattice. The disorder includes not only variation in the distances between neighbouring atomic sites but also variation in the hopping rate associated with each site. The presence of the disorder, particularly the hopping rate disorder, causes changes in the time-dependent intermediate scattering function which translate into a significant increase in the intensity in the wings of the quasi-elastic spectrum as compared with the Lorentzian form. The effect is particularly marked at high values of the momentum transfer and at site occupancies of the order of unity. The MC calculations demonstrate how the degree of disorder may be derived from experimental measurements of the quasi-elastic scattering. The model structure factors are compared with the experimental quasi-elastic spectrum of an amorphous metal-hydrogen alloy.
Abstract:
The rotating-frame nuclear magnetic relaxation rate of spins diffusing on a disordered lattice has been calculated by Monte Carlo methods. The disorder includes not only variation in the distances between neighbouring spin sites but also variation in the hopping rate associated with each site. The presence of the disorder, particularly the hopping rate disorder, causes changes in the time-dependent spin correlation functions which translate into asymmetry in the characteristic peak in the temperature dependence of the dipolar relaxation rate. The results may be used to deduce the average hopping rate from the relaxation, but the effect is not sufficiently marked to enable the distribution of the hopping rates to be evaluated. The distribution, which is a measure of the degree of disorder, is the more interesting feature, and it has been possible to show from the calculation that measurements of the relaxation rate as a function of the strength of the radiofrequency spin-locking magnetic field can lead to an evaluation of its width. Some experimental data on an amorphous metal-hydrogen alloy are reported which demonstrate the feasibility of this novel approach to rotating-frame relaxation in disordered materials.
Abstract:
Raised bog peat deposits form important archives for reconstructing past changes in climate. Precise and reliable age models are of vital importance for interpreting such archives. We propose enhanced, Markov chain Monte Carlo based methods for obtaining age models from radiocarbon-dated peat cores, based on the assumption of piecewise linear accumulation. Included are automatic choice of sections, a measure of the goodness of fit and outlier downweighting. The approach is illustrated by using a peat core from the Netherlands.
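The assumption of piecewise linear accumulation can be sketched as follows: the core is split into sections, each with a constant accumulation rate, and the age at a given depth is the sum of the time taken to accumulate each section above it. The function name, the units (years per centimetre) and the example numbers are illustrative assumptions; the paper's MCMC fitting, section selection and outlier handling are not reproduced here.

```python
def age_at_depth(depth, boundaries, rates, surface_age=0.0):
    """Age at a given depth under piecewise linear accumulation.

    boundaries: increasing section boundaries in cm, e.g. [0, 10, 30]
    rates:      years per cm within each section (len(boundaries) - 1 values)
    All inputs are hypothetical; in a real age model the rates would be
    estimated from the radiocarbon dates.
    """
    if len(rates) != len(boundaries) - 1:
        raise ValueError("need exactly one rate per section")
    if not boundaries[0] <= depth <= boundaries[-1]:
        raise ValueError("depth lies outside the core")
    age = surface_age
    for top, bottom, rate in zip(boundaries[:-1], boundaries[1:], rates):
        if depth <= top:
            break  # remaining sections lie entirely below the requested depth
        age += rate * (min(depth, bottom) - top)
    return age


boundaries = [0.0, 10.0, 30.0]   # two sections: 0-10 cm and 10-30 cm
rates = [10.0, 20.0]             # years per cm in each section
ages = [age_at_depth(d, boundaries, rates) for d in (5.0, 20.0, 30.0)]
```

An MCMC scheme for such a model would treat the section rates (and possibly the boundaries) as unknown parameters and compare the implied ages with the calibrated radiocarbon dates.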
Abstract:
Context: Several competing scenarios for planetary-system formation and evolution seek to explain how hot Jupiters came to be so close to their parent stars. Most planetary parameters evolve with time, making it hard to distinguish between models. The obliquity of an orbit with respect to the stellar rotation axis is thought to be more stable than other parameters such as eccentricity. Most planets, to date, appear aligned with the stellar rotation axis; the few misaligned planets so far detected are massive (> 2 M_J). Aims: Our goal is to measure the degree of alignment between planetary orbits and stellar spin axes, to search for potential correlations with eccentricity or other planetary parameters, and to measure long-term radial velocity variability indicating the presence of other bodies in the system. Methods: For transiting planets, the Rossiter-McLaughlin effect allows the measurement of the sky-projected angle β between the stellar rotation axis and a planet's orbital axis. Using the HARPS spectrograph, we observed the Rossiter-McLaughlin effect for six transiting hot Jupiters found by the WASP consortium. We combine these with long-term radial velocity measurements obtained with CORALIE. We used a combined analysis of photometry and radial velocities, fitting model parameters with the Markov Chain Monte Carlo method. After obtaining β we attempt to statistically determine the distribution of the real spin-orbit angle ψ. Results: We found that three of our targets have β above 90°: WASP-2b: β = 153° (+11°/−15°), WASP-15b: β = 139.6° (+5.2°/−4.3°) and WASP-17b: β = 148.5° (+5.1°/−4.2°); the other three (WASP-4b, WASP-5b and WASP-18b) have angles compatible with 0°. We find no dependence of the misalignment angle on planet mass or on any other planetary parameter. All six orbits are close to circular, with only one firm detection of eccentricity, e = 0.00848 (+0.00085/−0.00095), in WASP-18b. No long-term radial acceleration was detected for any of the targets. Combining all 20 previous measurements of β with our six and transforming them into a distribution of ψ, we find that between about 45 and 85% of hot Jupiters have ψ > 30°. Conclusions: Most hot Jupiters are misaligned, with a large variety of spin-orbit angles. We find that observations and predictions using the Kozai mechanism match well. If these observational facts are confirmed in the future, we may then conclude that most hot Jupiters are formed from a dynamical and tidal origin without the necessity to use type I or II migration. At present, standard disc migration cannot explain the observations without invoking at least one additional process.