919 results for Bayesian Markov process
Abstract:
Approximate Bayesian computation (ABC) has become a popular technique to facilitate Bayesian inference from complex models. In this article we present an ABC approximation designed to perform biased filtering for a Hidden Markov Model when the likelihood function is intractable. We use a sequential Monte Carlo (SMC) algorithm to both fit and sample from our ABC approximation of the target probability density. This approach is shown empirically to be more accurate with respect to the original filter than competing methods. The theoretical bias of our method is investigated; it is shown that the bias goes to zero at the expense of increased computational effort. Our approach is illustrated on a constrained sequential lasso for portfolio allocation to 15 constituents of the FTSE 100 share index.
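The ABC filtering idea can be sketched in a few lines. The toy linear-Gaussian model and all parameters below are invented for illustration, not the authors' algorithm: the intractable likelihood is replaced by an indicator that a pseudo-observation simulated from each particle lands within eps of the datum.

```python
import math
import random

random.seed(1)

def simulate_hmm(T):
    # Toy linear-Gaussian HMM; in the ABC setting we pretend the observation
    # density is intractable and only ever simulate from it.
    xs, ys, x = [], [], 0.0
    for _ in range(T):
        x = 0.9 * x + random.gauss(0, 0.5)
        xs.append(x)
        ys.append(x + random.gauss(0, 0.3))
    return xs, ys

def abc_particle_filter(ys, n=500, eps=0.5):
    # ABC-SMC approximation: each particle is weighted by an indicator that a
    # pseudo-observation simulated from it lands within eps of the datum.
    particles = [0.0] * n
    means = []
    for y in ys:
        moved = [0.9 * p + random.gauss(0, 0.5) for p in particles]
        w = [1.0 if abs(m + random.gauss(0, 0.3) - y) < eps else 0.0
             for m in moved]
        if sum(w) > 0:  # multinomial resampling proportional to ABC weights
            particles = random.choices(moved, weights=w, k=n)
        else:           # degenerate step: keep the propagated particles
            particles = moved
        means.append(sum(particles) / n)
    return means

true_states, obs = simulate_hmm(50)
est = abc_particle_filter(obs)
rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(est, true_states)) / 50)
```

Shrinking eps reduces the ABC bias relative to the exact filter at the cost of more rejected pseudo-observations, i.e. more simulation effort, mirroring the bias/effort trade-off the abstract describes.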
Abstract:
This paper analyzes the stationarity of the price-dividend ratio in the context of a Markov-switching model à la Hamilton (1989) where an asymmetric speed of adjustment is introduced. This particular specification robustly supports a nonlinear reversion process and identifies two relevant episodes: the post-war period from the mid-1950s to the mid-1970s and the so-called “90s boom” period. A three-regime Markov-switching model displays the best regime identification and reveals that only the first part of the 90s boom (1985-1995) and the post-war period are near-nonstationary states. Interestingly, the last part of the 90s boom (1996-2000), characterized by a growing price-dividend ratio, is entirely attributed to a regime featuring a highly reverting process.
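A Hamilton-style regime-switching process like the one described can be simulated directly. The transition matrix, regime means, and AR coefficients below are invented for illustration, with one near-nonstationary regime and one strongly reverting regime:

```python
import random

random.seed(0)

# Two-regime Markov-switching AR(1) in the spirit of Hamilton (1989); all
# numbers are made up: regime 0 is near-nonstationary (phi close to 1),
# regime 1 reverts strongly to a higher mean.
P = [[0.95, 0.05],    # row s: transition probabilities out of regime s
     [0.10, 0.90]]
mu = [0.0, 2.0]       # regime-specific means
phi = [0.98, 0.60]    # regime-specific speeds of mean reversion

def simulate(T):
    s, y, states, ys = 0, 0.0, [], []
    for _ in range(T):
        s = s if random.random() < P[s][s] else 1 - s
        y = mu[s] + phi[s] * (y - mu[s]) + random.gauss(0, 0.2)
        states.append(s)
        ys.append(y)
    return states, ys

states, ys = simulate(2000)
share_reverting = sum(states) / len(states)
```

With these transition probabilities, the chain spends about a third of its time in the reverting regime (stationary share 0.05 / 0.15); estimation of such a model inverts this simulation, inferring the hidden regime path from the observed series.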
Abstract:
A generalized Bayesian population dynamics model was developed for the analysis of historical mark-recapture studies. The Bayesian approach builds upon existing maximum likelihood methods and is useful when substantial uncertainties exist in the data or little information is available about auxiliary parameters such as tag loss and reporting rates. Posterior distributions of movement rates are obtained through Markov chain Monte Carlo (MCMC) simulation and are suitable for use as input in subsequent stock assessment analyses. The mark-recapture model was applied to English sole (Parophrys vetulus) off the west coast of the United States and Canada, and migration rates were estimated to be 2% per month to the north and 4% per month to the south. These posterior parameter distributions and the Bayesian framework for comparing hypotheses can guide fishery scientists in structuring the spatial and temporal complexity of future analyses of this kind. This approach could easily be generalized for application to other species and to more data-rich fishery analyses.
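A minimal sketch of posterior simulation for a single movement rate, using hypothetical release and recapture counts rather than the paper's data: a random-walk Metropolis sampler targeting a binomial likelihood with a flat prior.

```python
import math
import random

random.seed(2)

# Hypothetical tagging data (not the paper's): of n released fish, k were
# recaptured in the northern area, so the monthly northward movement rate p
# has a Binomial(n, p) likelihood; the prior on p is flat.
n_released, k_north = 500, 10

def log_post(p):
    if not 0.0 < p < 1.0:
        return -math.inf
    return k_north * math.log(p) + (n_released - k_north) * math.log(1.0 - p)

def metropolis(n_iter=20000, step=0.01):
    p, lp, samples = 0.05, log_post(0.05), []
    for _ in range(n_iter):
        prop = p + random.gauss(0, step)
        lp_prop = log_post(prop)
        if math.log(random.random()) < lp_prop - lp:  # accept/reject
            p, lp = prop, lp_prop
        samples.append(p)
    return samples[5000:]  # discard burn-in

draws = metropolis()
post_mean = sum(draws) / len(draws)  # close to the Beta(11, 491) mean
```

With these counts the posterior concentrates near 2% per month; the full model couples many such rates with tag-loss and reporting parameters, which is why MCMC is used rather than this one-dimensional sampler.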
Abstract:
Molecular markers have been demonstrated to be useful for the estimation of stock mixture proportions where the origin of individuals is determined from baseline samples. Bayesian statistical methods are widely recognized as providing a preferable strategy for such analyses. In general, Bayesian estimation is based on standard latent class models using data augmentation through Markov chain Monte Carlo techniques. In this study, we introduce a novel approach based on recent developments in the estimation of genetic population structure. Our strategy combines analytical integration with stochastic optimization to identify stock mixtures. An important enhancement over previous methods is the possibility of appropriately handling data where only partial baseline sample information is available. We address the potential use of nonmolecular, auxiliary biological information in our Bayesian model.
Abstract:
Segmentation of names into their constituent parts is a fundamental step in the integration of databases through record-linkage techniques. This splitting of names can be carried out in different ways. This study aimed to evaluate the use of the Hidden Markov Model (HMM) for segmenting people's names and addresses, and the efficiency of this segmentation in the record-linkage process. The databases used were the Mortality Information System (SIM) and the High-Complexity Procedures Information Subsystem (APAC) of the state of Rio de Janeiro for the period from 1999 to 2004. A methodology composed of eight phases was proposed for name and address segmentation, using routines implemented in PL/SQL and the JAHMM library, a Java implementation of HMM algorithms. A random sample of 100 records from each database was used to verify the correctness of the segmentation produced by the HMM. To assess the effect of HMM-based name segmentation, three linkage processes were applied to a sample of the two databases mentioned above, each using a different segmentation strategy, namely: (1) splitting names into first part, last part, and middle-name initials; (2) splitting the name into five parts; (3) segmentation by the HMM. The HMM-based segmentation showed good agreement with a human observer. The different segmentation strategies produced quite similar record-linkage results, with strategy 1 performing slightly better than the others. This study suggests that segmenting Brazilian names with a Hidden Markov Model is no more effective than traditional segmentation methods.
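The HMM segmentation step can be illustrated with a toy Viterbi decoder. The states, transition probabilities, and emission heuristics below are hand-written stand-ins for a trained JAHMM model:

```python
# Tiny HMM tagger for name segmentation: hidden states label each token as
# first name, middle initial, or surname; Viterbi recovers the most likely
# labeling. All probabilities are illustrative, not trained values.
states = ["FIRST", "MIDDLE", "LAST"]
start = {"FIRST": 0.9, "MIDDLE": 0.05, "LAST": 0.05}
trans = {"FIRST": {"FIRST": 0.1, "MIDDLE": 0.5, "LAST": 0.4},
         "MIDDLE": {"FIRST": 0.05, "MIDDLE": 0.25, "LAST": 0.7},
         "LAST": {"FIRST": 0.05, "MIDDLE": 0.05, "LAST": 0.9}}

def emit(state, token):
    # Hand-written emission heuristic: short tokens look like initials.
    if len(token) <= 2:
        return 0.8 if state == "MIDDLE" else 0.1
    return 0.45 if state in ("FIRST", "LAST") else 0.1

def viterbi(tokens):
    V = [{s: start[s] * emit(s, tokens[0]) for s in states}]
    back = []
    for tok in tokens[1:]:
        row, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda q: V[-1][q] * trans[q][s])
            row[s] = V[-1][prev] * trans[prev][s] * emit(s, tok)
            ptr[s] = prev
        V.append(row)
        back.append(ptr)
    best = max(states, key=lambda s: V[-1][s])
    path = [best]
    for ptr in reversed(back):  # follow back-pointers to the start
        path.append(ptr[path[-1]])
    return list(reversed(path))

labels = viterbi(["MARIA", "S", "SILVA"])  # -> FIRST, MIDDLE, LAST
```

In the study, the emission and transition probabilities would be learned from annotated name/address records instead of being hand-set.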
Abstract:
This work evaluates the behavior of fiscal multipliers in Brazil between 1999 and 2012. To that end, it uses the methodology developed by Sims, Waggoner and Zha (2008), a Bayesian estimation procedure in which the model parameters change with the state of the economy and the states (regimes) follow a Markov regime-switching process. That is, a Markov Switching Structural Bayesian Vector Autoregression (MS-SBVAR) was estimated. The dataset comprises general government consumption, general government gross fixed capital formation, the net tax burden, and Gross Domestic Product (GDP), covering the three levels of government (federal, state, including the Federal District, and municipal). The models were estimated with MATLAB/Dynare, and the results suggest the occurrence of 2 or 3 regimes in the two models that best fit the data. The estimated multipliers had the expected signs. GDP responds more strongly to shocks to government gross fixed capital formation, which are effective (multipliers greater than one, with a long-run impact on GDP), than to shocks to government consumption, which have little persistence and are ineffective (multipliers smaller than one); the response of GDP to shocks to the net tax burden is negative and persistent. The results do not indicate larger fiscal multipliers in regimes with higher residual variance.
Abstract:
Atlantic Croaker (Micropogonias undulatus) production dynamics along the U.S. Atlantic coast are regulated by fishing and winter water temperature. Stakeholders for this resource have recommended investigating the effects of climate covariates in assessment models. This study used state-space biomass dynamic models without (model 1) and with (model 2) the minimum winter estuarine temperature (MWET) to examine MWET effects on Atlantic Croaker population dynamics during 1972–2008. In model 2, MWET was introduced into the intrinsic rate of population increase (r). For both models, a prior probability distribution (prior) was constructed for r or a scaling parameter (r0); inputs were the fishery removals and fall biomass indices developed by using data from the Multispecies Bottom Trawl Survey of the Northeast Fisheries Science Center, National Marine Fisheries Service, and the Coastal Trawl Survey of the Southeast Area Monitoring and Assessment Program. Model sensitivity runs incorporated a uniform (0.01, 1.5) prior for r or r0 and bycatch data from the shrimp-trawl fishery. All model variants produced similar results and therefore supported the conclusion of low risk of overfishing for the Atlantic Croaker stock in the 2000s. However, the data statistically supported only model 1 and its configuration that included the shrimp-trawl fishery bycatch. The process errors of these models showed slightly positive and significant correlations with MWET, indicating that warmer winters would enhance Atlantic Croaker biomass production. Inconclusive, somewhat conflicting results indicate that biomass dynamic models should not integrate MWET, pending, perhaps, accumulation of longer time series of the variables controlling the production dynamics of Atlantic Croaker, preferably including winter-induced estimates of Atlantic Croaker kills.
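The contrast between the two models can be sketched with a deterministic Schaefer-type surplus-production update. All parameters and the MWET coefficient beta are hypothetical, and the state-space error structure is omitted: model 1 keeps r constant, while model 2 scales r with MWET.

```python
import math

# Schaefer-type surplus-production update with hypothetical parameters,
# omitting the process and observation errors of the state-space models:
# model 1 keeps r constant; model 2 scales r with the MWET covariate via
# an assumed coefficient beta.
def project(B0, K, r0, catches, mwet=None, beta=0.0):
    B, traj = B0, []
    for t, C in enumerate(catches):
        r = r0 * math.exp(beta * mwet[t]) if mwet is not None else r0
        B = max(B + r * B * (1.0 - B / K) - C, 1e-6)  # keep biomass positive
        traj.append(B)
    return traj

catches = [50.0] * 20
warm_winters = [1.0] * 20  # standardized MWET anomalies (made up)
model1 = project(500.0, 1000.0, 0.4, catches)
model2 = project(500.0, 1000.0, 0.4, catches, mwet=warm_winters, beta=0.2)
```

With beta > 0, warmer winters raise r and hence the equilibrium biomass, which is the qualitative pattern suggested by the positive process-error/MWET correlations reported above.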
Abstract:
MOTIVATION: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. RESULTS: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs. AVAILABILITY: If interested in the code for the work presented in this article, please contact the authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Abstract:
Understanding the regulatory mechanisms that are responsible for an organism's response to environmental change is an important issue in molecular biology. A first and important step towards this goal is to detect genes whose expression levels are affected by altered external conditions. A range of methods to test for differential gene expression, both in static as well as in time-course experiments, have been proposed. While these tests answer the question whether a gene is differentially expressed, they do not explicitly address the question when a gene is differentially expressed, although this information may provide insights into the course and causal structure of regulatory programs. In this article, we propose a two-sample test for identifying intervals of differential gene expression in microarray time series. Our approach is based on Gaussian process regression, can deal with arbitrary numbers of replicates, and is robust with respect to outliers. We apply our algorithm to study the response of Arabidopsis thaliana genes to an infection by a fungal pathogen using a microarray time series dataset covering 30,336 gene probes at 24 observed time points. In classification experiments, our test compares favorably with existing methods and provides additional insights into time-dependent differential expression.
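The underlying GP machinery can be sketched as follows. The RBF kernel, its hyperparameters, and the noise level are arbitrary choices, and this shows only the posterior mean, the building block of the interval test, not the test itself:

```python
import numpy as np

rng = np.random.default_rng(3)

# Gaussian process regression with an RBF kernel: the posterior mean at the
# test points is K(test, train) @ inv(K(train, train) + noise I) @ y.
def rbf(a, b, length=1.0, var=1.0):
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior_mean(x_train, y_train, x_test, noise=0.1):
    K = rbf(x_train, x_train) + noise * np.eye(x_train.size)
    return rbf(x_test, x_train) @ np.linalg.solve(K, y_train)

x = np.linspace(0.0, 2.0 * np.pi, 24)          # observed time points
y = np.sin(x) + rng.normal(0.0, 0.1, x.size)   # noisy expression profile
mu = gp_posterior_mean(x, y, x)
fit_rmse = float(np.sqrt(np.mean((mu - np.sin(x)) ** 2)))
```

A two-sample interval test built on this machinery would compare, over candidate time windows, a single GP fit to the pooled replicates against separate fits per condition.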
Abstract:
In this paper, we present two classes of Bayesian approaches to the two-sample problem. Our first class of methods extends the Bayesian t-test to include all parametric models in the exponential family and their conjugate priors. Our second class of methods uses Dirichlet process mixtures (DPM) of such conjugate-exponential distributions as flexible nonparametric priors over the unknown distributions.
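For the conjugate-Gaussian case with known variance, the first class of methods reduces to an exact marginal-likelihood comparison. The sketch below, with hypothetical data and prior variance, computes the log Bayes factor of "separate means per group" versus "one shared mean":

```python
import math

# Exact two-sample Bayes factor for Gaussian data with known variance sigma2
# and a conjugate N(0, tau2) prior on the mean(s). Data and prior settings
# are hypothetical. A positive log-BF favours "separate means per group".
def log_marginal(xs, sigma2=1.0, tau2=4.0):
    n, s = len(xs), sum(xs)
    ss = sum(x * x for x in xs)
    var_post = 1.0 / (n / sigma2 + 1.0 / tau2)  # posterior variance of mean
    return (-0.5 * n * math.log(2.0 * math.pi * sigma2)
            - 0.5 * ss / sigma2
            + 0.5 * (s / sigma2) ** 2 * var_post
            + 0.5 * math.log(var_post / tau2))

def log_bayes_factor(xs, ys):
    return log_marginal(xs) + log_marginal(ys) - log_marginal(xs + ys)

lbf_same = log_bayes_factor([0.1, -0.2, 0.05], [0.0, 0.15, -0.1])  # < 0
lbf_diff = log_bayes_factor([0.1, -0.2, 0.05], [3.0, 3.2, 2.9])    # > 0
```

The second class of methods replaces these parametric marginals with Dirichlet process mixtures of such conjugate-exponential components, trading closed forms for flexibility.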
Abstract:
Deep belief networks are a powerful way to model complex probability distributions. However, learning the structure of a belief network, particularly one with hidden units, is difficult. The Indian buffet process has been used as a nonparametric Bayesian prior on the directed structure of a belief network with a single infinitely wide hidden layer. In this paper, we introduce the cascading Indian buffet process (CIBP), which provides a nonparametric prior on the structure of a layered, directed belief network that is unbounded in both depth and width, yet allows tractable inference. We use the CIBP prior with the nonlinear Gaussian belief network so each unit can additionally vary its behavior between discrete and continuous representations. We provide Markov chain Monte Carlo algorithms for inference in these belief networks and explore the structures learned on several image data sets.
Abstract:
We define a copula process which describes the dependencies between arbitrarily many random variables independently of their marginal distributions. As an example, we develop a stochastic volatility model, Gaussian Copula Process Volatility (GCPV), to predict the latent standard deviations of a sequence of random variables. To make predictions we use Bayesian inference, with the Laplace approximation, and with Markov chain Monte Carlo as an alternative. We find both methods comparable. We also find our model can outperform GARCH on simulated and financial data. And unlike GARCH, GCPV can easily handle missing data, incorporate covariates other than time, and model a rich class of covariance structures.
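The copula idea, dependence specified independently of the margins, can be sketched with a plain bivariate Gaussian copula. The margins and correlation below are chosen arbitrarily; this is the building block, not the GCPV model itself:

```python
import math
import random

random.seed(4)

# Bivariate Gaussian copula: dependence comes from correlated normals; the
# margins (here Exponential(1) and Uniform(0, 1)) are chosen independently.
def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sample_pair(rho):
    z1 = random.gauss(0, 1)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * random.gauss(0, 1)
    u1, u2 = std_normal_cdf(z1), std_normal_cdf(z2)
    return -math.log(1.0 - u1), u2  # inverse-CDF transform to the margins

pairs = [sample_pair(0.9) for _ in range(4000)]
xs = [p[0] for p in pairs]
us = [p[1] for p in pairs]
mx, mu_ = sum(xs) / len(xs), sum(us) / len(us)
cov = sum((a - mx) * (b - mu_) for a, b in zip(xs, us)) / len(xs)
```

A copula process extends this construction from two variables to arbitrarily many, which is what lets GCPV model dependent latent volatilities with freely chosen marginals.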
Abstract:
A nonparametric Bayesian extension of Factor Analysis (FA) is proposed where observed data $\mathbf{Y}$ is modeled as a linear superposition, $\mathbf{G}$, of a potentially infinite number of hidden factors, $\mathbf{X}$. The Indian Buffet Process (IBP) is used as a prior on $\mathbf{G}$ to incorporate sparsity and to allow the number of latent features to be inferred. The model's utility for modeling gene expression data is investigated using randomly generated data sets based on a known sparse connectivity matrix for E. coli, and on three biological data sets of increasing complexity.
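A quick simulation of the IBP prior on the binary loading matrix G (concentration parameter alpha chosen arbitrarily) shows how the number of latent features is left unbounded:

```python
import math
import random

random.seed(5)

# Draw a binary loading matrix G from the Indian Buffet Process: row i
# (a "customer", here a gene) takes existing feature m with probability
# counts[m] / i and then adds Poisson(alpha / i) brand-new features.
def sample_ibp(n_rows, alpha=2.0):
    counts, rows = [], []   # counts[m] = rows so far using feature m
    for i in range(1, n_rows + 1):
        row = [1 if random.random() < c / i else 0 for c in counts]
        # Knuth's method for a Poisson(alpha / i) draw of new features
        target, L, new = math.exp(-alpha / i), 1.0, 0
        while True:
            L *= random.random()
            if L <= target:
                break
            new += 1
        counts = [c + r for c, r in zip(counts, row)] + [1] * new
        rows.append(row + [1] * new)
    width = len(counts)
    return [r + [0] * (width - len(r)) for r in rows], counts

G, counts = sample_ibp(30, alpha=2.0)
```

The expected number of active features grows only logarithmically with the number of rows, which is what makes the "potentially infinite" prior tractable in practice.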