2 resultados para Binomial Distribution
em Duke University
Resumo:
A class of multi-process models is developed for collections of time indexed count data. Autocorrelation in counts is achieved with dynamic models for the natural parameter of the binomial distribution. In addition to modeling binomial time series, the framework includes dynamic models for multinomial and Poisson time series. Markov chain Monte Carlo (MCMC) and Po ́lya-Gamma data augmentation (Polson et al., 2013) are critical for fitting multi-process models of counts. To facilitate computation when the counts are high, a Gaussian approximation to the P ́olya- Gamma random variable is developed.
Three applied analyses are presented to explore the utility and versatility of the framework. The first analysis develops a model for complex dynamic behavior of themes in collections of text documents. Documents are modeled as a “bag of words”, and the multinomial distribution is used to characterize uncertainty in the vocabulary terms appearing in each document. State-space models for the natural parameters of the multinomial distribution induce autocorrelation in themes and their proportional representation in the corpus over time.
The second analysis develops a dynamic mixed membership model for Poisson counts. The model is applied to a collection of time series which record neuron level firing patterns in rhesus monkeys. The monkey is exposed to two sounds simultaneously, and Gaussian processes are used to smoothly model the time-varying rate at which the neuron’s firing pattern fluctuates between features associated with each sound in isolation.
The third analysis presents a switching dynamic generalized linear model for the time-varying home run totals of professional baseball players. The model endows each player with an age specific latent natural ability class and a performance enhancing drug (PED) use indicator. As players age, they randomly transition through a sequence of ability classes in a manner consistent with traditional aging patterns. When the performance of the player significantly deviates from the expected aging pattern, he is identified as a player whose performance is consistent with PED use.
All three models provide a mechanism for sharing information across related series locally in time. The models are fit with variations on the P ́olya-Gamma Gibbs sampler, MCMC convergence diagnostics are developed, and reproducible inference is emphasized throughout the dissertation.
Resumo:
In regression analysis of counts, a lack of simple and efficient algorithms for posterior computation has made Bayesian approaches appear unattractive and thus underdeveloped. We propose a lognormal and gamma mixed negative binomial (NB) regression model for counts, and present efficient closed-form Bayesian inference; unlike conventional Poisson models, the proposed approach has two free parameters to include two different kinds of random effects, and allows the incorporation of prior information, such as sparsity in the regression coefficients. By placing a gamma distribution prior on the NB dispersion parameter r, and connecting a log-normal distribution prior with the logit of the NB probability parameter p, efficient Gibbs sampling and variational Bayes inference are both developed. The closed-form updates are obtained by exploiting conditional conjugacy via both a compound Poisson representation and a Polya-Gamma distribution based data augmentation approach. The proposed Bayesian inference can be implemented routinely, while being easily generalizable to more complex settings involving multivariate dependence structures. The algorithms are illustrated using real examples. Copyright 2012 by the author(s)/owner(s).