265 results for Jeffreys priors
Abstract:
This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov chain Monte Carlo (MCMC) sampling techniques, and the related label switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via the Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test-based estimation methods, whereby priors are chosen to encourage extra groups to have weights approaching zero. MCMC sampling is made possible by the implementation of prior parallel tempering, an extension of parallel tempering. Zmix can accurately estimate the number of components, posterior parameter estimates and allocation probabilities given a sufficiently large sample size. The results reflect uncertainty in the final model and report the range of possible candidate models and their respective estimated probabilities from a single run. Label switching is resolved with a computationally lightweight method, Zswitch, developed for overfitted mixtures by exploiting the intuitiveness of allocation-based relabelling algorithms and the precision of label-invariant loss functions. Four simulation studies are included to illustrate Zmix and Zswitch, as well as three case studies from the literature. All methods are available as part of the R package Zmix, which can currently be applied to univariate Gaussian mixture models.
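The shrinkage behaviour that this kind of overfitting approach relies on can be illustrated outside the package: under a symmetric Dirichlet prior on the mixture weights, a small concentration parameter pushes superfluous components towards negligible weight a priori. A minimal Python sketch (illustrative only; Zmix itself is an R package and fits the mixture by MCMC, not by prior simulation):

```python
import random

def dirichlet_weights(alpha, k, rng):
    # Symmetric Dirichlet(alpha) draw via normalised Gamma variates
    g = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    s = sum(g)
    return [x / s for x in g]

def mean_active(alpha, k, thresh=0.01, n=2000, seed=1):
    # Average number of components whose weight exceeds `thresh`
    rng = random.Random(seed)
    total = sum(sum(w > thresh for w in dirichlet_weights(alpha, k, rng))
                for _ in range(n))
    return total / n

# With 10 components, a flat Dirichlet(1) keeps most weights non-negligible,
# while Dirichlet(0.01) empties most of the extra groups a priori.
print(mean_active(1.0, 10), mean_active(0.01, 10))
```

The threshold and concentration values are arbitrary choices for the demonstration, not those used by Zmix.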
Abstract:
The quality of species distribution models (SDMs) relies to a large degree on the quality of the input data, from bioclimatic indices to environmental and habitat descriptors (Austin, 2002). Recent reviews of SDM techniques have sought to optimize predictive performance (e.g. Elith et al., 2006). In general, SDMs employ one of three approaches to variable selection. The simplest approach relies on the expert to select the variables, as in environmental niche models (Nix, 1986) or a generalized linear model without variable selection (Miller and Franklin, 2002). A second approach explicitly incorporates variable selection into model fitting, which allows examination of particular combinations of variables. Examples include generalized linear or additive models with variable selection (Hastie et al., 2002), or classification trees with complexity- or model-based pruning (Breiman et al., 1984; Zeileis, 2008). A third approach uses model averaging to summarize the overall contribution of a variable, without considering particular combinations. Examples include neural networks, boosted or bagged regression trees, and Maximum Entropy, as compared in Elith et al. (2006). Typically, users of SDMs will either consider a small number of variable sets, via the first approach, or else supply all of the candidate variables (often numbering more than a hundred) to the second or third approaches. Bayesian SDMs exist, with several methods for eliciting and encoding priors on model parameters (see the review in Low Choy et al., 2010). However, few methods have been published for informative variable selection; one example is Bayesian trees (O'Leary, 2008). Here we report an elicitation protocol that helps make explicit a priori expert judgements on the quality of candidate variables. This protocol can be flexibly applied to any of the three approaches to variable selection described above, Bayesian or otherwise.
We demonstrate how this information can be obtained and then used to guide variable selection in classical or machine learning SDMs, or to define priors within Bayesian SDMs.
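One simple way to turn elicited quality judgements into priors, hypothetical but in the spirit of the protocol described, is to map each variable's quality score onto a prior inclusion probability for Bayesian variable selection. The mapping, bounds, and variable names below are assumptions for illustration, not the paper's:

```python
def inclusion_priors(quality, floor=0.05, ceiling=0.95):
    """Map expert quality scores in [0, 1] to prior inclusion
    probabilities, bounded away from 0 and 1 so the data can
    still overrule the expert judgement."""
    return {var: floor + (ceiling - floor) * s
            for var, s in quality.items()}

# Hypothetical elicited scores for three candidate covariates
priors = inclusion_priors({"rainfall": 0.9, "slope": 0.5, "soil_ph": 0.1})
```

Bounding the probabilities away from 0 and 1 keeps every candidate variable in play, so a strongly informative likelihood can override a poor expert rating.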
Abstract:
Early detection surveillance programs aim to find invasions of exotic plant pests and diseases before they are too widespread to eradicate. However, the value of these programs can be difficult to justify when no positive detections are made. To demonstrate the value of pest absence information provided by these programs, we use a hierarchical Bayesian framework to model estimates of incursion extent with and without surveillance. A model for the latent invasion process provides the baseline against which surveillance data are assessed. Ecological knowledge and pest management criteria are introduced into the model using informative priors for invasion parameters. Observation models assimilate information from spatio-temporal presence/absence data to accommodate imperfect detection and generate posterior estimates of pest extent. When applied to an early detection program operating in Queensland, Australia, the framework demonstrates that this typical surveillance regime provides a modest reduction in the estimated probability that a surveyed district is infested. More importantly, the model suggests that early detection surveillance programs can provide a dramatic reduction in the putative area of incursion and therefore offer a substantial benefit to incursion management. By mapping spatial estimates of the point probability of infestation, the model identifies where future surveillance resources can be most effectively deployed.
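Stripped of the spatio-temporal hierarchy, the core calculation is a Bayes update of the probability that a site is infested given repeated negative surveys under imperfect detection. A single-site sketch (the prior and sensitivity values are illustrative, not the paper's):

```python
def posterior_infested(prior, sensitivity, n_negative):
    """P(infested | n_negative clear surveys), assuming surveys are
    independent and each detects an infestation with prob `sensitivity`."""
    miss = (1.0 - sensitivity) ** n_negative
    return prior * miss / (prior * miss + (1.0 - prior))

# Repeated clear surveys shrink a 10% prior belief of infestation
for n in (0, 1, 5):
    print(n, round(posterior_infested(0.10, 0.30, n), 4))
```

Even with a modest per-survey detection probability, the absence information accumulates multiplicatively, which is the sense in which "no positive detections" still carries value.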
Abstract:
Motivated by the analysis of the Australian Grain Insect Resistance Database (AGIRD), we develop a Bayesian hurdle modelling approach to assess trends in strong resistance of stored grain insects to phosphine over time. The binary response variable from AGIRD indicating presence or absence of strong resistance is characterized by a majority of absence observations, and the hurdle model is a two-step approach that is useful when analyzing such a binary response dataset. The proposed hurdle model utilizes Bayesian classification trees to first identify covariates and covariate levels pertaining to possible presence or absence of strong resistance. Second, generalized additive models (GAMs) with spike and slab priors for variable selection are fitted to the subset of the dataset identified by the Bayesian classification tree as indicating possible presence of strong resistance. From the GAM we assess trends, biosecurity issues and site-specific variables influencing the presence of strong resistance using a variable selection approach. The proposed Bayesian hurdle model is compared to its frequentist counterpart, and also to a naive Bayesian approach which fits a GAM to the entire dataset. The Bayesian hurdle model has the benefit of providing a set of good trees for use in the first step and appears to provide enough flexibility to represent the influence of variables on strong resistance compared to the frequentist model, but also captures the subtle changes in the trend that are missed by the frequentist and naive Bayesian models. © 2014 Springer Science+Business Media New York.
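The two-step structure can be sketched generically: a first-stage classifier isolates the strata where presence is plausible, and a second-stage model is fitted only there. Below, a single-split stump stands in for the Bayesian classification tree and a per-year presence rate stands in for the GAM; both stand-ins, and the toy records, are simplifications rather than the paper's models:

```python
def hurdle_fit(rows, split_var, split_level):
    # Stage 1: flag records in the stratum where strong resistance
    # is deemed possible (a one-split stand-in for a classification tree)
    flagged = [r for r in rows if r[split_var] == split_level]
    # Stage 2: model the trend within the flagged subset only
    # (a per-year presence rate standing in for the GAM)
    years = sorted({r["year"] for r in flagged})
    trend = {y: sum(r["resistant"] for r in flagged if r["year"] == y)
                / sum(1 for r in flagged if r["year"] == y)
             for y in years}
    return flagged, trend

rows = [
    {"site": "silo", "year": 2001, "resistant": 0},
    {"site": "silo", "year": 2002, "resistant": 1},
    {"site": "farm", "year": 2001, "resistant": 0},
]
flagged, trend = hurdle_fit(rows, "site", "silo")
```

The point of the split is that the second-stage model never sees the strata dominated by structural absences, which is what makes the hurdle approach suitable for a response with a large majority of zeros.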
Abstract:
We use Bayesian model selection techniques to test extensions of the standard flat LambdaCDM paradigm. Dark-energy and curvature scenarios, and primordial perturbation models are considered. To that end, we calculate the Bayesian evidence in favour of each model using Population Monte Carlo (PMC), a new adaptive sampling technique which was recently applied in a cosmological context. The Bayesian evidence is immediately available from the PMC sample used for parameter estimation without further computational effort, and it comes with an associated error evaluation. Moreover, it provides an unbiased estimator of the evidence after any fixed number of iterations, and it is naturally parallelizable, in contrast with MCMC and nested sampling methods. By comparison with analytical predictions for simulated data, we show that our results obtained with PMC are reliable and robust. The variability in the evidence evaluation and the stability for various cases are estimated both from simulations and from data. For the cases we consider, the log-evidence is calculated with a precision of better than 0.08. Using a combined set of recent CMB, SNIa and BAO data, we find inconclusive evidence between flat LambdaCDM and simple dark-energy models. A curved Universe is moderately to strongly disfavoured with respect to a flat cosmology. Using physically well-motivated priors within the slow-roll approximation of inflation, we find a weak preference for a running spectral index. A Harrison-Zel'dovich spectrum is weakly disfavoured. With the current data, tensor modes are not detected; the large prior volume on the tensor-to-scalar ratio r results in moderate evidence in favour of r=0.
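The evidence estimate PMC delivers is, at each iteration, an importance-sampling average of likelihood times prior over the adapted proposal. A fixed-proposal sketch on a toy conjugate model where the evidence is known analytically (the model and proposal are illustrative; PMC additionally adapts the proposal across iterations):

```python
import math, random

def log_norm_pdf(x, mean, var):
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def log_evidence(x_obs, n_draws, seed=0):
    # Toy model: x ~ N(theta, 1), prior theta ~ N(0, 1). Setting the
    # proposal to the exact posterior N(x/2, 1/2) makes the importance
    # weights constant, so the estimate has zero variance here.
    rng = random.Random(seed)
    q_mean, q_var = x_obs / 2, 0.5
    total = 0.0
    for _ in range(n_draws):
        th = rng.gauss(q_mean, math.sqrt(q_var))
        log_w = (log_norm_pdf(x_obs, th, 1.0)        # likelihood
                 + log_norm_pdf(th, 0.0, 1.0)        # prior
                 - log_norm_pdf(th, q_mean, q_var))  # proposal
        total += math.exp(log_w)
    return math.log(total / n_draws)

# Analytic evidence: x is marginally N(0, 2)
exact = log_norm_pdf(0.0, 0.0, 2.0)
approx = log_evidence(0.0, 500)
```

With a mismatched proposal the estimator stays unbiased but its variance grows, which is exactly what PMC's adaptation step is designed to suppress.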
Abstract:
Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling under various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox assembled herein, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly for justifying the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can be applied also under sampling from a finite population. The main emphasis here is on probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example.
The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.
Abstract:
The increased accuracy of cosmological observations, especially measurements of the cosmic microwave background, allows us to study the primordial perturbations in greater detail. In this thesis, we allow the possibility of correlated isocurvature perturbations alongside the usual adiabatic perturbations. Thus far the simplest six-parameter \Lambda CDM model has been able to accommodate all the observational data rather well. However, we find that the 3-year WMAP data and the 2006 Boomerang data favour a nonzero nonadiabatic contribution to the CMB angular power spectrum. This is a primordial isocurvature perturbation that is positively correlated with the primordial curvature perturbation. Compared with the adiabatic \Lambda CDM model we have four additional parameters describing the increased complexity of the primordial perturbations. Our best-fit model has a 4% nonadiabatic contribution to the CMB temperature variance and the fit is improved by \Delta\chi^2 = 9.7. We can attribute this preference for isocurvature to a feature in the peak structure of the angular power spectrum, namely, the widths of the second and third acoustic peaks. Along the way, we have improved our analysis methods by identifying some issues with the parametrisation of the primordial perturbation spectra and suggesting ways to handle these. Owing to these improvements, the convergence of our Markov chains is improved. The change of parametrisation has an effect on the MCMC analysis because of the change in priors. We have checked our results against this and find only marginal differences between parametrisations.
Abstract:
In positron emission tomography (PET), image reconstruction is a demanding problem. Since PET image reconstruction is an ill-posed inverse problem, new methodologies need to be developed. Although previous studies show that incorporation of spatial and median priors improves image quality, artifacts such as over-smoothing and streaking are evident in the reconstructed image. In this work, we use a simple yet powerful technique to tackle the PET image reconstruction problem. The proposed technique is based on the integration of the Bayesian approach with a finite impulse response (FIR) filter. An FIR filter is designed whose coefficients are determined from a surface diffusion model. The resulting reconstructed image is iteratively filtered and fed back to obtain the new estimate. Experiments are performed on a simulated PET system. The results show that the proposed approach is better than the recently proposed MRP algorithm in terms of image quality and normalized mean square error.
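A toy 1D analogue of the filter-and-feed-back scheme: one MLEM update for Poisson projection data, followed by a 3-tap FIR smoothing of the estimate before the next iteration. The system matrix and filter taps here are illustrative placeholders; the paper derives its FIR coefficients from a surface diffusion model:

```python
def mlem_step(x, A, y):
    # One MLEM update for y ~ Poisson(A x)
    m, n = len(A), len(A[0])
    proj = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    back = [sum(A[i][j] * y[i] / proj[i] for i in range(m)) for j in range(n)]
    sens = [sum(A[i][j] for i in range(m)) for j in range(n)]
    return [x[j] * back[j] / sens[j] for j in range(n)]

def fir_smooth(x, taps=(0.25, 0.5, 0.25)):
    # Symmetric 3-tap FIR filter with edge replication
    n = len(x)
    return [taps[0] * x[max(j - 1, 0)] + taps[1] * x[j]
            + taps[2] * x[min(j + 1, n - 1)] for j in range(n)]

def reconstruct(A, y, n_iter=20):
    x = [1.0] * len(A[0])
    for _ in range(n_iter):
        x = fir_smooth(mlem_step(x, A, y))  # filtered estimate fed back
    return x
```

Filtering inside the iteration loop, rather than once after convergence, is what lets the regularization suppress noise without the global over-smoothing the abstract criticizes.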
Abstract:
The stochastic behavior of an aero-engine failure/repair process is analyzed from a Bayesian perspective. The numbers of failures/repairs in the component-sockets of this multi-component system are assumed to follow independent renewal processes with Weibull inter-arrival times. Based on the field failure/repair data of a large number of such engines, and on independent Gamma priors on the scale parameters and log-concave priors on the shape parameters, an exact method of sampling from the resulting posterior distributions of the parameters is proposed. The generated parameter values are then utilised to obtain the posteriors of the expected number of system repairs, the system failure rate, and the conditional intensity function, which are computed using a recursive formula.
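The conjugacy such an exact sampler can exploit for the scale parameters is easy to see in isolation: conditional on the shape k, t^k is exponentially distributed, so a Gamma prior on the rate is conjugate. A sketch conditioning on a known shape (the paper also samples the shape parameters under log-concave priors, which this omits):

```python
import random

def weibull_rate_posterior(times, shape, a0, b0):
    # f(t) = k*lam*t^(k-1)*exp(-lam*t^k): with k known, t^k ~ Exp(lam),
    # so a Gamma(a0, b0) prior on lam updates to Gamma(a0+n, b0+sum t^k)
    return a0 + len(times), b0 + sum(t ** shape for t in times)

def sample_rate(times, shape, a0, b0, n_draws, seed=0):
    # Draw from the Gamma posterior of the rate parameter
    a, b = weibull_rate_posterior(times, shape, a0, b0)
    rng = random.Random(seed)
    return [rng.gammavariate(a, 1.0 / b) for _ in range(n_draws)]
```

Posterior draws of the rates can then be pushed through the renewal-process recursions to get posteriors of the expected number of repairs or the intensity function, as the abstract describes.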
Abstract:
Background: Temporal analysis of gene expression data has been limited to identifying genes whose expression varies with time and/or correlation between genes that have similar temporal profiles. Often, the methods do not consider the underlying network constraints that connect the genes. It is becoming increasingly evident that interactions change substantially with time. Thus far, there is no systematic method to relate the temporal changes in gene expression to the dynamics of interactions between them. Information on interaction dynamics would open up possibilities for discovering new mechanisms of regulation by providing valuable insight into identifying time-sensitive interactions as well as permitting studies on the effect of a genetic perturbation. Results: We present NETGEM, a tractable model rooted in Markov dynamics, for analyzing the dynamics of the interactions between proteins based on the dynamics of the expression changes of the genes that encode them. The model treats the interaction strengths as random variables which are modulated by suitable priors. This approach is necessitated by the extremely small sample size of the datasets, relative to the number of interactions. The model is amenable to a linear time algorithm for efficient inference. Using temporal gene expression data, NETGEM was successful in identifying (i) temporal interactions and determining their strength, (ii) functional categories of the actively interacting partners and (iii) dynamics of interactions in perturbed networks. Conclusions: NETGEM represents an optimal trade-off between model complexity and data requirement. It was able to deduce actively interacting genes and functional categories from temporal gene expression data. It permits inference by incorporating the information available in perturbed networks.
Given that the inputs to NETGEM are only the network and the temporal variation of the nodes, this algorithm promises to have widespread applications, beyond biological systems. The source code for NETGEM is available from https://github.com/vjethava/NETGEM
Abstract:
The widely used Bayesian classifier is based on the assumption of equal prior probabilities for all the classes. However, equal prior probabilities may not guarantee high classification accuracy for the individual classes. Here, we propose a novel technique, the Hybrid Bayesian Classifier (HBC), where the class prior probabilities are determined by unmixing supplemental low spatial, high spectral resolution multispectral (MS) data and are assigned to every pixel of high spatial, low spectral resolution MS data in Bayesian classification. This is demonstrated with two separate experiments. First, class abundances are estimated per pixel by unmixing Moderate Resolution Imaging Spectroradiometer data to be used as prior probabilities, while posterior probabilities are determined from training data obtained from the ground; these are used to classify the Indian Remote Sensing Satellite LISS-III MS data with the Bayesian classifier. In the second experiment, abundances obtained by unmixing Landsat Enhanced Thematic Mapper Plus data are used as priors, and posterior probabilities are determined from the ground data to classify IKONOS MS images with the Bayesian classifier. The results indicate that HBC systematically exploited the information from the two image sources, improving the overall accuracy of the LISS-III MS classification by 6% and the IKONOS MS classification by 9%. Inclusion of prior probabilities increased the average producer's and user's accuracies by 5.5% and 6.5% for LISS-III MS with six classes, and by 12.5% and 5.4% for IKONOS MS with five classes.
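Mechanically, the hybrid step reduces to an ordinary Bayes rule in which the usual global class priors are replaced, pixel by pixel, with the unmixed abundance fractions. A schematic sketch in which the likelihood and abundance values are made up:

```python
def classify_pixel(likelihoods, priors):
    # Posterior ∝ per-pixel prior (unmixed abundance) × class likelihood
    post = [p * l for p, l in zip(priors, likelihoods)]
    s = sum(post)
    post = [v / s for v in post]
    return max(range(len(post)), key=post.__getitem__), post

likelihoods = [0.4, 0.6]   # hypothetical class likelihoods from training data
equal_label = classify_pixel(likelihoods, [0.5, 0.5])[0]   # equal priors
abund_label = classify_pixel(likelihoods, [0.8, 0.2])[0]   # abundance priors
```

With equal priors the pixel goes to the class with the larger likelihood; an abundance prior favouring the other class can flip the decision, which is the mechanism behind the accuracy gains reported.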
Abstract:
The maximum entropy approach to classification is well studied in applied statistics and machine learning, and almost all the methods that exist in the literature are discriminative in nature. In this paper, we introduce a generative maximum entropy classification method with feature selection for large-dimensional data such as text datasets. To tackle the curse of dimensionality of large datasets, we employ a conditional independence assumption (Naive Bayes) and perform feature selection simultaneously, by enforcing a `maximum discrimination' between estimated class conditional densities. For two-class problems, the proposed method uses Jeffreys (J) divergence to discriminate the class conditional densities. To extend the method to the multi-class case, we propose a completely new approach based on a multi-distribution divergence: we replace Jeffreys divergence by Jensen-Shannon (JS) divergence to discriminate the conditional densities of multiple classes. To reduce computational complexity, we employ a modified Jensen-Shannon divergence (JS(GM)) based on the AM-GM inequality. We show that the resulting divergence is a natural generalization of Jeffreys divergence to the multiple-distribution case. As far as theoretical justification is concerned, we show that when one intends to select the best features in a generative maximum entropy approach, maximum discrimination using J-divergence emerges naturally in binary classification. Performance and comparative studies of the proposed algorithms on large-dimensional text and gene expression datasets show that our methods scale very well with large-dimensional datasets.
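The generalization claim can be checked numerically: with equal weights, replacing the Jensen-Shannon mixture by the unnormalised geometric mean reduces the two-distribution JS(GM) to a quarter of the Jeffreys divergence. Using the unnormalised geometric mean is our reading of the AM-GM construction, so treat this as a sketch of the idea rather than the paper's exact definition:

```python
import math

def kl(p, q):
    # KL divergence; q may be an unnormalised reference measure
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jeffreys(p, q):
    # Symmetrized KL divergence, J(p, q)
    return kl(p, q) + kl(q, p)

def js_gm(dists, weights):
    # Jensen-Shannon with the mixture replaced by the weighted
    # (unnormalised) geometric mean, per the AM-GM inequality
    k = len(dists[0])
    gm = [math.prod(d[i] ** w for d, w in zip(dists, weights))
          for i in range(k)]
    return sum(w * kl(d, gm) for d, w in zip(dists, weights))

p, q = [0.7, 0.3], [0.4, 0.6]
# For two equal-weight distributions, js_gm equals jeffreys(p, q) / 4
```

The identity follows because kl(p, sqrt(p*q)) = kl(p, q) / 2 term by term, so the equal-weight average gives (kl(p, q) + kl(q, p)) / 4.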
Abstract:
Consider a J-component series system which is put on Accelerated Life Test (ALT) involving K stress variables. First, a general formulation of ALT is provided for the log-location-scale family of distributions. A general stress translation function of the location parameter of the component log-lifetime distribution is proposed which can accommodate standard ones like the Arrhenius, power-rule and log-linear models as special cases. Later, the component lives are assumed to be independent Weibull random variables with a common shape parameter. A full Bayesian methodology is then developed by letting only the scale parameters of the Weibull component lives depend on the stress variables through the general stress translation function. Priors on all the parameters, namely the stress coefficients and the Weibull shape parameter, are assumed to be log-concave and independent of each other; this assumption facilitates Gibbs sampling from the joint posterior. The samples thus generated from the joint posterior are then used to obtain Bayesian point and interval estimates of the system reliability at the usage condition.
Abstract:
Many probabilistic models introduce strong dependencies between variables using a latent multivariate Gaussian distribution or a Gaussian process. We present a new Markov chain Monte Carlo algorithm for performing inference in models with multivariate Gaussian priors. Its key properties are: 1) it has simple, generic code applicable to many models, 2) it has no free parameters, 3) it works well for a variety of Gaussian process based models. These properties make our method ideal for use while model building, removing the need to spend time deriving and tuning updates for more complex algorithms.
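This abstract matches the elliptical slice sampling algorithm of Murray, Adams and MacKay; a compact implementation for a zero-mean Gaussian prior illustrates the "no free parameters" property, since only the prior sampler and the log-likelihood are supplied:

```python
import math, random

def elliptical_slice(x, sample_prior, log_lik, rng):
    """One update targeting a posterior with an N(0, Sigma) prior.
    sample_prior(rng) must return a draw nu ~ N(0, Sigma)."""
    nu = sample_prior(rng)
    log_y = log_lik(x) + math.log(rng.random())   # slice height
    theta = rng.uniform(0.0, 2.0 * math.pi)
    lo, hi = theta - 2.0 * math.pi, theta
    while True:
        # Propose on the ellipse through x and nu
        xp = [xi * math.cos(theta) + ni * math.sin(theta)
              for xi, ni in zip(x, nu)]
        if log_lik(xp) > log_y:
            return xp
        # Shrink the bracket towards theta = 0 (the current state)
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)

# 1D check: prior N(0,1), likelihood N(x; 1, 1) -> posterior N(0.5, 0.5)
rng = random.Random(0)
x, draws = [0.0], []
for _ in range(20000):
    x = elliptical_slice(x, lambda r: [r.gauss(0.0, 1.0)],
                         lambda v: -0.5 * (v[0] - 1.0) ** 2, rng)
    draws.append(x[0])
```

The shrinking bracket guarantees termination (theta = 0 recovers the current state, which always lies above the slice), which is why no step-size tuning is needed.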