863 resultados para nonparametric inference
Resumo:
State-space models are successfully used in many areas of science, engineering and economics to model time series and dynamical systems. We present a fully Bayesian approach to inference and learning (i.e. state estimation and system identification) in nonlinear nonparametric state-space models. We place a Gaussian process prior over the state transition dynamics, resulting in a flexible model able to capture complex dynamical phenomena. To enable efficient inference, we marginalize over the transition dynamics function and, instead, infer directly the joint smoothing distribution using specially tailored Particle Markov Chain Monte Carlo samplers. Once a sample from the smoothing distribution is computed, the state transition predictive distribution can be formulated analytically. Our approach preserves the full nonparametric expressivity of the model and can make use of sparse Gaussian processes to greatly reduce computational complexity.
Resumo:
BACKGROUND: Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. RESULTS: Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. CONCLUSIONS: Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Resumo:
We discuss statistical inference problems associated with identification and testability in econometrics, and we emphasize the common nature of the two issues. After reviewing the relevant statistical notions, we consider in turn inference in nonparametric models and recent developments on weakly identified models (or weak instruments). We point out that many hypotheses, for which test procedures are commonly proposed, are not testable at all, while some frequently used econometric methods are fundamentally inappropriate for the models considered. Such situations lead to ill-defined statistical problems and are often associated with a misguided use of asymptotic distributional results. Concerning nonparametric hypotheses, we discuss three basic problems for which such difficulties occur: (1) testing a mean (or a moment) under (too) weak distributional assumptions; (2) inference under heteroskedasticity of unknown form; (3) inference in dynamic models with an unlimited number of parameters. Concerning weakly identified models, we stress that valid inference should be based on proper pivotal functions —a condition not satisfied by standard Wald-type methods based on standard errors — and we discuss recent developments in this field, mainly from the viewpoint of building valid tests and confidence sets. The techniques discussed include alternative proposed statistics, bounds, projection, split-sampling, conditioning, Monte Carlo tests. The possibility of deriving a finite-sample distributional theory, robustness to the presence of weak instruments, and robustness to the specification of a model for endogenous explanatory variables are stressed as important criteria assessing alternative procedures.
Resumo:
Department of Statistics, Cochin University of Science and Technology
Resumo:
Event-related functional magnetic resonance imaging (efMRI) has emerged as a powerful technique for detecting brains' responses to presented stimuli. A primary goal in efMRI data analysis is to estimate the Hemodynamic Response Function (HRF) and to locate activated regions in human brains when specific tasks are performed. This paper develops new methodologies that are important improvements not only to parametric but also to nonparametric estimation and hypothesis testing of the HRF. First, an effective and computationally fast scheme for estimating the error covariance matrix for efMRI is proposed. Second, methodologies for estimation and hypothesis testing of the HRF are developed. Simulations support the effectiveness of our proposed methods. When applied to an efMRI dataset from an emotional control study, our method reveals more meaningful findings than the popular methods offered by AFNI and FSL. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
We study semiparametric two-step estimators which have the same structure as parametric doubly robust estimators in their second step. The key difference is that we do not impose any parametric restriction on the nuisance functions that are estimated in a first stage, but retain a fully nonparametric model instead. We call these estimators semiparametric doubly robust estimators (SDREs), and show that they possess superior theoretical and practical properties compared to generic semiparametric two-step estimators. In particular, our estimators have substantially smaller first-order bias, allow for a wider range of nonparametric first-stage estimates, rate-optimal choices of smoothing parameters and data-driven estimates thereof, and their stochastic behavior can be well-approximated by classical first-order asymptotics. SDREs exist for a wide range of parameters of interest, particularly in semiparametric missing data and causal inference models. We illustrate our method with a simulation exercise.
Resumo:
In the context of Bayesian statistical analysis, elicitation is the process of formulating a prior density f(.) about one or more uncertain quantities to represent a person's knowledge and beliefs. Several different methods of eliciting prior distributions for one unknown parameter have been proposed. However, there are relatively few methods for specifying a multivariate prior distribution and most are just applicable to specific classes of problems and/or based on restrictive conditions, such as independence of variables. Besides, many of these procedures require the elicitation of variances and correlations, and sometimes elicitation of hyperparameters which are difficult for experts to specify in practice. Garthwaite et al. (2005) discuss the different methods proposed in the literature and the difficulties of eliciting multivariate prior distributions. We describe a flexible method of eliciting multivariate prior distributions applicable to a wide class of practical problems. Our approach does not assume a parametric form for the unknown prior density f(.), instead we use nonparametric Bayesian inference, modelling f(.) by a Gaussian process prior distribution. The expert is then asked to specify certain summaries of his/her distribution, such as the mean, mode, marginal quantiles and a small number of joint probabilities. The analyst receives that information, treating it as a data set D with which to update his/her prior beliefs to obtain the posterior distribution for f(.). Theoretical properties of joint and marginal priors are derived and numerical illustrations to demonstrate our approach are given. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
A Bayesian nonparametric model for Taguchi's on-line quality monitoring procedure for attributes is introduced. The proposed model may accommodate the original single shift setting to the more realistic situation of gradual quality deterioration and allows the incorporation of an expert's opinion on the production process. Based on the number of inspections to be carried out until a defective item is found, the Bayesian operation for the distribution function that represents the increasing sequence of defective fractions during a cycle considering a mixture of Dirichlet processes as prior distribution is performed. Bayes estimates for relevant quantities are also obtained. © 2012 Elsevier B.V.
Resumo:
A Bayesian nonparametric model for Taguchi's on-line quality monitoring procedure for attributes is introduced. The proposed model may accommodate the original single shift setting to the more realistic situation of gradual quality deterioration and allows the incorporation of an expert's opinion on the production process. Based on the number of inspections to be carried out until a defective item is found, the Bayesian operation for the distribution function that represents the increasing sequence of defective fractions during a cycle considering a mixture of Dirichlet processes as prior distribution is performed. Bayes estimates for relevant quantities are also obtained. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
In this treatise we consider finite systems of branching particles where the particles move independently of each other according to d-dimensional diffusions. Particles are killed at a position dependent rate, leaving at their death position a random number of descendants according to a position dependent reproduction law. In addition particles immigrate at constant rate (one immigrant per immigration time). A process with above properties is called a branching diffusion withimmigration (BDI). In the first part we present the model in detail and discuss the properties of the BDI under our basic assumptions. In the second part we consider the problem of reconstruction of the trajectory of a BDI from discrete observations. We observe positions of the particles at discrete times; in particular we assume that we have no information about the pedigree of the particles. A natural question arises if we want to apply statistical procedures on the discrete observations: How can we find couples of particle positions which belong to the same particle? We give an easy to implement 'reconstruction scheme' which allows us to redraw or 'reconstruct' parts of the trajectory of the BDI with high accuracy. Moreover asymptotically the whole path can be reconstructed. Further we present simulations which show that our partial reconstruction rule is tractable in practice. In the third part we study how the partial reconstruction rule fits into statistical applications. As an extensive example we present a nonparametric estimator for the diffusion coefficient of a BDI where the particles move according to one-dimensional diffusions. This estimator is based on the Nadaraya-Watson estimator for the diffusion coefficient of one-dimensional diffusions and it uses the partial reconstruction rule developed in the second part above. We are able to prove a rate of convergence of this estimator and finally we present simulations which show that the estimator works well even if we leave our set of assumptions.
Resumo:
Bayesian phylogenetic analyses are now very popular in systematics and molecular evolution because they allow the use of much more realistic models than currently possible with maximum likelihood methods. There are, however, a growing number of examples in which large Bayesian posterior clade probabilities are associated with very short edge lengths and low values for non-Bayesian measures of support such as nonparametric bootstrapping. For the four-taxon case when the true tree is the star phylogeny, Bayesian analyses become increasingly unpredictable in their preference for one of the three possible resolved tree topologies as data set size increases. This leads to the prediction that hard (or near-hard) polytomies in nature will cause unpredictable behavior in Bayesian analyses, with arbitrary resolutions of the polytomy receiving very high posterior probabilities in some cases. We present a simple solution to this problem involving a reversible-jump Markov chain Monte Carlo (MCMC) algorithm that allows exploration of all of tree space, including unresolved tree topologies with one or more polytomies. The reversible-jump MCMC approach allows prior distributions to place some weight on less-resolved tree topologies, which eliminates misleadingly high posteriors associated with arbitrary resolutions of hard polytomies. Fortunately, assigning some prior probability to polytomous tree topologies does not appear to come with a significant cost in terms of the ability to assess the level of support for edges that do exist in the true tree. Methods are discussed for applying arbitrary prior distributions to tree topologies of varying resolution, and an empirical example showing evidence of polytomies is analyzed and discussed.
Resumo:
Bayesian phylogenetic analyses are now very popular in systematics and molecular evolution because they allow the use of much more realistic models than currently possible with maximum likelihood methods. There are, however, a growing number of examples in which large Bayesian posterior clade probabilities are associated with very short edge lengths and low values for non-Bayesian measures of support such as nonparametric bootstrapping. For the four-taxon case when the true tree is the star phylogeny, Bayesian analyses become increasingly unpredictable in their preference for one of the three possible resolved tree topologies as data set size increases. This leads to the prediction that hard (or near-hard) polytomies in nature will cause unpredictable behavior in Bayesian analyses, with arbitrary resolutions of the polytomy receiving very high posterior probabilities in some cases. We present a simple solution to this problem involving a reversible-jump Markov chain Monte Carlo (MCMC) algorithm that allows exploration of all of tree space, including unresolved tree topologies with one or more polytomies. The reversible-jump MCMC approach allows prior distributions to place some weight on less-resolved tree topologies, which eliminates misleadingly high posteriors associated with arbitrary resolutions of hard polytomies. Fortunately, assigning some prior probability to polytomous tree topologies does not appear to come with a significant cost in terms of the ability to assess the level of support for edges that do exist in the true tree. Methods are discussed for applying arbitrary prior distributions to tree topologies of varying resolution, and an empirical example showing evidence of polytomies is analyzed and discussed.
Resumo:
The objective of this thesis is the development of cooperative localization and tracking algorithms using nonparametric message passing techniques. In contrast to the most well-known techniques, the goal is to estimate the posterior probability density function (PDF) of the position of each sensor. This problem can be solved using Bayesian approach, but it is intractable in general case. Nevertheless, the particle-based approximation (via nonparametric representation), and an appropriate factorization of the joint PDFs (using message passing methods), make Bayesian approach acceptable for inference in sensor networks. The well-known method for this problem, nonparametric belief propagation (NBP), can lead to inaccurate beliefs and possible non-convergence in loopy networks. Therefore, we propose four novel algorithms which alleviate these problems: nonparametric generalized belief propagation (NGBP) based on junction tree (NGBP-JT), NGBP based on pseudo-junction tree (NGBP-PJT), NBP based on spanning trees (NBP-ST), and uniformly-reweighted NBP (URW-NBP). We also extend NBP for cooperative localization in mobile networks. In contrast to the previous methods, we use an optional smoothing, provide a novel communication protocol, and increase the efficiency of the sampling techniques. Moreover, we propose novel algorithms for distributed tracking, in which the goal is to track the passive object which cannot locate itself. In particular, we develop distributed particle filtering (DPF) based on three asynchronous belief consensus (BC) algorithms: standard belief consensus (SBC), broadcast gossip (BG), and belief propagation (BP). Finally, the last part of this thesis includes the experimental analysis of some of the proposed algorithms, in which we found that the results based on real measurements are very similar with the results based on theoretical models.
Resumo:
Nonparametric belief propagation (NBP) is a well-known particle-based method for distributed inference in wireless networks. NBP has a large number of applications, including cooperative localization. However, in loopy networks NBP suffers from similar problems as standard BP, such as over-confident beliefs and possible nonconvergence. Tree-reweighted NBP (TRW-NBP) can mitigate these problems, but does not easily lead to a distributed implementation due to the non-local nature of the required so-called edge appearance probabilities. In this paper, we propose a variation of TRWNBP, suitable for cooperative localization in wireless networks. Our algorithm uses a fixed edge appearance probability for every edge, and can outperform standard NBP in dense wireless networks.
Resumo:
Estimation of economic relationships often requires imposition of constraints such as positivity or monotonicity on each observation. Methods to impose such constraints, however, vary depending upon the estimation technique employed. We describe a general methodology to impose (observation-specific) constraints for the class of linear regression estimators using a method known as constraint weighted bootstrapping. While this method has received attention in the nonparametric regression literature, we show how it can be applied for both parametric and nonparametric estimators. A benefit of this method is that imposing numerous constraints simultaneously can be performed seamlessly. We apply this method to Norwegian dairy farm data to estimate both unconstrained and constrained parametric and nonparametric models.