997 results for Posterior distribution


Relevance:

60.00% 60.00%

Publisher:

Abstract:

In the context of Bayesian statistical analysis, elicitation is the process of formulating a prior density f(.) about one or more uncertain quantities to represent a person's knowledge and beliefs. Several different methods of eliciting prior distributions for one unknown parameter have been proposed. However, there are relatively few methods for specifying a multivariate prior distribution, and most are applicable only to specific classes of problems and/or based on restrictive conditions, such as independence of variables. Besides, many of these procedures require the elicitation of variances and correlations, and sometimes of hyperparameters, which are difficult for experts to specify in practice. Garthwaite et al. (2005) discuss the different methods proposed in the literature and the difficulties of eliciting multivariate prior distributions. We describe a flexible method of eliciting multivariate prior distributions applicable to a wide class of practical problems. Our approach does not assume a parametric form for the unknown prior density f(.); instead, we use nonparametric Bayesian inference, modelling f(.) by a Gaussian process prior distribution. The expert is then asked to specify certain summaries of his/her distribution, such as the mean, mode, marginal quantiles and a small number of joint probabilities. The analyst receives that information, treating it as a data set D with which to update his/her prior beliefs to obtain the posterior distribution for f(.). Theoretical properties of joint and marginal priors are derived and numerical illustrations to demonstrate our approach are given. (C) 2010 Elsevier B.V. All rights reserved.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Multivariate t models are symmetric and have heavier tails than the normal distribution, an important feature for financial data. This thesis presents the Bayesian estimation of a dynamic factor model, in which the factors follow a multivariate autoregressive model, using the multivariate t distribution. Since the multivariate t distribution is complex, it is represented in this work as a mixture of a multivariate normal distribution and the square root of a chi-square distribution. This representation made it possible to derive the posterior distributions. Inference on the parameters was based on a sample from the posterior distribution obtained with the Gibbs sampler. Convergence was assessed through graphical analysis and the convergence tests of Geweke (1992) and Raftery & Lewis (1992a). The method was applied to simulated data and to the indexes of the major stock exchanges in the world.
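
The normal/chi-square representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's estimation code: it only draws from a multivariate t distribution by dividing a correlated normal vector by the square root of a scaled chi-square variable. The function names `cholesky` and `sample_mvt`, the location vector, the scale matrix and the degrees of freedom are all assumptions chosen for illustration.

```python
import math
import random


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_mvt(mu, chol, nu, rng):
    """One draw from a multivariate t with location mu, scale L L^T and nu d.o.f.

    Uses X = mu + L z / sqrt(w / nu), with z standard normal and
    w chi-square(nu), independent of z.
    """
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    # chi-square(nu) is Gamma(shape = nu / 2, scale = 2)
    w = rng.gammavariate(nu / 2.0, 2.0)
    scale = math.sqrt(nu / w)
    return [
        m + scale * sum(chol[i][k] * z[k] for k in range(i + 1))
        for i, m in enumerate(mu)
    ]


# Hypothetical example values, not taken from the thesis.
rng = random.Random(1)
L = cholesky([[4.0, 2.0], [2.0, 3.0]])
draws = [sample_mvt([1.0, -2.0], L, 5.0, rng) for _ in range(20000)]
```

Conditioning on the mixing variable `w` leaves a multivariate normal, which is what typically makes the full conditionals in a Gibbs sampler tractable.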

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

60.00% 60.00%

Publisher:

Abstract:

A new methodology is being devised for ensemble ocean forecasting using distributions of the surface wind field derived from a Bayesian Hierarchical Model (BHM). The ocean members are forced with samples from the posterior distribution of the wind during the assimilation of satellite and in-situ ocean data. The initial condition perturbations are then consistent with the best available knowledge of the ocean state at the beginning of the forecast and amplify the ocean response to uncertainty only in the forcing. The ECMWF Ensemble Prediction System (EPS) surface winds are also used to generate a reference ocean ensemble to evaluate the performance of the BHM method, which proves to be effective in concentrating the forecast uncertainty at the ocean meso-scale. An eight-month experiment of weekly BHM ensemble forecasts was performed in the framework of the operational Mediterranean Forecasting System. The statistical properties of the ensemble are compared with model errors throughout the seasonal cycle, proving the existence of a strong relationship between forecast uncertainties due to atmospheric forcing and the seasonal cycle.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

In my PhD thesis I propose a Bayesian nonparametric estimation method for structural econometric models where the functional parameter of interest describes the economic agent's behavior. The structural parameter is characterized as the solution of a functional equation or, in more technical terms, as the solution of an inverse problem that can be either ill-posed or well-posed. From a Bayesian point of view, the parameter of interest is a random function and the solution to the inference problem is the posterior distribution of this parameter. A regular version of the posterior distribution in functional spaces is characterized. However, the infinite dimension of the spaces considered causes a problem of non-continuity of the solution and hence a problem of inconsistency, from a frequentist point of view, of the posterior distribution (i.e., a problem of ill-posedness). The contribution of this essay is to propose new methods to deal with this problem of ill-posedness. The first consists of adopting a Tikhonov regularization scheme in the construction of the posterior distribution, so that I end up with a new object that I call the regularized posterior distribution and that I conjecture to be a solution of the inverse problem. The second approach consists of specifying a prior distribution of the g-prior type on the parameter of interest. I then identify a class of models for which the prior distribution is able to correct for the ill-posedness even in infinite-dimensional problems. I study the asymptotic properties of these proposed solutions and prove that, under some regularity conditions satisfied by the true value of the parameter of interest, they are consistent in a "frequentist" sense. Having set out the general theory, I apply my Bayesian nonparametric methodology to different estimation problems. First, I apply this estimator to deconvolution and to hazard rate, density and regression estimation.
Then, I consider the estimation of an instrumental regression, which is useful in micro-econometrics when we have to deal with problems of endogeneity. Finally, I develop an application in finance: I obtain the Bayesian estimator for the equilibrium asset pricing functional by using the Euler equation defined in the Lucas (1978) tree-type models.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

The aim of the thesis is to formulate a suitable Item Response Theory (IRT) based model to measure HRQoL (as a latent variable) using a mixed-response questionnaire and relaxing the hypothesis of a normally distributed latent variable. The new model is a combination of two models already presented in the literature, namely a latent trait model for mixed responses and an IRT model for a Skew Normal latent variable. It is developed in a Bayesian framework, and a Markov chain Monte Carlo procedure is used to generate samples from the posterior distribution of the parameters of interest. The proposed model is tested on a questionnaire composed of 5 discrete items and one continuous item to measure HRQoL in children, the EQ-5D-Y questionnaire. A large sample of children collected in schools was used. In comparison with a model for only discrete responses and a model for mixed responses with a normal latent variable, the new model performs better in terms of the deviance information criterion (DIC), chain convergence times and precision of the estimates.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Changepoint analysis is a well-established area of statistical research, but in the context of spatio-temporal point processes it is as yet relatively unexplored. Some substantial differences with regard to standard changepoint analysis have to be taken into account: firstly, at every time point the datum is an irregular pattern of points; secondly, in real situations issues of spatial dependence between points and temporal dependence within time segments arise. Our motivating example consists of data concerning the monitoring and recovery of radioactive particles from Sandside beach, North of Scotland; there have been two major changes in the equipment used to detect the particles, representing known potential changepoints in the number of retrieved particles. In addition, offshore particle retrieval campaigns are believed to reduce the particle intensity onshore with an unknown temporal lag; in this latter case, the problem concerns multiple unknown changepoints. We therefore propose a Bayesian approach for detecting multiple changepoints in the intensity function of a spatio-temporal point process, allowing for spatial and temporal dependence within segments. We use Log-Gaussian Cox Processes, a very flexible class of models suitable for environmental applications that can be implemented using integrated nested Laplace approximation (INLA), a computationally efficient alternative to Markov chain Monte Carlo methods for approximating the posterior distribution of the parameters. Once the posterior curve is obtained, we propose a few methods for detecting significant changepoints. We present a simulation study, which consists of generating spatio-temporal point pattern series under several scenarios; the performance of the methods is assessed in terms of type I and II errors, detected changepoint locations and accuracy of the segment intensity estimates.
We finally apply the above methods to the motivating dataset and find good and sensible results about the presence and quality of changes in the process.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

We describe a Bayesian method for estimating the number of essential genes in a genome, on the basis of data on viable mutants for which a single transposon was inserted after a random TA site in a genome, potentially disrupting a gene. The prior distribution for the number of essential genes was taken to be uniform. A Gibbs sampler was used to estimate the posterior distribution. The method is illustrated with simulated data. Further simulations were used to study the performance of the procedure.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Since the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme and breast cancer are analyzed, and comparisons are made with some widely-used algorithms to illustrate the reliability and success of the technique.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Medical errors originating in health care facilities are a significant source of preventable morbidity, mortality, and healthcare costs. Voluntary error report systems that collect information on the causes and contributing factors of medical errors regardless of the resulting harm may be useful for developing effective harm prevention strategies. Some patient safety experts question the utility of data from errors that did not lead to harm to the patient, also called near misses. A near miss (a.k.a. close call) is an unplanned event that did not result in injury to the patient. Only a fortunate break in the chain of events prevented injury. We use data from a large voluntary reporting system of 836,174 medication errors from 1999 to 2005 to provide evidence that the causes and contributing factors of errors that result in harm are similar to the causes and contributing factors of near misses. We develop Bayesian hierarchical models for estimating the log odds of selecting a given cause (or contributing factor) of error given harm has occurred and the log odds of selecting the same cause given that harm did not occur. The posterior distribution of the correlation between these two vectors of log-odds is used as a measure of the evidence supporting the use of data from near misses and their causes and contributing factors to prevent medical errors. In addition, we identify the causes and contributing factors that have the highest or lowest log-odds ratio of harm versus no harm. These causes and contributing factors should also be a focus in the design of prevention strategies. This paper provides important evidence on the utility of data from near misses, which constitute the vast majority of errors in our data.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Simulation-based assessment is a popular and frequently necessary approach to evaluation of statistical procedures. Sometimes overlooked is the ability to take advantage of underlying mathematical relations and we focus on this aspect. We show how to take advantage of large-sample theory when conducting a simulation using the analysis of genomic data as a motivating example. The approach uses convergence results to provide an approximation to smaller-sample results, results that are available only by simulation. We consider evaluating and comparing a variety of ranking-based methods for identifying the most highly associated SNPs in a genome-wide association study, derive integral equation representations of the pre-posterior distribution of percentiles produced by three ranking methods, and provide examples comparing performance. These results are of interest in their own right and set the framework for a more extensive set of comparisons.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

In this thesis, we consider Bayesian inference on the detection of variance change-point models with scale mixtures of normal (SMN, for short) distributions. This class of distributions is symmetric and thick-tailed and includes as special cases the Gaussian, Student-t, contaminated normal, and slash distributions. The proposed models provide greater flexibility for analyzing practical data, which often show heavy tails and may not satisfy the normal assumption. As to the Bayesian analysis, we specify some prior distributions for the unknown parameters in the variance change-point models with the SMN distributions. Due to the complexity of the joint posterior distribution, we propose an efficient Gibbs-type sampling algorithm with Metropolis-Hastings steps for posterior Bayesian inference. Thereafter, following the idea of [1], we consider the problems of single and multiple change-point detection. The performance of the proposed procedures is illustrated and analyzed by simulation studies. A real application to the closing price data of the U.S. stock market is analyzed for illustrative purposes.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Monte Carlo simulation was used to evaluate properties of a simple Bayesian MCMC analysis of the random effects model for single group Cormack-Jolly-Seber capture-recapture data. The MCMC method is applied to the model via a logit link, so parameters p, S are on a logit scale, where logit(S) is assumed to have, and is generated from, a normal distribution with mean μ and variance σ². Marginal prior distributions on logit(p) and μ were independent normal with mean zero and standard deviation 1.75 for logit(p) and 100 for μ; hence minimally informative. Marginal prior distribution on σ² was placed on τ² = 1/σ² as a gamma distribution with α = β = 0.001. The study design has 432 points spread over 5 factors: occasions (t), new releases per occasion (u), p, μ, and σ. At each design point 100 independent trials were completed (hence 43,200 trials in total), each with sample size n = 10,000 from the parameter posterior distribution. At 128 of these design points comparisons are made to previously reported results from a method of moments procedure. We looked at properties of point and interval inference on μ and σ based on the posterior mean, median, and mode and equal-tailed 95% credibility interval. Bayesian inference did very well for the parameter μ, but under the conditions used here, MCMC inference performance for σ was mixed: poor for sparse data (i.e., only 7 occasions) or σ = 0, but good when there were sufficient data and σ was not small.
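
Two ingredients of this abstract lend themselves to a short sketch: generating survival probabilities whose logits are normal with mean μ and standard deviation σ, and summarizing a set of posterior draws by mean, median and an equal-tailed 95% interval. The code below is only an illustration under assumed names (`expit`, `simulate_survival`, `summarize`) and assumed example values; it is not the study's MCMC code, and the posterior mode is left out because it needs a density estimate.

```python
import math
import random
import statistics


def expit(x):
    """Inverse logit, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def simulate_survival(mu, sigma, n_occasions, rng):
    """Draw logit(S_i) ~ Normal(mu, sigma^2) and map back to probabilities."""
    return [expit(rng.gauss(mu, sigma)) for _ in range(n_occasions)]


def summarize(draws):
    """Posterior mean, median and equal-tailed 95% interval from draws."""
    cuts = statistics.quantiles(draws, n=40)  # cut points every 2.5%
    return {
        "mean": statistics.fmean(draws),
        "median": statistics.median(draws),
        "interval": (cuts[0], cuts[-1]),  # 2.5% and 97.5% quantiles
    }


# Hypothetical values: 7 occasions, mu = 1.0, sigma = 0.5.
survival = simulate_survival(1.0, 0.5, 7, random.Random(0))
```

In the setting described, `summarize` would be applied to the 10,000 posterior draws of μ or σ from each trial; here it accepts any list of numbers.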

Relevance:

60.00% 60.00%

Publisher:

Abstract:

We consider the problem of twenty questions with noisy answers, in which we seek to find a target by repeatedly choosing a set, asking an oracle whether the target lies in this set, and obtaining an answer corrupted by noise. Starting with a prior distribution on the target's location, we seek to minimize the expected entropy of the posterior distribution. We formulate this problem as a dynamic program and show that any policy optimizing the one-step expected reduction in entropy is also optimal over the full horizon. Two such Bayes optimal policies are presented: one generalizes the probabilistic bisection policy due to Horstein and the other asks a deterministic set of questions. We study the structural properties of the latter, and illustrate its use in a computer vision application.
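
The probabilistic bisection idea referred to in this abstract can be sketched for the simplest case: a target in [0, 1), query sets of the form [0, x), and an oracle assumed to answer correctly with a known probability p. The sketch keeps a discretized posterior, asks about the set whose posterior mass is closest to one half, and reweights each side by p or 1 - p. The grid size, the function names and the symmetric noise model are assumptions for illustration, not details from the paper.

```python
import random


def split_index(post):
    """Index k in 1..n-1 whose left mass sum(post[:k]) is closest to 0.5."""
    best_k, best_gap, acc = 1, float("inf"), 0.0
    for k in range(1, len(post)):
        acc += post[k - 1]
        gap = abs(acc - 0.5)
        if gap < best_gap:
            best_k, best_gap = k, gap
    return best_k


def probabilistic_bisection(oracle, p, n_cells=1000, n_queries=100):
    """Return (posterior, estimate) after n_queries noisy set queries.

    oracle(x) returns True when it claims the target lies in [0, x);
    each answer is assumed to be correct with probability p.
    """
    post = [1.0 / n_cells] * n_cells  # uniform prior over grid cells
    for _ in range(n_queries):
        k = split_index(post)
        says_in_set = oracle(k / n_cells)
        # Bayes update: cells consistent with the answer get weight p.
        post = [
            v * (p if (i < k) == says_in_set else 1.0 - p)
            for i, v in enumerate(post)
        ]
        total = sum(post)
        post = [v / total for v in post]
    best = max(range(n_cells), key=post.__getitem__)
    return post, (best + 0.5) / n_cells  # centre of the most probable cell


def make_noisy_oracle(target, p, rng):
    """Oracle that tells the truth with probability p (hypothetical helper)."""
    def oracle(x):
        truth = target < x
        return truth if rng.random() < p else not truth
    return oracle


posterior, estimate = probabilistic_bisection(
    make_noisy_oracle(0.3137, 0.8, random.Random(0)), p=0.8
)
```

Splitting the posterior mass as evenly as possible is the one-step greedy choice under symmetric noise, which is the kind of policy the abstract shows to be optimal over the full horizon.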