66 results for Asymptotic behaviour, Bayesian methods, Mixture models, Overfitting, Posterior concentration
Abstract:
In this article, we generalize the Bayesian methodology introduced by Cepeda and Gamerman (2001) for modeling variance heterogeneity in normal regression models with orthogonality between mean and variance parameters to the general case, considering both linear and highly nonlinear regression models. Under the Bayesian paradigm, we use MCMC methods to simulate samples from the joint posterior distribution. We illustrate this algorithm with a simulated data set and with a real data set on the school attendance rate for children in Colombia. Finally, we present some extensions of the proposed MCMC algorithm.
Abstract:
Linear mixed models were developed to handle clustered data and have been a topic of increasing interest in statistics for the past 50 years. Generally, the normality (or symmetry) of the random effects is a common assumption in linear mixed models, but it may sometimes be unrealistic, obscuring important features of among-subjects variation. In this article, we utilize skew-normal/independent distributions as a tool for robust modeling of linear mixed models under a Bayesian paradigm. The skew-normal/independent distributions form an attractive class of asymmetric heavy-tailed distributions that includes the skew-normal, skew-t, skew-slash and skew-contaminated normal distributions as special cases, providing an appealing robust alternative to the routine use of symmetric distributions in this type of model. The methods developed are illustrated using a real data set from the Framingham cholesterol study. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
The main object of this paper is to discuss the Bayes estimation of the regression coefficients in the elliptically distributed simple regression model with measurement errors. The posterior distribution for the line parameters is obtained in a closed form, considering the following: the ratio of the error variances is known, informative prior distribution for the error variance, and non-informative prior distributions for the regression coefficients and for the incidental parameters. We proved that the posterior distribution of the regression coefficients has at most two real modes. Situations with a single mode are more likely than those with two modes, especially in large samples. The precision of the modal estimators is studied by deriving the Hessian matrix, which although complicated can be computed numerically. The posterior mean is estimated by using the Gibbs sampling algorithm and approximations by normal distributions. The results are applied to a real data set and connections with results in the literature are reported. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
This work presents a Bayesian semiparametric approach for dealing with regression models where the covariate is measured with error. Given that (1) the error normality assumption is very restrictive, and (2) assuming a specific elliptical distribution for the errors (Student-t, for example) may be somewhat presumptuous, there is a need for more flexible methods that assume only symmetry of the errors (admitting unknown kurtosis). In this sense, the main advantage of this extended Bayesian approach is the possibility of considering generalizations of the elliptical family of models by using Dirichlet process priors in dependent and independent situations. Conditional posterior distributions are implemented, allowing the use of Markov chain Monte Carlo (MCMC) to generate the posterior distributions. An interesting result is that the Dirichlet process prior is not updated in the case of the dependent elliptical model. Furthermore, an analysis of a real data set is reported to illustrate the usefulness of our approach in dealing with outliers. Finally, the proposed semiparametric models and the parametric normal model are compared graphically through the posterior densities of the coefficients. (C) 2009 Elsevier Inc. All rights reserved.
Abstract:
We consider a Bayesian approach for the nonlinear regression model in which the normal distribution of the error term is replaced by skewed distributions that account for both skewness and heavy tails, or for skewness alone. The data considered in this paper are repeated measurements taken over time on a set of individuals. Such multiple observations on the same individual generally produce serially correlated outcomes, so our model also allows for correlation between observations made on the same individual. We illustrate the procedure using a data set on the growth curves of a clinical measurement for a group of pregnant women from an obstetrics clinic in Santiago, Chile. Parameter estimation and prediction were carried out using appropriate posterior simulation schemes based on Markov chain Monte Carlo methods. Besides the deviance information criterion (DIC) and the conditional predictive ordinate (CPO), we suggest the use of proper scoring rules based on the posterior predictive distribution for comparing models. For our data set, all these criteria chose the skew-t model as the best model for the errors. The DIC and CPO criteria are also validated, for the model proposed here, through a simulation study. As a conclusion of this study, the DIC is not trustworthy for this kind of complex model.
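The conditional predictive ordinate mentioned in the abstract above is commonly estimated from MCMC output as the harmonic mean of the pointwise likelihood values over the posterior draws. The sketch below shows only that generic estimator. The function names `cpo` and `lpml` and the toy likelihood values are hypothetical, and nothing here reproduces the paper's skewed-error models.

```python
import math

def cpo(likelihood_draws):
    """Harmonic-mean estimate of the CPO for one observation.

    likelihood_draws holds p(y_i | theta_s) evaluated at each posterior
    draw theta_s, s = 1..S (assumed strictly positive).
    """
    s = len(likelihood_draws)
    return s / sum(1.0 / value for value in likelihood_draws)

def lpml(likelihood_matrix):
    """Log pseudo-marginal likelihood: sum of log CPO over observations.

    likelihood_matrix[i] is the list of draws for observation i.
    Larger values indicate better predictive fit.
    """
    return sum(math.log(cpo(row)) for row in likelihood_matrix)

# Toy input: two observations, two posterior draws each (made-up numbers).
toy = [[0.5, 0.5], [1.0, 0.5]]
print(lpml(toy))
```

When comparing candidate error models, the same routine would be run on each model's pointwise likelihood draws, and the model with the larger sum of log CPO values would be preferred.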
Abstract:
In this article, we give an asymptotic formula of order n^(-1/2), where n is the sample size, for the skewness of the distributions of the maximum likelihood estimates of the parameters in exponential family nonlinear models. We generalize the result by Cordeiro and Cordeiro (2001). The formula is given in matrix notation and is well suited to computer implementation and to obtaining closed-form expressions for a great variety of models. Some special cases and two applications are discussed.
A robust Bayesian approach to null intercept measurement error model with application to dental data
Abstract:
Measurement error models often arise in epidemiological and clinical research. Usually, in this setup it is assumed that the latent variable has a normal distribution. However, the normality assumption may not always be correct. The skew-normal/independent distributions are a class of asymmetric thick-tailed distributions that includes the skew-normal distribution as a special case. In this paper, we explore the use of skew-normal/independent distributions as a robust alternative for the null intercept measurement error model under a Bayesian paradigm. We assume that the random errors and the unobserved value of the covariate (latent variable) jointly follow a skew-normal/independent distribution, providing an appealing robust alternative to the routine use of the symmetric normal distribution in this type of model. Specific distributions examined include univariate and multivariate versions of the skew-normal, skew-t, skew-slash and skew-contaminated normal distributions. The methods developed are illustrated using a real data set from a dental clinical trial. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
In this article we present a Bayesian analysis of the stochastic volatility (SV) model and of a generalized form of it, with the aim of estimating the volatility of financial time series. Considering some special cases of SV models, we use Markov chain Monte Carlo algorithms and the WinBugs software to obtain posterior summaries for the different forms of SV models. We introduce some Bayesian discrimination techniques for choosing the best model to estimate volatilities and to forecast financial series. An empirical example applying the methodology is presented using the IBOVESPA financial series.
Abstract:
Diagnostic methods have been an important tool in regression analysis for detecting anomalies, such as departures from error assumptions and the presence of outliers and influential observations in the fitted models. Assuming censored data, we considered a classical analysis and a Bayesian analysis with noninformative priors for the parameters of a model with a cure fraction. The Bayesian approach used Markov chain Monte Carlo methods with Metropolis-Hastings algorithm steps to obtain the posterior summaries of interest. Some influence methods, such as the local influence, total local influence of an individual, local influence on predictions and generalized leverage, were derived, analyzed and discussed for survival data with a cure fraction and covariates. The relevance of the approach was illustrated with a real data set, where it is shown that removing the most influential observations changes the decision about which model best fits the data.
Abstract:
Survival or longevity is an economically important trait in beef cattle. The main inconvenience for its inclusion in selection criteria is delayed recording of phenotypic data and the high computational demand for including survival in proportional hazard models. Thus, identification of a longevity-correlated trait that could be recorded early in life would be very useful for selection purposes. We estimated the genetic relationship of survival with productive and reproductive traits in Nellore cattle, including weaning weight (WW), post-weaning growth (PWG), muscularity (MUSC), scrotal circumference at 18 months (SC18), and heifer pregnancy (HP). Survival was measured in discrete time intervals and modeled through a sequential threshold model. Five independent bivariate Bayesian analyses were performed, accounting for cow survival and the five productive and reproductive traits. Posterior mean estimates for heritability (standard deviation in parentheses) were 0.55 (0.01) for WW, 0.25 (0.01) for PWG, 0.23 (0.01) for MUSC, and 0.48 (0.01) for SC18. The posterior mean estimates (95% confidence interval in parentheses) for the genetic correlation with survival were 0.16 (0.13-0.19), 0.30 (0.25-0.34), 0.31 (0.25-0.36), 0.07 (0.02-0.12), and 0.82 (0.78-0.86) for WW, PWG, MUSC, SC18, and HP, respectively. Based on the high genetic correlation and heritability (0.54) posterior mean estimates for HP, the expected progeny difference for HP can be used to select bulls for longevity, as well as for post-weaning gain and muscle score.
Abstract:
Creation of cold dark matter (CCDM) can be described macroscopically by a negative pressure, and, therefore, the mechanism is capable of accelerating the Universe without the need for an additional dark energy component. In this framework, we discuss the evolution of perturbations by considering a Neo-Newtonian approach where, unlike in the standard Newtonian cosmology, the fluid pressure is taken into account even in the homogeneous and isotropic background equations (Lima, Zanchin, and Brandenberger, MNRAS 291, L1, 1997). The evolution of the density contrast is calculated in the linear approximation and compared to the one predicted by the ΛCDM model. The difference between the CCDM and ΛCDM predictions at the perturbative level is quantified by using three different statistical methods, namely: a simple chi^2 analysis in the relevant parameter space, a Bayesian statistical inference, and, finally, a Kolmogorov-Smirnov test. We find that under certain circumstances, the CCDM scenario analyzed here predicts overall dynamics (including the Hubble flow and the matter fluctuation field) that fully recover those of the traditional cosmic concordance model. Our basic conclusion is that such a reduction of the dark sector provides a viable alternative description to the accelerating ΛCDM cosmology.
Abstract:
We propose and analyze two different Bayesian online algorithms for learning in discrete Hidden Markov Models and compare their performance with the already known Baldi-Chauvin Algorithm. Using the Kullback-Leibler divergence as a measure of generalization we draw learning curves in simplified situations for these algorithms and compare their performances.
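As a small illustration of the quantity used as a generalization measure in the abstract above, the following sketch computes the Kullback-Leibler divergence between two discrete distributions on the same support. It shows only the textbook definition. The function name `kl_divergence` and the example probabilities are hypothetical, and how the divergence is evaluated for Hidden Markov Models in the paper is not reproduced here.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same support.

    Terms with p_i == 0 contribute nothing; if q_i == 0 while p_i > 0,
    the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue
        if q_i == 0.0:
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total

# Hypothetical example: a "true" emission distribution versus a learned one.
true_dist = [0.7, 0.2, 0.1]
learned_dist = [0.6, 0.3, 0.1]
print(kl_divergence(true_dist, learned_dist))
```

Plotting such a divergence against the number of observed examples would give a learning curve of the kind the abstract describes, under the assumption that the true distribution is known, as it is in simulated settings.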
Abstract:
Motivation: Understanding the patterns of association between polymorphisms at different loci in a population (linkage disequilibrium, LD) is of fundamental importance in various genetic studies. Many coefficients have been proposed for measuring the degree of LD, but they provide only a static view of the current LD structure. Generative models (GMs) were proposed to go beyond these measures, giving not only a description of the actual LD structure but also a tool to help understand the process that generated such structure. GMs based on coalescent theory have been the most appealing because they link LD to evolutionary factors. Nevertheless, the inference and parameter estimation of such models is still computationally challenging. Results: We present a more practical method to build GMs that describe LD. The method is based on learning weighted Bayesian network structures from haplotype data, extracting equivalence structure classes and using them to model LD. The results obtained on public data from the HapMap database showed that the method is a promising tool for modeling LD. The associations represented by the learned models are correlated with the traditional LD measure D'. The method was able to represent LD blocks found by standard tools. The granularity of the association blocks and the readability of the models can be controlled in the method. The results suggest that the causality information gained by our method can be useful for assessing the conservability of the genetic markers and for guiding the selection of a subset of representative markers.
Abstract:
The airflow velocities and pressures are calculated from a three-dimensional model of the human larynx by using the finite element method. The laryngeal airflow is assumed to be incompressible, isothermal, steady, and created by fixed pressure drops. The influence of different laryngeal profiles (convergent, parallel, and divergent), glottal area, and dimensions of the false vocal folds on the airflow is investigated. The results indicate that vertical and horizontal phase differences in the laryngeal tissue movements are influenced by the nonlinear pressure distribution across the glottal channel, and that the glottal entrance shape influences the air pressure distribution inside the glottis. Additionally, the false vocal folds increase the glottal duct pressure drop by creating a new constricted channel in the larynx, and alter the airflow vortices formed after the true vocal folds. (C) 2007 Elsevier Ltd. All rights reserved.
Abstract:
In this paper, we present various diagnostic methods for polyhazard models. Polyhazard models are a flexible family for fitting lifetime data. Their main advantage over single hazard models, such as the Weibull and the log-logistic models, is that they include a wide range of nonmonotone hazard shapes, such as bathtub and multimodal curves. Some influence methods, such as the local influence and total local influence of an individual, are derived, analyzed and discussed. The computation of the likelihood displacement and of the normal curvature in the local influence method is also discussed. Finally, an example with real data is given for illustration.