58 resultados para Hierarchical Bayesian Metaanalysis
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Resumo:
This paper applies Hierarchical Bayesian Models to price farm-level yield insurance contracts. This methodology considers the temporal effect, the spatial dependence and spatio-temporal models. One of the major advantages of this framework is that an estimate of the premium rate is obtained directly from the posterior distribution. These methods were applied to a farm-level data set of soybean in the State of the Parana (Brazil), for the period between 1994 and 2003. The model selection was based on a posterior predictive criterion. This study improves considerably the estimation of the fair premium rates considering the small number of observations.
Resumo:
Over the years, crop insurance programs became the focus of agricultural policy in the USA, Spain, Mexico, and more recently in Brazil. Given the increasing interest in insurance, accurate calculation of the premium rate is of great importance. We address the crop-yield distribution issue and its implications in pricing an insurance contract considering the dynamic structure of the data and incorporating the spatial correlation in the Hierarchical Bayesian framework. Results show that empirical (insurers) rates are higher in low risk areas and lower in high risk areas. Such methodological improvement is primarily important in situations of limited data.
Resumo:
This article presents a statistical model of agricultural yield data based on a set of hierarchical Bayesian models that allows joint modeling of temporal and spatial autocorrelation. This method captures a comprehensive range of the various uncertainties involved in predicting crop insurance premium rates as opposed to the more traditional ad hoc, two-stage methods that are typically based on independent estimation and prediction. A panel data set of county-average yield data was analyzed for 290 counties in the State of Parana (Brazil) for the period of 1990 through 2002. Posterior predictive criteria are used to evaluate different model specifications. This article provides substantial improvements in the statistical and actuarial methods often applied to the calculation of insurance premium rates. These improvements are especially relevant to situations where data are limited.
Resumo:
In this paper, we present different ofrailtyo models to analyze longitudinal data in the presence of covariates. These models incorporate the extra-Poisson variability and the possible correlation among the repeated counting data for each individual. Assuming a CD4 counting data set in HIV-infected patients, we develop a hierarchical Bayesian analysis considering the different proposed models and using Markov Chain Monte Carlo methods. We also discuss some Bayesian discrimination aspects for the choice of the best model.
Resumo:
In this paper we present a hierarchical Bayesian analysis for a predator-prey model applied to ecology considering the use of Markov Chain Monte Carlo methods. We consider the introduction of a random effect in the model and the presence of a covariate vector. An application to ecology is considered using a data set related to the plankton dynamics of lake Geneva for the year 1990. We also discuss some aspects of discrimination of the proposed models.
Resumo:
In this paper, we compare the performance of two statistical approaches for the analysis of data obtained from the social research area. In the first approach, we use normal models with joint regression modelling for the mean and for the variance heterogeneity. In the second approach, we use hierarchical models. In the first case, individual and social variables are included in the regression modelling for the mean and for the variance, as explanatory variables, while in the second case, the variance at level 1 of the hierarchical model depends on the individuals (age of the individuals), and in the level 2 of the hierarchical model, the variance is assumed to change according to socioeconomic stratum. Applying these methodologies, we analyze a Colombian tallness data set to find differences that can be explained by socioeconomic conditions. We also present some theoretical and empirical results concerning the two models. From this comparative study, we conclude that it is better to jointly modelling the mean and variance heterogeneity in all cases. We also observe that the convergence of the Gibbs sampling chain used in the Markov Chain Monte Carlo method for the jointly modeling the mean and variance heterogeneity is quickly achieved.
Resumo:
The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of the symmetrical nonlinear regression models since the error distributions cover both skewness and heavy-tailed distributions such as the skew-t, skew-slash and the skew-contaminated normal distributions. The main advantage of these class of distributions is that they have a nice hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robust aspects of this flexible class, against outlying and influential observations, we present a Bayesian case deletion influence diagnostics based on the Kullback-Leibler divergence. Further, some discussions on the model selection criteria are given. The newly developed procedures are illustrated considering two simulations study, and a real data previously analyzed under normal and skew-normal nonlinear regression models. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
This work presents a Bayesian semiparametric approach for dealing with regression models where the covariate is measured with error. Given that (1) the error normality assumption is very restrictive, and (2) assuming a specific elliptical distribution for errors (Student-t for example), may be somewhat presumptuous; there is need for more flexible methods, in terms of assuming only symmetry of errors (admitting unknown kurtosis). In this sense, the main advantage of this extended Bayesian approach is the possibility of considering generalizations of the elliptical family of models by using Dirichlet process priors in dependent and independent situations. Conditional posterior distributions are implemented, allowing the use of Markov Chain Monte Carlo (MCMC), to generate the posterior distributions. An interesting result shown is that the Dirichlet process prior is not updated in the case of the dependent elliptical model. Furthermore, an analysis of a real data set is reported to illustrate the usefulness of our approach, in dealing with outliers. Finally, semiparametric proposed models and parametric normal model are compared, graphically with the posterior distribution density of the coefficients. (C) 2009 Elsevier Inc. All rights reserved.
Resumo:
In this article, we introduce a semi-parametric Bayesian approach based on Dirichlet process priors for the discrete calibration problem in binomial regression models. An interesting topic is the dosimetry problem related to the dose-response model. A hierarchical formulation is provided so that a Markov chain Monte Carlo approach is developed. The methodology is applied to simulated and real data.
Resumo:
In this paper, we present a Bayesian approach for estimation in the skew-normal calibration model, as well as the conditional posterior distributions which are useful for implementing the Gibbs sampler. Data transformation is thus avoided by using the methodology proposed. Model fitting is implemented by proposing the asymmetric deviance information criterion, ADIC, a modification of the ordinary DIC. We also report an application of the model studied by using a real data set, related to the relationship between the resistance and the elasticity of a sample of concrete beams. Copyright (C) 2008 John Wiley & Sons, Ltd.
Resumo:
Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests if in fact it should be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs no model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.
Resumo:
Hardy-Weinberg Equilibrium (HWE) is an important genetic property that populations should have whenever they are not observing adverse situations as complete lack of panmixia, excess of mutations, excess of selection pressure, etc. HWE for decades has been evaluated; both frequentist and Bayesian methods are in use today. While historically the HWE formula was developed to examine the transmission of alleles in a population from one generation to the next, use of HWE concepts has expanded in human diseases studies to detect genotyping error and disease susceptibility (association); Ryckman and Williams (2008). Most analyses focus on trying to answer the question of whether a population is in HWE. They do not try to quantify how far from the equilibrium the population is. In this paper, we propose the use of a simple disequilibrium coefficient to a locus with two alleles. Based on the posterior density of this disequilibrium coefficient, we show how one can conduct a Bayesian analysis to verify how far from HWE a population is. There are other coefficients introduced in the literature and the advantage of the one introduced in this paper is the fact that, just like the standard correlation coefficients, its range is bounded and it is symmetric around zero (equilibrium) when comparing the positive and the negative values. To test the hypothesis of equilibrium, we use a simple Bayesian significance test, the Full Bayesian Significance Test (FBST); see Pereira, Stern andWechsler (2008) for a complete review. The disequilibrium coefficient proposed provides an easy and efficient way to make the analyses, especially if one uses Bayesian statistics. A routine in R programs (R Development Core Team, 2009) that implements the calculations is provided for the readers.
Resumo:
We propose and analyze two different Bayesian online algorithms for learning in discrete Hidden Markov Models and compare their performance with the already known Baldi-Chauvin Algorithm. Using the Kullback-Leibler divergence as a measure of generalization we draw learning curves in simplified situations for these algorithms and compare their performances.
Resumo:
Online music databases have increased significantly as a consequence of the rapid growth of the Internet and digital audio, requiring the development of faster and more efficient tools for music content analysis. Musical genres are widely used to organize music collections. In this paper, the problem of automatic single and multi-label music genre classification is addressed by exploring rhythm-based features obtained from a respective complex network representation. A Markov model is built in order to analyse the temporal sequence of rhythmic notation events. Feature analysis is performed by using two multi-variate statistical approaches: principal components analysis (unsupervised) and linear discriminant analysis (supervised). Similarly, two classifiers are applied in order to identify the category of rhythms: parametric Bayesian classifier under the Gaussian hypothesis (supervised) and agglomerative hierarchical clustering (unsupervised). Qualitative results obtained by using the kappa coefficient and the obtained clusters corroborated the effectiveness of the proposed method.
Resumo:
Chagas disease is still a major public health problem in Latin America. Its causative agent, Trypanosoma cruzi, can be typed into three major groups, T. cruzi I, T. cruzi II and hybrids. These groups each have specific genetic characteristics and epidemiological distributions. Several highly virulent strains are found in the hybrid group; their origin is still a matter of debate. The null hypothesis is that the hybrids are of polyphyletic origin, evolving independently from various hybridization events. The alternative hypothesis is that all extant hybrid strains originated from a single hybridization event. We sequenced both alleles of genes encoding EF-1 alpha, actin and SSU rDNA of 26 T. cruzi strains and DHFR-TS and TR of 12 strains. This information was used for network genealogy analysis and Bayesian phylogenies. We found T. cruzi I and T. cruzi II to be monophyletic and that all hybrids had different combinations of T. cruzi I and T. cruzi II haplotypes plus hybrid-specific haplotypes. Bootstrap values (networks) and posterior probabilities (Bayesian phylogenies) of clades supporting the monophyly of hybrids were far below the 95% confidence interval, indicating that the hybrid group is polyphyletic. We hypothesize that T. cruzi I and T. cruzi II are two different species and that the hybrids are extant representatives of independent events of genome hybridization, which sporadically have sufficient fitness to impact on the epidemiology of Chagas disease.