36 results for Gibbs sampling

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevance: 60.00%

Abstract:

Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes for which the variability can be explained using factors and/or covariates. When such factors operate, the usual normal regression models, which inherently exhibit constant variance, will under-represent variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect, and the use of binomial and Poisson models underestimates the variability and, consequently, can incorrectly indicate significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The joint posterior density function was sampled using Markov chain Monte Carlo algorithms, allowing inferences over the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible, even when limited prior information is available, thereby generating valuable insight for researchers about their experimental results.
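The overdispersion effect described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' DGLM: it assumes a normal random effect on the logit scale, and the function name and all parameter values are hypothetical. It only shows that such a random effect inflates the variance of the counts beyond what a plain binomial model implies.

```python
import math
import random
import statistics


def simulate_proportions(n_units, n_trials, base_logit, noise_sd, rng):
    """Simulate successes out of n_trials for n_units experimental units.

    Each unit gets its own success probability through a normal random
    effect on the logit scale (an assumption made for this illustration).
    noise_sd = 0 gives a plain binomial model.
    """
    counts = []
    for _ in range(n_units):
        eta = base_logit + rng.gauss(0.0, noise_sd)
        p = 1.0 / (1.0 + math.exp(-eta))
        counts.append(sum(rng.random() < p for _ in range(n_trials)))
    return counts


rng = random.Random(42)
plain = simulate_proportions(2000, 20, 0.0, 0.0, rng)  # binomial only
noisy = simulate_proportions(2000, 20, 0.0, 1.0, rng)  # with random effect

var_plain = statistics.pvariance(plain)
var_noisy = statistics.pvariance(noisy)
print(f"variance without random effect: {var_plain:.2f}")
print(f"variance with random effect:    {var_noisy:.2f}")
```

A binomial model fitted to the second data set would assume the first, smaller variance, which is how overdispersion leads to overstated significance.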

Relevance: 60.00%

Abstract:

The objective of this study was to evaluate the possible use of biometric testicular traits as selection criteria for young Nellore bulls using Bayesian inference to estimate heritability coefficients and genetic correlations. Multitrait analysis was performed including 17,211 records of scrotal circumference obtained during andrological assessment (SCAND) and 15,313 records of testicular volume and shape. In addition, 50,809 records of scrotal circumference at 18 mo (SC18), used as an anchor trait, were analyzed. The (co)variance components and breeding values were estimated by Gibbs sampling using the Gibbs2F90 program under an animal model that included contemporary groups as fixed effects, age of the animal as a linear covariate, and direct additive genetic effects as random effects. Heritabilities of 0.42, 0.43, 0.31, 0.20, 0.04, 0.16, 0.15, and 0.10 were obtained for SC18, SCAND, testicular volume, testicular shape, minor defects, major defects, total defects, and satisfactory andrological evaluation, respectively. The genetic correlations between SC18 and the other traits were 0.84 (SCAND), 0.75 (testicular shape), 0.44 (testicular volume), -0.23 (minor defects), -0.16 (major defects), -0.24 (total defects), and 0.56 (satisfactory andrological evaluation). Genetic correlations of 0.94 and 0.52 were obtained between SCAND and testicular volume and shape, respectively, and of 0.52 between testicular volume and testicular shape. In addition to favorable genetic parameter estimates, SC18 was found to be the most advantageous testicular trait because it is easily measured before andrological assessment of the animals, although the use of biometric testicular traits as selection criteria was also found to be possible. In conclusion, SC18 and biometric testicular traits can be adopted as selection criteria to improve the fertility of young Nellore bulls.
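As a rough illustration of how heritabilities and genetic correlations follow from (co)variance components, the sketch below applies the standard definitions to a handful of posterior draws. It assumes phenotypic variance is the sum of additive genetic and residual variance. The draws are hypothetical numbers invented for the example, not output from Gibbs2F90 or from this study.

```python
import statistics


def heritability(var_additive, var_residual):
    """Narrow-sense heritability: additive variance over phenotypic variance."""
    return var_additive / (var_additive + var_residual)


def genetic_correlation(cov_additive, var_additive_1, var_additive_2):
    """Additive genetic covariance scaled by the two additive standard deviations."""
    return cov_additive / (var_additive_1 * var_additive_2) ** 0.5


# Hypothetical (additive variance, residual variance) pairs standing in for
# Gibbs draws; a real analysis would use thousands of post-burn-in samples.
draws = [(4.1, 5.9), (4.3, 5.6), (3.9, 6.1), (4.2, 5.8)]

# Compute the ratio for each draw, then summarize, so that the summary
# reflects the posterior distribution of the heritability itself.
h2_draws = [heritability(a, e) for a, e in draws]
h2_mean = statistics.mean(h2_draws)
print(f"posterior mean heritability (toy draws): {h2_mean:.3f}")
```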

Relevance: 60.00%

Abstract:

In this paper, we consider some non-homogeneous Poisson models to estimate the probability that an air quality standard is exceeded a given number of times in a time interval of interest. We assume that the number of exceedances occurs according to a non-homogeneous Poisson process (NHPP). This Poisson process has rate function lambda(t), t >= 0, which depends on some parameters that must be estimated. We take into account two cases of rate functions: the Weibull and the Goel-Okumoto. We consider models with and without change-points. When the presence of change-points is assumed, we may have either one, two or three change-points, depending on the data set. The parameters of the rate functions are estimated using a Gibbs sampling algorithm. Results are applied to ozone data provided by the Mexico City monitoring network. In the first instance, we assume that there are no change-points present. Depending on the fit of the model, we then assume the presence of either one, two or three change-points. Copyright (C) 2009 John Wiley & Sons, Ltd.
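For an NHPP, the number of events in an interval (t, t + s] is Poisson distributed with mean m(t + s) - m(t), where m is the integral of lambda. The sketch below evaluates that probability for the two rate families named in the abstract, without change-points. The parameterizations m(t) = (t / sigma)^alpha and m(t) = alpha (1 - exp(-beta t)) are common choices assumed here, not taken from the paper, and the parameter values are hypothetical rather than posterior estimates.

```python
import math


def weibull_mean(t, alpha, sigma):
    """Mean function m(t) = (t / sigma) ** alpha of a Weibull-type rate."""
    return (t / sigma) ** alpha


def goel_okumoto_mean(t, alpha, beta):
    """Mean function m(t) = alpha * (1 - exp(-beta * t))."""
    return alpha * (1.0 - math.exp(-beta * t))


def prob_exceedances(k, t, s, mean_fn, *params):
    """P(N(t + s) - N(t) = k) for an NHPP with the given mean function."""
    mu = mean_fn(t + s, *params) - mean_fn(t, *params)
    return math.exp(-mu) * mu ** k / math.factorial(k)


# Hypothetical parameter values chosen only for illustration.
p_three = prob_exceedances(3, 10.0, 5.0, weibull_mean, 1.5, 4.0)
print(f"P(3 exceedances in (10, 15]) under the toy Weibull rate: {p_three:.4f}")
```

In a Bayesian analysis the same function could be averaged over Gibbs draws of the parameters to obtain a posterior predictive probability.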

Relevance: 60.00%

Abstract:

It is known that patients may cease participating in a longitudinal study and become lost to follow-up. The objective of this article is to present a Bayesian model to estimate the malaria transition probabilities considering individuals lost to follow-up. We consider a homogeneous population, and it is assumed that the considered period of time is small enough to avoid two or more transitions from one state of health to another. The proposed model is based on a Gibbs sampling algorithm that uses information on individuals lost to follow-up at the end of the longitudinal study. To simulate the unknown number of individuals with positive and negative states of malaria at the end of the study and lost to follow-up, two latent variables were introduced in the model. We used a real data set and a simulated data set to illustrate the application of the methodology. The proposed model showed a good fit to these data sets, and the algorithm did not show problems of convergence or lack of identifiability. We conclude that the proposed model is a good alternative for estimating probabilities of transitions from one state of health to the other in studies with low adherence to follow-up.

Relevance: 60.00%

Abstract:

In this paper, we compare the performance of two statistical approaches for the analysis of data obtained from the social research area. In the first approach, we use normal models with joint regression modelling for the mean and for the variance heterogeneity. In the second approach, we use hierarchical models. In the first case, individual and social variables are included in the regression modelling for the mean and for the variance, as explanatory variables, while in the second case, the variance at level 1 of the hierarchical model depends on the individuals (age of the individuals), and at level 2 of the hierarchical model, the variance is assumed to change according to socioeconomic stratum. Applying these methodologies, we analyze a Colombian height data set to find differences that can be explained by socioeconomic conditions. We also present some theoretical and empirical results concerning the two models. From this comparative study, we conclude that it is better to jointly model the mean and variance heterogeneity in all cases. We also observe that the Gibbs sampling chain used in the Markov chain Monte Carlo method for jointly modelling the mean and variance heterogeneity converges quickly.

Relevance: 60.00%

Abstract:

In this paper, we consider the problem of estimating the number of times an air quality standard is exceeded in a given period of time. A non-homogeneous Poisson model is proposed to analyse this issue. The rate at which the Poisson events occur is given by a rate function lambda(t), t >= 0. This rate function also depends on some parameters that need to be estimated. Two forms of lambda(t), t >= 0 are considered. One of them is of the Weibull form and the other is of the exponentiated-Weibull form. Parameter estimation is carried out using a Bayesian formulation based on the Gibbs sampling algorithm. The prior distributions for the parameters are assigned in two stages. In the first stage, non-informative prior distributions are considered. Using the information provided by the first stage, more informative prior distributions are used in the second one. The theoretical development is applied to data provided by the monitoring network of Mexico City. The rate function that best fits the data varies according to the region of the city and/or the threshold that is considered. In some cases the best fit is the Weibull form and in other cases the best option is the exponentiated-Weibull. Copyright (C) 2007 John Wiley & Sons, Ltd.

Relevance: 60.00%

Abstract:

Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is interest in studying latent variables. These latent variables are directly considered in the Item Response Models (IRM) and they are usually called latent traits. A usual assumption for parameter estimation of the IRM, considering one group of examinees, is that the latent traits are random variables which follow a standard normal distribution. However, many works suggest that this assumption does not apply in many cases. Furthermore, when this assumption does not hold, the parameter estimates tend to be biased and misleading inference can be obtained. Therefore, it is important to model the distribution of the latent traits properly. In this paper we present an alternative latent traits modeling based on the so-called skew-normal distribution; see Genton (2004). We used the centred parameterization, which was proposed by Azzalini (1985). This approach ensures the model identifiability as pointed out by Azevedo et al. (2009b). Also, a Metropolis-Hastings within Gibbs sampling (MHWGS) algorithm was built for parameter estimation by using an augmented data approach. A simulation study was performed in order to assess the parameter recovery in the proposed model and the estimation method, and the effect of the asymmetry level of the latent traits distribution on the parameter estimation. Also, a comparison of our approach with other estimation methods (which consider the assumption of symmetric normality for the latent traits distribution) was considered. The results indicated that our proposed algorithm properly recovers all parameters. Specifically, the greater the asymmetry level, the better the performance of our approach compared with other approaches, mainly in the presence of small sample sizes (number of examinees). Furthermore, we analyzed a real data set which presents indication of asymmetry concerning the latent traits distribution. The results obtained by using our approach confirmed the presence of strong negative asymmetry of the latent traits distribution. (C) 2010 Elsevier B.V. All rights reserved.

Relevance: 60.00%

Abstract:

The main object of this paper is to discuss the Bayes estimation of the regression coefficients in the elliptically distributed simple regression model with measurement errors. The posterior distribution for the line parameters is obtained in a closed form, considering the following: the ratio of the error variances is known, informative prior distribution for the error variance, and non-informative prior distributions for the regression coefficients and for the incidental parameters. We proved that the posterior distribution of the regression coefficients has at most two real modes. Situations with a single mode are more likely than those with two modes, especially in large samples. The precision of the modal estimators is studied by deriving the Hessian matrix, which although complicated can be computed numerically. The posterior mean is estimated by using the Gibbs sampling algorithm and approximations by normal distributions. The results are applied to a real data set and connections with results in the literature are reported. (C) 2011 Elsevier B.V. All rights reserved.

Relevance: 60.00%

Abstract:

This work presents a Bayesian semiparametric approach for dealing with regression models where the covariate is measured with error. Given that (1) the error normality assumption is very restrictive, and (2) assuming a specific elliptical distribution for errors (Student-t, for example) may be somewhat presumptuous, there is a need for more flexible methods that assume only symmetry of errors (admitting unknown kurtosis). In this sense, the main advantage of this extended Bayesian approach is the possibility of considering generalizations of the elliptical family of models by using Dirichlet process priors in dependent and independent situations. Conditional posterior distributions are implemented, allowing the use of Markov chain Monte Carlo (MCMC) to generate the posterior distributions. An interesting result shown is that the Dirichlet process prior is not updated in the case of the dependent elliptical model. Furthermore, an analysis of a real data set is reported to illustrate the usefulness of our approach in dealing with outliers. Finally, the proposed semiparametric models and the parametric normal model are compared graphically using the posterior densities of the coefficients. (C) 2009 Elsevier Inc. All rights reserved.

Relevance: 60.00%

Abstract:

We have considered a Bayesian approach for the nonlinear regression model by replacing the normal distribution on the error term by some skewed distributions, which account for both skewness and heavy tails or skewness alone. The type of data considered in this paper concerns repeated measurements taken in time on a set of individuals. Such multiple observations on the same individual generally produce serially correlated outcomes. Thus, our model additionally allows for a correlation between observations made from the same individual. We have illustrated the procedure using a data set to study the growth curves of a clinical measurement of a group of pregnant women from an obstetrics clinic in Santiago, Chile. Parameter estimation and prediction were carried out using appropriate posterior simulation schemes based on Markov chain Monte Carlo methods. Besides the deviance information criterion (DIC) and the conditional predictive ordinate (CPO), we suggest the use of proper scoring rules based on the posterior predictive distribution for comparing models. For our data set, all these criteria chose the skew-t model as the best model for the errors. The DIC and CPO criteria are also validated, for the model proposed here, through a simulation study. As a conclusion of this study, the DIC criterion is not trustworthy for this kind of complex model.

Relevance: 60.00%

Abstract:

There are several versions of the lognormal distribution in the statistical literature; one is based on the exponential transformation of the generalized normal distribution (GN). This paper presents a Bayesian analysis of the generalized lognormal distribution (logGN), considering independent non-informative Jeffreys prior distributions for the parameters, as well as the procedure for implementing the Gibbs sampler to obtain the posterior distributions of the parameters. The results are used to analyze failure time models with right-censored and uncensored data. The proposed method is illustrated using actual failure time data of computers.
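The mechanics of a Gibbs sampler, which all the abstracts above rely on, can be sketched on a toy target. The example below is not the logGN sampler from this paper: it assumes a bivariate standard normal target with correlation rho, whose full conditionals are known in closed form, and alternates draws from each conditional. The function name and settings are hypothetical.

```python
import random


def gibbs_bivariate_normal(rho, n_iter, burn_in, rng):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Full conditionals: x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y.
    """
    x, y = 0.0, 0.0
    cond_sd = (1.0 - rho * rho) ** 0.5
    samples = []
    for i in range(n_iter):
        x = rng.gauss(rho * y, cond_sd)  # update x given the current y
        y = rng.gauss(rho * x, cond_sd)  # update y given the new x
        if i >= burn_in:                 # discard the burn-in draws
            samples.append((x, y))
    return samples


def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(p[0] for p in pairs) / n
    mean_y = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pairs)
    sxx = sum((p[0] - mean_x) ** 2 for p in pairs)
    syy = sum((p[1] - mean_y) ** 2 for p in pairs)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)
samples = gibbs_bivariate_normal(0.8, 20000, 1000, rng)
corr = sample_correlation(samples)
print(f"sample correlation of the Gibbs draws: {corr:.3f}")
```

In applied models the conditionals are rarely this simple, which is why hybrid schemes such as Metropolis-Hastings within Gibbs are used when a conditional cannot be sampled directly.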

Relevance: 20.00%

Abstract:

The aim of this study was to determine how abiotic factors drive the phytoplankton community in a water supply reservoir within short sampling intervals. Samples were collected at the subsurface (0.1 m) and bottom of limnetic (8 m) and littoral (2 m) zones in both the dry and rainy seasons. The following abiotic variables were analyzed: water temperature, dissolved oxygen, electrical conductivity, total dissolved solids, turbidity, pH, total nitrogen, nitrite, nitrate, total phosphorus, total dissolved phosphorus and orthophosphate. Phytoplankton biomass was determined from biovolume values. The role abiotic variables play in the dynamics of phytoplankton species was determined by means of Canonical Correspondence Analysis. Algae biomass ranged from 1.17×10⁴ to 9.21×10⁴ µg·L⁻¹; cyanobacteria had biomass values ranging from 1.07×10⁴ to 8.21×10⁴ µg·L⁻¹. High availability of phosphorus, nitrogen limitation, alkaline pH and thermal stability all favored cyanobacteria blooms, particularly during the dry season. Temperature, pH, total phosphorus and turbidity were key factors in characterizing the phytoplankton community between sampling times and stations. Of the species studied, Cylindrospermopsis raciborskii populations were dominant in the phytoplankton in both the dry and rainy seasons. We conclude that the phytoplankton was strongly influenced by abiotic variables, particularly in relation to seasonal distribution patterns.

Relevance: 20.00%

Abstract:

Some factors complicate comparisons between linkage maps from different studies. This problem can be resolved if measures of precision, such as confidence intervals and frequency distributions, are associated with markers. We examined the precision of distances and ordering of microsatellite markers in the consensus linkage maps of chromosomes 1, 3 and 4 from two F2 reciprocal Brazilian chicken populations, using bootstrap sampling. Single and consensus maps were constructed. The consensus map was compared with the International Consensus Linkage Map and with the whole genome sequence. Some loci showed segregation distortion and missing data, but this did not affect the analyses negatively. Several inversions and position shifts were detected, based on 95% confidence intervals and frequency distributions of loci. Some discrepancies in distances between loci and in ordering were due to chance, whereas others could be attributed to other effects, including reciprocal crosses, sampling error of the founder animals from the two populations, F2 population structure, number of and distance between microsatellite markers, number of informative meioses, loci segregation patterns, and sex. In the Brazilian consensus GGA1, locus LEI1038 was in a position closer to the true genome sequence than in the International Consensus Map, whereas for GGA3 and GGA4, no such differences were found. Extending these analyses to the remaining chromosomes should facilitate comparisons and the integration of several available genetic maps, allowing meta-analyses for map construction and quantitative trait loci (QTL) mapping. The precision of the estimates of QTL positions and their effects would be increased with such information.
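The idea of attaching a bootstrap confidence interval to an estimate can be sketched generically. The example below is not the linkage-map procedure of the study, which would rebuild the map for every resample. It assumes a simple percentile interval for the mean of a small sample, and the data values are hypothetical.

```python
import random
import statistics


def bootstrap_ci(data, stat_fn, n_boot, level, rng):
    """Percentile bootstrap confidence interval for stat_fn(data)."""
    n = len(data)
    # Resample with replacement, recompute the statistic, and sort the results.
    stats = sorted(
        stat_fn([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower_idx = int((1.0 - level) / 2.0 * n_boot)
    upper_idx = int((1.0 + level) / 2.0 * n_boot) - 1
    return stats[lower_idx], stats[upper_idx]


# Hypothetical measurements used only to exercise the function.
values = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3, 9.5, 10.7]
rng = random.Random(1)
ci_low, ci_high = bootstrap_ci(values, statistics.mean, 2000, 0.95, rng)
print(f"95% bootstrap interval for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The sorted bootstrap statistics also give the frequency distribution of the estimate, the second precision measure the abstract mentions.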

Relevance: 20.00%

Abstract:

Background: With nearly 1,100 species, the fish family Characidae represents more than half of the species of Characiformes, and is a key component of Neotropical freshwater ecosystems. The composition, phylogeny, and classification of Characidae are currently uncertain, despite significant efforts based on analysis of morphological and molecular data. No consensus about the monophyly of this group or its position within the order Characiformes has been reached, challenged by the fact that many key studies to date have non-overlapping taxonomic representation and focus only on subsets of this diversity. Results: In the present study we propose a new definition of the family Characidae and a hypothesis of relationships for the Characiformes based on phylogenetic analysis of DNA sequences of two mitochondrial and three nuclear genes (4,680 base pairs). The sequences were obtained from 211 samples representing 166 genera distributed among all 18 recognized families in the order Characiformes, all 14 recognized subfamilies in the Characidae, plus 56 of the genera so far considered incertae sedis in the Characidae. The phylogeny obtained is robust, with most lineages significantly supported by posterior probabilities in Bayesian analysis, and high bootstrap values from maximum likelihood and parsimony analyses. Conclusion: A monophyletic assemblage strongly supported in all our phylogenetic analyses is herein defined as the Characidae and includes the characiform species lacking a supraorbital bone and with a derived position of the emergence of the hyoid artery from the anterior ceratohyal. To recognize this and several other monophyletic groups within characiforms we propose changes in the limits of several families to facilitate future studies in the Characiformes and particularly the Characidae. This work presents a new phylogenetic framework for a speciose and morphologically diverse group of freshwater fishes of significant ecological and evolutionary importance across the Neotropics and portions of Africa.

Relevance: 20.00%

Abstract:

Pollen counts from samples taken from storage pots throughout one year (from October to September) were adjusted by Tasei's volumetric correction coefficient for the determination of pollen sources exploited by two colonies of Nannotrigona testaceicornis in Sao Paulo, Brazil. The results obtained by this sampling technique for seven months (December to June) were compared with those from corbicula load samples taken within the same period. This species visited a large variety of plant species, but few of them were frequently used. As a rule, pollen sources that appeared at frequencies greater than 1% were found with both sampling methods and significant positive correlations (Spearman correlation coefficient) were found between their values. The pollen load sample data showed that N. testaceicornis gathered pollen throughout the external activity period.
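The Spearman correlation coefficient used to compare the two sampling methods is the Pearson correlation of the ranks. The sketch below is a minimal pure-Python version that assumes ties receive average ranks. The frequency values are hypothetical and are not the pollen data from the study.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical percentage frequencies of five pollen types under two methods.
storage_pots = [35.0, 22.5, 18.0, 4.2, 1.3]
corbicula_loads = [30.1, 25.4, 12.7, 6.0, 2.2]
rho = spearman(storage_pots, corbicula_loads)
print(f"Spearman correlation (toy data): {rho:.2f}")
```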