168 resultados para Bayesian approach
Resumo:
In this paper, we present a Bayesian approach for estimation in the skew-normal calibration model, as well as the conditional posterior distributions which are useful for implementing the Gibbs sampler. Data transformation is thus avoided by using the methodology proposed. Model fitting is implemented by proposing the asymmetric deviance information criterion, ADIC, a modification of the ordinary DIC. We also report an application of the model studied by using a real data set, related to the relationship between the resistance and the elasticity of a sample of concrete beams. Copyright (C) 2008 John Wiley & Sons, Ltd.
Resumo:
Diagnostic methods have been an important tool in regression analysis to detect anomalies, such as departures from error assumptions and the presence of outliers and influential observations with the fitted models. Assuming censored data, we considered a classical analysis and Bayesian analysis assuming no informative priors for the parameters of the model with a cure fraction. A Bayesian approach was considered by using Markov Chain Monte Carlo Methods with Metropolis-Hasting algorithms steps to obtain the posterior summaries of interest. Some influence methods, such as the local influence, total local influence of an individual, local influence on predictions and generalized leverage were derived, analyzed and discussed in survival data with a cure fraction and covariates. The relevance of the approach was illustrated with a real data set, where it is shown that, by removing the most influential observations, the decision about which model best fits the data is changed.
Resumo:
Survival or longevity is an economically important trait in beef cattle. The main inconvenience for its inclusion in selection criteria is delayed recording of phenotypic data and the high computational demand for including survival in proportional hazard models. Thus, identification of a longevity-correlated trait that could be recorded early in life would be very useful for selection purposes. We estimated the genetic relationship of survival with productive and reproductive traits in Nellore cattle, including weaning weight (WW), post-weaning growth (PWG), muscularity (MUSC), scrotal circumference at 18 months (SC18), and heifer pregnancy (HP). Survival was measured in discrete time intervals and modeled through a sequential threshold model. Five independent bivariate Bayesian analyses were performed, accounting for cow survival and the five productive and reproductive traits. Posterior mean estimates for heritability (standard deviation in parentheses) were 0.55 (0.01) for WW, 0.25 (0.01) for PWG, 0.23 (0.01) for MUSC, and 0.48 (0.01) for SC18. The posterior mean estimates (95% confidence interval in parentheses) for the genetic correlation with survival were 0.16 (0.13-0.19), 0.30 (0.25-0.34), 0.31 (0.25-0.36), 0.07 (0.02-0.12), and 0.82 (0.78-0.86) for WW, PWG, MUSC, SC18, and HP, respectively. Based on the high genetic correlation and heritability (0.54) posterior mean estimates for HP, the expected progeny difference for HP can be used to select bulls for longevity, as well as for post-weaning gain and muscle score.
Resumo:
This paper analyses the presence of financial constraint in the investment decisions of 367 Brazilian firms from 1997 to 2004, using a Bayesian econometric model with group-varying parameters. The motivation for this paper is the use of clustering techniques to group firms in a totally endogenous form. In order to classify the firms we used a hybrid clustering method, that is, hierarchical and non-hierarchical clustering techniques jointly. To estimate the parameters a Bayesian approach was considered. Prior distributions were assumed for the parameters, classifying the model in random or fixed effects. Ordinate predictive density criterion was used to select the model providing a better prediction. We tested thirty models and the better prediction considers the presence of 2 groups in the sample, assuming the fixed effect model with a Student t distribution with 20 degrees of freedom for the error. The results indicate robustness in the identification of financial constraint when the firms are classified by the clustering techniques. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
In many mammals social organization promotes genetic structuring, which can be influenced by the dispersal pattern of the species. We analyzed the population genetic structure and dispersal of white-lipped peccaries (Tayassu pecan) from the Pantanal, Brazil. We genotyped 100 individuals at 7 microsatellite loci from 2 adjacent locations with no obvious geographic barrier between them. We found a significant but low F(ST) value, and the Bayesian analysis indicated a unique cluster. No significant differences were observed between mean assignment indices of resident males and females from both locations, and the probability of being born at the location sampled of > 30% of the individuals analyzed was lower than average. Mean relatedness between resident female, male, and opposite-sex pairs was not statistically different in both locations. These results suggest a low degree of genetic differentiation between the locations analyzed, and dispersal by both sexes (contrary to the predicted male-biased dispersal of most mammalian species).
Resumo:
In this paper, we consider some non-homogeneous Poisson models to estimate the probability that an air quality standard is exceeded a given number of times in a time interval of interest. We assume that the number of exceedances occurs according to a non-homogeneous Poisson process (NHPP). This Poisson process has rate function lambda(t), t >= 0, which depends on some parameters that must be estimated. We take into account two cases of rate functions: the Weibull and the Goel-Okumoto. We consider models with and without change-points. When the presence of change-points is assumed, we may have the presence of either one, two or three change-points, depending of the data set. The parameters of the rate functions are estimated using a Gibbs sampling algorithm. Results are applied to ozone data provided by the Mexico City monitoring network. In a first instance, we assume that there are no change-points present. Depending on the adjustment of the model, we assume the presence of either one, two or three change-points. Copyright (C) 2009 John Wiley & Sons, Ltd.
Resumo:
In this paper we make use of some stochastic volatility models to analyse the behaviour of a weekly ozone average measurements series. The models considered here have been used previously in problems related to financial time series. Two models are considered and their parameters are estimated using a Bayesian approach based on Markov chain Monte Carlo (MCMC) methods. Both models are applied to the data provided by the monitoring network of the Metropolitan Area of Mexico City. The selection of the best model for that specific data set is performed using the Deviance Information Criterion and the Conditional Predictive Ordinate method.
Resumo:
In this paper we have discussed inference aspects of the skew-normal nonlinear regression models following both, a classical and Bayesian approach, extending the usual normal nonlinear regression models. The univariate skew-normal distribution that will be used in this work was introduced by Sahu et al. (Can J Stat 29:129-150, 2003), which is attractive because estimation of the skewness parameter does not present the same degree of difficulty as in the case with Azzalini (Scand J Stat 12:171-178, 1985) one and, moreover, it allows easy implementation of the EM-algorithm. As illustration of the proposed methodology, we consider a data set previously analyzed in the literature under normality.
Resumo:
Considering the Wald, score, and likelihood ratio asymptotic test statistics, we analyze a multivariate null intercept errors-in-variables regression model, where the explanatory and the response variables are subject to measurement errors, and a possible structure of dependency between the measurements taken within the same individual are incorporated, representing a longitudinal structure. This model was proposed by Aoki et al. (2003b) and analyzed under the bayesian approach. In this article, considering the classical approach, we analyze asymptotic test statistics and present a simulation study to compare the behavior of the three test statistics for different sample sizes, parameter values and nominal levels of the test. Also, closed form expressions for the score function and the Fisher information matrix are presented. We consider two real numerical illustrations, the odontological data set from Hadgu and Koch (1999), and a quality control data set.
Resumo:
A simultaneous optimization strategy based on a neuro-genetic approach is proposed for selection of laser induced breakdown spectroscopy operational conditions for the simultaneous determination of macronutrients (Ca, Mg and P), micro-nutrients (B, Cu, Fe, Mn and Zn), Al and Si in plant samples. A laser induced breakdown spectroscopy system equipped with a 10 Hz Q-switched Nd:YAG laser (12 ns, 532 nm, 140 mJ) and an Echelle spectrometer with intensified coupled-charge device was used. Integration time gate, delay time, amplification gain and number of pulses were optimized. Pellets of spinach leaves (NIST 1570a) were employed as laboratory samples. In order to find a model that could correlate laser induced breakdown spectroscopy operational conditions with compromised high peak areas of all elements simultaneously, a Bayesian Regularized Artificial Neural Network approach was employed. Subsequently, a genetic algorithm was applied to find optimal conditions for the neural network model, in an approach called neuro-genetic, A single laser induced breakdown spectroscopy working condition that maximizes peak areas of all elements simultaneously, was obtained with the following optimized parameters: 9.0 mu s integration time gate, 1.1 mu s delay time, 225 (a.u.) amplification gain and 30 accumulated laser pulses. The proposed approach is a useful and a suitable tool for the optimization process of such a complex analytical problem. (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
Survival models involving frailties are commonly applied in studies where correlated event time data arise due to natural or artificial clustering. In this paper we present an application of such models in the animal breeding field. Specifically, a mixed survival model with a multivariate correlated frailty term is proposed for the analysis of data from over 3611 Brazilian Nellore cattle. The primary aim is to evaluate parental genetic effects on the trait length in days that their progeny need to gain a commercially specified standard weight gain. This trait is not measured directly but can be estimated from growth data. Results point to the importance of genetic effects and suggest that these models constitute a valuable data analysis tool for beef cattle breeding.
Resumo:
Hepatitis B is a worldwide health problem affecting about 2 billion people and more than 350 million are chronic carriers of the virus. Nine HBV genotypes (A to I) have been described. The geographical distribution of HBV genotypes is not completely understood due to the limited number of samples from some parts of the world. One such example is Colombia, in which few studies have described the HBV genotypes. In this study, we characterized HBV genotypes in 143 HBsAg-positive volunteer blood donors from Colombia. A fragment of 1306 bp partially comprising HBsAg and the DNA polymerase coding regions (S/POL) was amplified and sequenced. Bayesian phylogenetic analyses were conducted using the Markov Chain Monte Carlo (MCMC) approach to obtain the maximum clade credibility (MCC) tree using BEAST v.1.5.3. Of all samples, 68 were positive and 52 were successfully sequenced. Genotype F was the most prevalent in this population (77%) - subgenotypes F3 (75%) and Fib (2%). Genotype G (7.7%) and subgenotype A2 (15.3%) were also found. Genotype G sequence analysis suggests distinct introductions of this genotype in the country. Furthermore, we estimated the time of the most recent common ancestor (TMRCA) for each HBV/F subgenotype and also for Colombian F3 sequences using two different datasets: (i) 77 sequences comprising 1306 bp of S/POL region and (ii) 283 sequences comprising 681 bp of S/POL region. We also used two other previously estimated evolutionary rates: (i) 2.60 x 10(-4) s/s/y and (ii) 1.5 x 10(-5) s/s/y. Here we report the HBV genotypes circulating in Colombia and estimated the TMRCA for the four different subgenotypes of genotype F. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
GB virus C/hepatitis G (GBV-C) is an RNA virus of the family Flaviviridae. Despite replicating with an RNA-dependent RNA polymerase, some previous estimates of rates of evolutionary change in GBV-C suggest that it fixes mutations at the anomalously low rate of similar to 100(-7) nucleotide substitution per site, per year. However, these estimates were largely based on the assumption that GBV-C and its close relative GBV-A (New World monkey GB viruses) codiverged with their primate hosts over millions of years. Herein, we estimated the substitution rate of GBV-C using the largest set of dated GBV-C isolates compiled to date and a Bayesian coalescent approach that utilizes the year of sampling and so is independent of the assumption of codivergence. This revealed a rate of evolutionary change approximately four orders of magnitude higher than that estimated previously, in the range of 10(-2) to 10(-3) sub/site/year, and hence in line with those previously determined for RNA viruses in general and the Flaviviridae in particular. In addition, we tested the assumption of host-virus codivergence in GBV-A by performing a reconciliation analysis of host and virus phylogenies. Strikingly, we found no statistical evidence for host-virus codivergence in GBV-A, indicating that substitution rates in the GB viruses should not be estimated from host divergence times.
Resumo:
This work proposes and discusses an approach for inducing Bayesian classifiers aimed at balancing the tradeoff between the precise probability estimates produced by time consuming unrestricted Bayesian networks and the computational efficiency of Naive Bayes (NB) classifiers. The proposed approach is based on the fundamental principles of the Heuristic Search Bayesian network learning. The Markov Blanket concept, as well as a proposed ""approximate Markov Blanket"" are used to reduce the number of nodes that form the Bayesian network to be induced from data. Consequently, the usually high computational cost of the heuristic search learning algorithms can be lessened, while Bayesian network structures better than NB can be achieved. The resulting algorithms, called DMBC (Dynamic Markov Blanket Classifier) and A-DMBC (Approximate DMBC), are empirically assessed in twelve domains that illustrate scenarios of particular interest. The obtained results are compared with NB and Tree Augmented Network (TAN) classifiers, and confinn that both proposed algorithms can provide good classification accuracies and better probability estimates than NB and TAN, while being more computationally efficient than the widely used K2 Algorithm.
Resumo:
Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is interest in studying latent variables. These latent variables are directly considered in the Item Response Models (IRM) and they are usually called latent traits. A usual assumption for parameter estimation of the IRM, considering one group of examinees, is to assume that the latent traits are random variables which follow a standard normal distribution. However, many works suggest that this assumption does not apply in many cases. Furthermore, when this assumption does not hold, the parameter estimates tend to be biased and misleading inference can be obtained. Therefore, it is important to model the distribution of the latent traits properly. In this paper we present an alternative latent traits modeling based on the so-called skew-normal distribution; see Genton (2004). We used the centred parameterization, which was proposed by Azzalini (1985). This approach ensures the model identifiability as pointed out by Azevedo et al. (2009b). Also, a Metropolis Hastings within Gibbs sampling (MHWGS) algorithm was built for parameter estimation by using an augmented data approach. A simulation study was performed in order to assess the parameter recovery in the proposed model and the estimation method, and the effect of the asymmetry level of the latent traits distribution on the parameter estimation. Also, a comparison of our approach with other estimation methods (which consider the assumption of symmetric normality for the latent traits distribution) was considered. The results indicated that our proposed algorithm recovers properly all parameters. Specifically, the greater the asymmetry level, the better the performance of our approach compared with other approaches, mainly in the presence of small sample sizes (number of examinees). Furthermore, we analyzed a real data set which presents indication of asymmetry concerning the latent traits distribution. The results obtained by using our approach confirmed the presence of strong negative asymmetry of the latent traits distribution. (C) 2010 Elsevier B.V. All rights reserved.