144 results for Bayesian free-knot regression splines
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
This paper examines the investment decisions of 373 large Brazilian firms from 1997 to 2004 in the presence of financial constraints, using panel data. A Bayesian econometric model was used, with ridge regression to handle multicollinearity among the variables in the model. Prior distributions are assumed for the parameters, classifying the model into random or fixed effects. We used a Bayesian approach to estimate the parameters, considering normal and Student t distributions for the error, and assumed that the initial values for the lagged dependent variable are not fixed but generated by a random process. The recursive predictive density criterion was used for model comparisons. Twenty models were tested and the results indicated that multicollinearity does influence the value of the estimated parameters. Controlling for capital intensity, financial constraints are found to be more important for capital-intensive firms, probably due to their lower profitability indexes, higher fixed costs and higher degree of property diversification.
Abstract:
The main object of this paper is to discuss the Bayes estimation of the regression coefficients in the elliptically distributed simple regression model with measurement errors. The posterior distribution for the line parameters is obtained in a closed form, considering the following: the ratio of the error variances is known, informative prior distribution for the error variance, and non-informative prior distributions for the regression coefficients and for the incidental parameters. We proved that the posterior distribution of the regression coefficients has at most two real modes. Situations with a single mode are more likely than those with two modes, especially in large samples. The precision of the modal estimators is studied by deriving the Hessian matrix, which although complicated can be computed numerically. The posterior mean is estimated by using the Gibbs sampling algorithm and approximations by normal distributions. The results are applied to a real data set and connections with results in the literature are reported. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of the symmetrical nonlinear regression models, since the error distributions cover both skewed and heavy-tailed distributions such as the skew-t, skew-slash and skew-contaminated normal distributions. The main advantage of this class of distributions is that it has a convenient hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robustness of this flexible class against outlying and influential observations, we present Bayesian case-deletion influence diagnostics based on the Kullback-Leibler divergence. Further, some discussion of model selection criteria is given. The newly developed procedures are illustrated using two simulation studies and a real data set previously analyzed under normal and skew-normal nonlinear regression models. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
We review several asymmetrical links for binary regression models and present a unified approach for two skew-probit links proposed in the literature. Moreover, under the skew-probit link, conditions for the existence of the ML estimators and of the posterior distribution under improper priors are established. The framework proposed here considers two sets of latent variables, which are helpful for implementing the Bayesian MCMC approach. A simulation study of model comparison criteria is conducted and two applications are presented. Using different Bayesian criteria, we show that, for these data sets, the skew-probit links are better than alternative links proposed in the literature.
Abstract:
In this paper, we compare the performance of two statistical approaches for the analysis of data obtained from the social research area. In the first approach, we use normal models with joint regression modelling for the mean and for the variance heterogeneity. In the second approach, we use hierarchical models. In the first case, individual and social variables are included as explanatory variables in the regression modelling for the mean and for the variance, while in the second case, the variance at level 1 of the hierarchical model depends on the individuals (age of the individuals), and at level 2 of the hierarchical model, the variance is assumed to change according to socioeconomic stratum. Applying these methodologies, we analyze a Colombian tallness data set to find differences that can be explained by socioeconomic conditions. We also present some theoretical and empirical results concerning the two models. From this comparative study, we conclude that it is better to jointly model the mean and variance heterogeneity in all cases. We also observe that the Gibbs sampling chain used in the Markov chain Monte Carlo method for jointly modelling the mean and variance heterogeneity converges quickly.
Abstract:
The purpose of this paper is to develop a Bayesian approach for log-Birnbaum-Saunders Student-t regression models under right-censored survival data. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the considered model. In order to attenuate the influence of the outlying observations on the parameter estimates, we present in this paper Birnbaum-Saunders models in which a Student-t distribution is assumed to explain the cumulative damage. Also, some discussions on the model selection to compare the fitted models are given and case deletion influence diagnostics are developed for the joint posterior distribution based on the Kullback-Leibler divergence. The developed procedures are illustrated with a real data set. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
We have considered a Bayesian approach for the nonlinear regression model by replacing the normal distribution of the error term with skewed distributions, which account for both skewness and heavy tails or for skewness alone. The type of data considered in this paper concerns repeated measurements taken over time on a set of individuals. Such multiple observations on the same individual generally produce serially correlated outcomes. Thus, additionally, our model allows for correlation between observations made on the same individual. We have illustrated the procedure using a data set to study the growth curves of a clinical measurement in a group of pregnant women from an obstetrics clinic in Santiago, Chile. Parameter estimation and prediction were carried out using appropriate posterior simulation schemes based on Markov chain Monte Carlo methods. Besides the deviance information criterion (DIC) and the conditional predictive ordinate (CPO), we suggest the use of proper scoring rules based on the posterior predictive distribution for comparing models. For our data set, all these criteria chose the skew-t model as the best model for the errors. The DIC and CPO criteria are also validated, for the model proposed here, through a simulation study. This study concludes that the DIC criterion is not trustworthy for this kind of complex model.
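The conditional predictive ordinate mentioned above is commonly estimated from MCMC output as the harmonic mean of the per-observation likelihood over the posterior draws. The following is a minimal sketch of that standard estimator, not the authors' code; the function names `cpo` and `lpml` and the likelihood values are hypothetical.

```python
import math

def cpo(likelihood_draws):
    """Harmonic-mean estimate of CPO_i from f(y_i | theta_m), m = 1..M."""
    m = len(likelihood_draws)
    return m / sum(1.0 / f for f in likelihood_draws)

def lpml(likelihood_matrix):
    """Log pseudo-marginal likelihood: the sum of log CPO_i over observations."""
    return sum(math.log(cpo(row)) for row in likelihood_matrix)

# Hypothetical likelihood values: 2 observations (rows) x 3 posterior draws.
draws = [[0.50, 0.25, 0.50],
         [0.10, 0.20, 0.10]]
print([cpo(row) for row in draws], lpml(draws))
```

When comparing candidate error distributions, the model with the larger LPML is usually preferred under this criterion.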
Abstract:
Background: Coronary artery disease (CAD) is among the main causes of death in developed countries, and diet and lifestyle can influence CAD incidence. Objective: To evaluate the association of coronary artery disease risk score with dietary, anthropometric and biochemical components in adults clinically selected for a lifestyle modification program. Methods: 362 adults (96 men, 266 women, 53.9 ± 9.4 years) fulfilled the inclusion criteria by presenting all the required data. The Framingham score was calculated and the IV Brazilian Guideline on Dyslipidemia and Prevention of Atherosclerosis was adopted for classification of the CAD risks. Anthropometric assessments included waist circumference (WC), body fat, calculated BMI (kg/m²) and muscle-mass index (MMI, kg/m²). Dietary intake was estimated through 24 h dietary recall. Fasting blood was used for biochemical analysis. Metabolic Syndrome (MS) was diagnosed using NCEP-ATPIII (2001) criteria. Logistic regression was used to determine the odds of CAD risks according to the altered components of MS and to dietary, anthropometric, and biochemical components. Results: For a sample with a BMI of 28.5 ± 5.0 kg/m², the factors associated with lower risk (<10% CAD) were lower age (<60 years old) and plasma values of uric acid. The presence of MS within low, intermediary, and high CAD risk categories was 30.8%, 55.5%, and 69.8%, respectively. The independent risk factors associated with CAD risk score were MS and uric acid, and the protective factors were recommended intake of saturated fat and fiber and muscle mass index. Conclusion: Recommended intake of saturated fat and dietary fiber, together with proper muscle mass, are inversely associated with CAD risk score. On the other hand, the presence of MS and high plasma uric acid are associated with CAD risk score.
Abstract:
Objective: to identify risk factors associated with neonatal transfers from a free-standing birth centre to a hospital. Design: epidemiological case-control study. Setting: midwifery-led free-standing birth centre in Sao Paulo, Brazil. Participants: 96 newborns were selected from 2840 births between September 1998 and August 2005. Cases were defined as all newborns transferred from the birth centre to a hospital (n = 32), and controls were defined as newborns delivered at the same birth centre, during the same time period, who had not been transferred to a hospital (n = 64). Measurements and findings: data were collected from medical records available at the birth centre. Univariate and multivariate analyses were performed using logistic regression. The multivariate analysis included outcomes with p < 0.25, specifically: smoking during pregnancy, prenatal care appointments, labour complications, weight in relation to gestational age, and one-minute Apgar score. Of the foregoing outcomes, those that remained in the full regression model as risk factors associated with neonatal transfer were: smoking during pregnancy [p = 0.009, odds ratio (OR) = 4.1, 95% confidence interval (CI) 1.03-16.33], labour complications (p < 0.001, OR = 5.5, 95% CI 1.06-28.26) and one-minute Apgar score <= 7 (p < 0.001, OR = 7.8, 95% CI 1.62-37.03). Key conclusions and implications for practice: smoking during pregnancy, labour complications and one-minute Apgar score <= 7 were confirmed as risk factors for neonatal transfer from the birth centre to a hospital. The identified risk factors can help to improve institutional protocols and formulate hypotheses for other studies. (C) 2009 Elsevier Ltd. All rights reserved.
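In logistic regression, an odds ratio is the exponential of a coefficient, and a Wald-type 95% interval is obtained by exponentiating the coefficient plus or minus about 1.96 standard errors. The sketch below assumes the reported intervals are of this Wald type, which the abstract does not state; the helper names are hypothetical. It takes the reported OR of 4.1 for smoking, recovers a standard error from the reported interval (1.03-16.33), and rebuilds the interval.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def odds_ratio_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) for a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=Z_95):
    """Recover the standard error on the log-odds scale from a Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

beta = math.log(4.1)          # reported OR for smoking during pregnancy
se = se_from_ci(1.03, 16.33)  # assumes the reported CI is a Wald interval
or_, lo, hi = odds_ratio_ci(beta, se)
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

The rebuilt limits agree with the reported ones up to rounding, because the reported OR lies close to the geometric mean of the reported limits.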
Abstract:
Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes for which the variability can be explained using factors and/or covariates. When such factors operate, the usual normal regression models, which inherently exhibit constant variance, will under-represent variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect; binomial and Poisson models then underestimate the variability and, consequently, may incorrectly indicate significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The joint posterior density function was sampled using Markov chain Monte Carlo algorithms, allowing inferences on the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible, even when limited prior information is available, thereby giving the researcher valuable insight into the experimental results.
Abstract:
A total of 152,145 weekly test-day milk yield records from 7317 first lactations of Holstein cows distributed in 93 herds in southeastern Brazil were analyzed. Test-day milk yields were classified into 44 weekly classes of DIM. The contemporary groups were defined as herd-year-week of test-day. The model included direct additive genetic, permanent environmental and residual effects as random effects, and contemporary group and age of cow at calving (as a covariable, with linear and quadratic effects) as fixed effects. Mean trends were modeled by a cubic regression on orthogonal polynomials of DIM. Additive genetic and permanent environmental random effects were estimated by random regression on orthogonal Legendre polynomials. Residual variances were modeled using third- to seventh-order variance functions or a step function with 1, 6, 13, 17 and 44 variance classes. Results from Akaike's information criterion and Schwarz's Bayesian information criterion suggested that a model with a 7th-order Legendre polynomial for the additive effect, a 12th-order polynomial for the permanent environmental effect and a step function with 6 classes for residual variances fitted best. However, a parsimonious model, with a 6th-order Legendre polynomial for additive effects and a 7th-order polynomial for permanent environmental effects, yielded very similar genetic parameter estimates. (C) 2008 Elsevier B.V. All rights reserved.
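A random regression on Legendre polynomials needs the polynomial values at each test day, with days in milk (DIM) rescaled to the interval [-1, 1], and candidate models are then ranked by information criteria. The sketch below illustrates both pieces with textbook formulas. The DIM range of 5 to 305 days, the unnormalized polynomials, and all function names are assumptions for illustration, not details taken from the paper.

```python
import math

def standardize(dim, dim_min, dim_max):
    """Map days in milk onto [-1, 1], where Legendre polynomials are orthogonal."""
    return -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)

def legendre_basis(t, degree):
    """Return [P_0(t), ..., P_degree(t)] using Bonnet's recurrence."""
    values = [1.0, t]
    for n in range(1, degree):
        # (n + 1) P_{n+1}(t) = (2n + 1) t P_n(t) - n P_{n-1}(t)
        values.append(((2 * n + 1) * t * values[n] - n * values[n - 1]) / (n + 1))
    return values[: degree + 1]

def aic(log_lik, n_params):
    """Akaike's information criterion (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_lik

def bic(log_lik, n_params, n_obs):
    """Schwarz's Bayesian information criterion (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_lik

# Hypothetical DIM range; a mid-lactation day maps to t = 0.
t = standardize(155, 5, 305)
print(t, legendre_basis(t, 3))
```

Because the BIC penalty grows with the number of records, it tends to favour more parsimonious models than AIC when the data set is large.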
Abstract:
Fecal samples and behavioral data were collected on a fortnightly basis over an 11-month period from free-living male American kestrels living in southeast Brazil (22°S latitude). The aim was to investigate the seasonal changes in testicular and adrenal steroidogenic activity and their correlation with reproductive behaviors and environmental factors. The results revealed that the monthly means of fecal glucocorticoid metabolites in May and June were higher than those estimated in November. In parallel, the monthly mean of androgen metabolites in September was higher than those from January to April and from October to November. Molt took place from January to March, whereas copulation was observed from June to October but peaked in September. Nest activity and food transfer to females occurred predominantly in October, and parental behavior was noticed only in November. Territorial aggressions were rare and scattered throughout the year. Multiple regression analysis revealed that fecal androgen levels are predicted by photoperiod and copulation, while fecal glucocorticoid levels are predicted only by photoperiod. Bivariate correlations showed that fecal androgen metabolites were positively correlated with fecal glucocorticoid metabolites and copulation, but negatively correlated with molt. Additionally, copulation was positively correlated with food transfer to females and nest activity, but negatively correlated with molt. These findings suggest that male American kestrels living in southeast Brazil exhibit significant seasonal changes in fecal androgen and glucocorticoid concentrations, which seem to be stimulated by decreasing daylength but not by rainfall or temperature. (C) 2009 Elsevier Inc. All rights reserved.
Abstract:
In this article, we generalize the Bayesian methodology introduced by Cepeda and Gamerman (2001) for modeling variance heterogeneity in normal regression models with orthogonality between mean and variance parameters to the general case, covering both linear and highly nonlinear regression models. Under the Bayesian paradigm, we use MCMC methods to simulate samples from the joint posterior distribution. We illustrate this algorithm using a simulated data set and a real data set on school attendance rates for children in Colombia. Finally, we present some extensions of the proposed MCMC algorithm.
Abstract:
In this paper, we introduce a Bayesian analysis for multivariate survival data in the presence of a covariate vector and censored observations. Different "frailties" or latent variables are considered to capture the correlation among the survival times for the same individual. We assume Weibull or generalized Gamma distributions for right-censored lifetime data. We develop the Bayesian analysis using Markov chain Monte Carlo (MCMC) methods.
Abstract:
In this article, we evaluate different parameter estimation strategies for a multiple linear regression model. The model parameters were estimated using data from a clinical trial whose aim was to verify whether the maximum-force property from mechanical testing (EM-FM) is associated with femoral mass, femoral diameter and experimental group in ovariectomized female rats of the breed Rattus norvegicus albinus, Wistar variety. Three methodologies are compared for estimating the model parameters: the classical methodology, based on the least squares method; the Bayesian methodology, based on Bayes' theorem; and the bootstrap method, based on resampling procedures.
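The three estimation strategies named above can be contrasted on a toy problem. The sketch below simplifies to a single predictor, whereas the article uses a multiple regression. The Bayesian part assumes a known error variance, a flat prior on the intercept and a normal prior on the slope. The data and all function names are hypothetical and are not taken from the article.

```python
import random

def ols(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def bootstrap_slopes(x, y, n_boot=500, seed=0):
    """Slopes refitted on resampled (x, y) pairs (case resampling)."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # slope is undefined when all resampled x are equal
        slopes.append(ols(xs, [y[i] for i in idx])[1])
    return slopes

def bayes_slope(x, y, sigma2, prior_mean=0.0, prior_var=1e6):
    """Posterior mean and variance of the slope under a N(prior_mean, prior_var)
    prior, a flat prior on the intercept and known error variance sigma2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    precision = 1.0 / prior_var + sxx / sigma2
    return (prior_mean / prior_var + sxy / sigma2) / precision, 1.0 / precision

# Hypothetical toy data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.1, 6.9, 9.2, 10.8, 13.1]
print(ols(x, y), bayes_slope(x, y, sigma2=0.05))
slopes = bootstrap_slopes(x, y)
print(min(slopes), max(slopes))
```

With a vague prior the posterior mean of the slope is close to the least-squares estimate, and the spread of the bootstrap slopes gives a resampling-based view of its uncertainty.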