47 results for regression discrete models
Abstract:
This article presents important properties of standard discrete distributions and their conjugate densities. The Bernoulli and Poisson processes are described as generators of such discrete models. A characterization of distributions by mixtures is also introduced. This article adopts a novel singular notation and representation. Singular representations are unusual in statistical texts. Nevertheless, the singular notation makes it simpler to extend and generalize theoretical results and greatly facilitates numerical and computational implementation.
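As a minimal sketch of what a conjugate density buys in practice, the two updates below cover the Bernoulli and Poisson cases named in the abstract. The choice of the Beta and Gamma priors (the textbook conjugates) and the function names are assumptions for illustration; the abstract does not show the article's own notation.

```python
def beta_bernoulli_update(alpha, beta, observations):
    """Posterior Beta(alpha', beta') after observing 0/1 Bernoulli outcomes.

    Hypothetical helper: the prior is Beta(alpha, beta) on the success
    probability; each success adds 1 to alpha, each failure adds 1 to beta.
    """
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def gamma_poisson_update(shape, rate, counts):
    """Posterior Gamma(shape', rate') after observing Poisson counts.

    Hypothetical helper: the prior is Gamma(shape, rate) on the Poisson mean
    (rate parametrization); the total count adds to the shape and the number
    of observations adds to the rate.
    """
    return shape + sum(counts), rate + len(counts)
```

In both cases the posterior stays in the prior's family, so updating reduces to adding sufficient statistics to the prior parameters.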
Abstract:
The Brazilian Osteoporosis Study (BRAZOS) is the first epidemiological study carried out in a representative sample of Brazilian men and women aged 40 years or older with the purpose of identifying the prevalence and the main clinical risk factors (CRF) associated with osteoporotic fracture in our population. A total of 2,420 individuals (70% women) from 150 different cities in the five geographic regions of Brazil, and from all socioeconomic classes, were selected to participate in the present survey. Anthropometric data as well as life habits, fracture history, food intake, physical activity, falls and quality of life were determined by individual quantitative interviews. The representative sampling was based on Brazilian national data provided by the 2000 and 2003 census. Low-trauma fracture was defined as that resulting from a fall from standing height or less in individuals 50 years or older at specific skeletal sites: forearm, femur, ribs, vertebra and humerus. Sampling error was 2.2% with 95% confidence intervals. Logistic regression models were designed with fragility fracture as the dependent variable and all other parameters as independent variables. The significance level was set at p < 0.05. The mean age, height and weight for men and women were 58.4 ± 12.8 and 60.1 ± 13.7 years, 1.67 ± 0.08 and 1.56 ± 0.07 m, and 73.3 ± 14.7 and 64.7 ± 13.7 kg, respectively. About 15.1% of the women and 12.8% of the men reported fragility fractures.
In the women, the main CRF associated with fractures were advanced age (OR = 1.6; 95% CI 1.06-2.4), family history of hip fracture (OR = 1.7; 95% CI 1.1-2.8), early menopause (OR = 1.7; 95% CI 1.02-2.9), sedentary lifestyle (OR = 1.6; 95% CI 1.02-2.7), poor quality of life (OR = 1.9; 95% CI 1.2-2.9), higher intake of phosphorus (OR = 1.9; 95% CI 1.2-2.9), diabetes mellitus (OR = 2.8; 95% CI 1.01-8.2), use of benzodiazepine drugs (OR = 2.0; 95% CI 1.1-3.6) and recurrent falls (OR = 2.4; 95% CI 1.2-5.0). In the men, the main CRF were poor quality of life (OR = 3.2; 95% CI 1.7-6.1), current smoking (OR = 3.5; 95% CI 1.28-9.77), diabetes mellitus (OR = 4.2; 95% CI 1.27-13.7) and sedentary lifestyle (OR = 6.3; 95% CI 1.1-36.1). Our findings suggest that CRF may serve as an important tool to identify men and women at higher risk of osteoporotic fractures and that interventions aimed at specific risk factors (smoking cessation, regular physical activity, prevention of falls) may help patients reduce their risk of fracture.
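The odds ratios and intervals reported above are the usual way of presenting logistic regression coefficients. The sketch below shows the conversion, assuming Wald-type intervals on the log-odds scale; the abstract does not say how the study's intervals were computed, and the function names are hypothetical.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def odds_ratio_ci(coef, se, z=Z_95):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    Assumes a Wald interval: coef +/- z * se on the log-odds scale,
    exponentiated to the odds-ratio scale.
    """
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)


def implied_coef_and_se(lower, upper, z=Z_95):
    """Recover the log-odds coefficient and standard error implied by a
    reported Wald interval (lower, upper) on the odds-ratio scale."""
    log_lo, log_hi = math.log(lower), math.log(upper)
    return (log_lo + log_hi) / 2, (log_hi - log_lo) / (2 * z)
```

For example, `implied_coef_and_se(1.1, 2.8)` gives the coefficient and standard error consistent with the interval reported for family history of hip fracture in women, up to the rounding of the published bounds. A Wald interval is symmetric on the log scale, which is why wide intervals such as 1.1-36.1 look so lopsided on the odds-ratio scale.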
Abstract:
In this article, we generalize the Bayesian methodology introduced by Cepeda and Gamerman (2001) for modeling variance heterogeneity in normal regression models, where the mean and variance parameters are orthogonal, to the general case covering both linear and highly nonlinear regression models. Under the Bayesian paradigm, we use MCMC methods to simulate samples from the joint posterior distribution. We illustrate this algorithm with a simulated data set and with a real data set on school attendance rates for children in Colombia. Finally, we present some extensions of the proposed MCMC algorithm.
Abstract:
In this paper, we compare the performance of two statistical approaches for the analysis of data obtained from the social research area. In the first approach, we use normal models with joint regression modelling for the mean and for the variance heterogeneity. In the second approach, we use hierarchical models. In the first case, individual and social variables are included as explanatory variables in the regression modelling for the mean and for the variance, while in the second case, the variance at level 1 of the hierarchical model depends on the individuals (age of the individuals), and at level 2 the variance is assumed to change according to socioeconomic stratum. Applying these methodologies, we analyze a Colombian height data set to find differences that can be explained by socioeconomic conditions. We also present some theoretical and empirical results concerning the two models. From this comparative study, we conclude that it is better to jointly model the mean and variance heterogeneity in all cases. We also observe that the Gibbs sampling chain used in the Markov chain Monte Carlo method for jointly modeling the mean and variance heterogeneity converges quickly.
Abstract:
The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of the symmetrical nonlinear regression models, since the error distributions cover both skewed and heavy-tailed distributions such as the skew-t, skew-slash and skew-contaminated normal distributions. The main advantage of this class of distributions is that it has a convenient hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robustness of this flexible class against outlying and influential observations, we present Bayesian case-deletion influence diagnostics based on the Kullback-Leibler divergence. Further, some discussion of model selection criteria is given. The newly developed procedures are illustrated with two simulation studies and a real data set previously analyzed under normal and skew-normal nonlinear regression models. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
The purpose of this paper is to develop a Bayesian approach for log-Birnbaum-Saunders Student-t regression models under right-censored survival data. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the considered model. In order to attenuate the influence of the outlying observations on the parameter estimates, we present in this paper Birnbaum-Saunders models in which a Student-t distribution is assumed to explain the cumulative damage. Also, some discussions on the model selection to compare the fitted models are given and case deletion influence diagnostics are developed for the joint posterior distribution based on the Kullback-Leibler divergence. The developed procedures are illustrated with a real data set. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
In this paper, the generalized log-gamma regression model is modified to allow the possibility that long-term survivors may be present in the data. This modification leads to a generalized log-gamma regression model with a cure rate, encompassing, as special cases, the log-exponential, log-Weibull and log-normal regression models with a cure rate typically used to model such data. The models attempt to simultaneously estimate the effects of explanatory variables on the timing acceleration/deceleration of a given event and the surviving fraction, that is, the proportion of the population for which the event never occurs. The normal curvatures of local influence are derived under some usual perturbation schemes and two martingale-type residuals are proposed to assess departures from the generalized log-gamma error assumption as well as to detect outlying observations. Finally, a data set from the medical area is analyzed.
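To make the surviving fraction concrete, the sketch below writes a population survival function with a cure rate. It assumes the standard mixture formulation and, for brevity, a Weibull lifetime for the susceptible subjects (corresponding to the log-Weibull special case mentioned above); the paper's own parametrization through the generalized log-gamma distribution is not given in the abstract, and all names here are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """Survival function of a Weibull lifetime: exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def population_survival(t, cure_fraction, shape, scale):
    """Mixture cure model: S_pop(t) = p + (1 - p) * S(t).

    p is the surviving (cured) fraction, for whom the event never occurs;
    the remaining 1 - p follow the Weibull survival function S(t).
    """
    susceptible = weibull_survival(t, shape, scale)
    return cure_fraction + (1.0 - cure_fraction) * susceptible
```

Under this form the population survival curve starts at 1 and levels off at the cure fraction instead of decaying to zero, which is the plateau that signals long-term survivors in the data.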
Abstract:
We introduce in this paper a new class of discrete generalized nonlinear models to extend the binomial, Poisson and negative binomial models to cope with count data. This class of models includes some important models such as log-nonlinear models, logit, probit and negative binomial nonlinear models, generalized Poisson and generalized negative binomial regression models, among other models, which enables the fitting of a wide range of models to count data. We derive an iterative process for fitting these models by maximum likelihood and discuss inference on the parameters. The usefulness of the new class of models is illustrated with an application to a real data set. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
In survival analysis applications, the failure rate function may frequently present a unimodal shape. In such cases, the log-normal or log-logistic distributions are used. In this paper, we shall be concerned only with parametric forms, so a location-scale regression model based on the Burr XII distribution is proposed for modeling data with a unimodal failure rate function as an alternative to the log-logistic regression model. Assuming censored data, we consider a classic analysis, a Bayesian analysis and a jackknife estimator for the parameters of the proposed model. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and compared to the performance of the log-logistic and log-Burr XII regression models. In addition, we use sensitivity analysis to detect influential or outlying observations, and residual analysis is used to check the assumptions in the model. Finally, we analyze a real data set under log-Burr XII regression models. (C) 2008 Published by Elsevier B.V.
Abstract:
The absorption spectrum of the acid form of pterin in water was investigated theoretically. Different procedures using continuum, discrete, and explicit models were used to include the solvation effect on the absorption spectrum, characterized by two bands. The discrete and explicit models used Monte Carlo simulation to generate the liquid structure and time-dependent density functional theory (B3LYP/6-31G+(d)) to obtain the excitation energies. The discrete model failed to give the correct qualitative effect on the second absorption band. The continuum model, in turn, gave a correct qualitative picture and a semiquantitative description. The explicit use of 29 solvent molecules, forming a hydration shell of 6 angstrom, embedded in the electrostatic field of the remaining solvent molecules, gives absorption transitions at 3.67 and 4.59 eV, in excellent agreement with the S(0)-S(1) and S(0)-S(2) absorption bands at 3.66 and 4.59 eV, respectively, that characterize the experimental spectrum of pterin in an aqueous environment. (C) 2010 Wiley Periodicals, Inc. Int J Quantum Chem 110: 2371-2377, 2010
Abstract:
In this article, we compare three residuals based on the deviance component in generalised log-gamma regression models with censored observations. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and the empirical distribution of each residual is displayed and compared with the standard normal distribution. For all cases studied, the empirical distributions of the proposed residuals are in general symmetric around zero, but only a martingale-type residual presented negligible kurtosis for the majority of the cases studied. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended for the martingale-type residual in generalised log-gamma regression models with censored data. A lifetime data set is analysed under log-gamma regression models and a model checking based on the martingale-type residual is performed.
Abstract:
The class of symmetric linear regression models has the normal linear regression model as a special case and includes several models that assume that the errors follow a symmetric distribution with longer-than-normal tails. An important member of this class is the t linear regression model, which is commonly used as an alternative to the usual normal regression model when the data contain extreme or outlying observations. In this article, we develop second-order asymptotic theory for score tests in this class of models. We obtain Bartlett-corrected score statistics for testing hypotheses on the regression and the dispersion parameters. The corrected statistics have chi-squared distributions with errors of order O(n^{-3/2}), n being the sample size. The corrections represent an improvement over the corresponding original Rao's score statistics, which are chi-squared distributed up to errors of order O(n^{-1}). Simulation results show that the corrected score tests perform much better than their uncorrected counterparts in samples of small or moderate size.
Abstract:
We present simple matrix formulae for corrected score statistics in symmetric nonlinear regression models. The corrected score statistics follow a chi-squared distribution more closely than the classical score statistic. Our simulation results indicate that the corrected score tests display smaller size distortions than the original score test. We also compare the sizes and the powers of the corrected score tests with bootstrap-based score tests.
Abstract:
The main object of this paper is to discuss the Bayes estimation of the regression coefficients in the elliptically distributed simple regression model with measurement errors. The posterior distribution for the line parameters is obtained in a closed form, considering the following: the ratio of the error variances is known, informative prior distribution for the error variance, and non-informative prior distributions for the regression coefficients and for the incidental parameters. We proved that the posterior distribution of the regression coefficients has at most two real modes. Situations with a single mode are more likely than those with two modes, especially in large samples. The precision of the modal estimators is studied by deriving the Hessian matrix, which although complicated can be computed numerically. The posterior mean is estimated by using the Gibbs sampling algorithm and approximations by normal distributions. The results are applied to a real data set and connections with results in the literature are reported. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
In this article, we introduce a semi-parametric Bayesian approach based on Dirichlet process priors for the discrete calibration problem in binomial regression models. An interesting topic is the dosimetry problem related to the dose-response model. A hierarchical formulation is provided so that a Markov chain Monte Carlo approach is developed. The methodology is applied to simulated and real data.