878 results for regression discrete models
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
We consider model selection uncertainty in linear regression. We study theoretically and by simulation the approach of Buckland and co-workers, who proposed estimating a parameter common to all models under study by taking a weighted average over the models, using weights obtained from information criteria or the bootstrap. This approach is compared with the usual approach in which the 'best' model is used, and with Bayesian model averaging. The weighted predictor behaves similarly to model averaging, with generally more realistic mean-squared errors than the usual model-selection-based estimator.
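As a rough illustration of the weighted-average idea in this abstract, the sketch below turns information-criterion scores into model weights and averages a parameter common to all models. It assumes AIC-based (Akaike) weights, which is one common choice; the AIC values and per-model estimates are hypothetical, and the function names are illustrative rather than taken from the paper.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC scores into normalized model weights (assumed weighting scheme)."""
    best = min(aic_values)
    # Subtracting the best score keeps the exponentials numerically stable.
    raw = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(raw)
    return [r / total for r in raw]

def averaged_estimate(estimates, weights):
    """Weighted average of the per-model estimates of a common parameter."""
    return sum(w * e for w, e in zip(weights, estimates))

# Hypothetical AIC scores and estimates of the same parameter from three models.
aic = [100.0, 102.0, 110.0]
theta = [1.20, 1.50, 0.90]

w = akaike_weights(aic)
theta_avg = averaged_estimate(theta, w)
print(w, theta_avg)
```

The "usual" approach mentioned in the abstract corresponds to keeping only the estimate from the model with the smallest criterion value, that is, giving it weight one and all others weight zero.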
Abstract:
This paper examines the investment decisions of 373 large Brazilian firms from 1997 to 2004 in the presence of financial constraints, using panel data. A Bayesian econometric model was used, with ridge regression to address multicollinearity among the variables in the model. Prior distributions are assumed for the parameters, classifying the model into random or fixed effects. We used a Bayesian approach to estimate the parameters, considering normal and Student t distributions for the error, and assumed that the initial values for the lagged dependent variable are not fixed but generated by a random process. The recursive predictive density criterion was used for model comparisons. Twenty models were tested, and the results indicated that multicollinearity does influence the value of the estimated parameters. Controlling for capital intensity, financial constraints are found to be more important for capital-intensive firms, probably due to their lower profitability indexes, higher fixed costs and higher degree of property diversification.
Abstract:
Environmental data are spatial, temporal, and often come with many zeros. In this paper, we included space–time random effects in zero-inflated Poisson (ZIP) and ‘hurdle’ models to investigate haulout patterns of harbor seals on glacial ice. The data consisted of counts, for 18 dates on a lattice grid of samples, of harbor seals hauled out on glacial ice in Disenchantment Bay, near Yakutat, Alaska. A hurdle model is similar to a ZIP model except it does not mix zeros from the binary and count processes. Both models can be used for zero-inflated data, and we compared space–time ZIP and hurdle models in a Bayesian hierarchical model. Space–time ZIP and hurdle models were constructed by using spatial conditional autoregressive (CAR) models and temporal first-order autoregressive (AR(1)) models as random effects in ZIP and hurdle regression models. We created maps of smoothed predictions for harbor seal counts based on ice density, other covariates, and spatio-temporal random effects. For both models, predictions around the edges appeared to be positively biased. The linex loss function is an asymmetric loss function that penalizes overprediction more than underprediction, and we used it to correct for prediction bias to get the best map for space–time ZIP and hurdle models.
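The difference between the two count models, and the asymmetry of the linex loss, can be made concrete with a small sketch. It leaves out the space–time random effects entirely and uses a constant zero probability `pi` and Poisson rate `lam`, both hypothetical. The linex form b(exp(a·d) − a·d − 1) with d = prediction − truth and a > 0 is the standard textbook parameterization and is assumed here, not taken from the paper.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: zeros arise from both the binary and the count process."""
    if k == 0:
        return pi + (1.0 - pi) * math.exp(-lam)
    return (1.0 - pi) * poisson_pmf(k, lam)

def hurdle_pmf(k, pi, lam):
    """Hurdle: all zeros come from the binary part; positives are zero-truncated Poisson."""
    if k == 0:
        return pi
    return (1.0 - pi) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))

def linex_loss(pred, truth, a=1.0, b=1.0):
    """Asymmetric loss; with a > 0, overprediction costs more than underprediction."""
    d = pred - truth
    return b * (math.exp(a * d) - a * d - 1.0)

def linex_optimal_prediction(samples, a=1.0):
    """Prediction minimizing the average linex loss over draws of the unknown count."""
    mean_exp = sum(math.exp(-a * y) for y in samples) / len(samples)
    return -math.log(mean_exp) / a

# Hypothetical predictive draws: the linex-optimal value sits below their mean.
draws = [0, 1, 2, 3]
print(linex_optimal_prediction(draws), sum(draws) / len(draws))
```

Because the linex-optimal prediction is never larger than the mean of the draws when a > 0, it pulls predictions downward, which is the direction needed to offset a positive bias.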
Abstract:
The study introduces a new regression model developed to estimate the hourly values of diffuse solar radiation at the surface. The model is based on the relationship between the clearness index and the diffuse fraction, and includes the effects of cloud (cloudiness and cloud type), traditional meteorological variables (air temperature, relative humidity and atmospheric pressure observed at the surface) and air pollution (concentration of particulate matter observed at the surface). The new model predicts hourly values of diffuse solar radiation better than previously developed ones (R² = 0.93 and RMSE = 0.085). A simple, widely applicable version is proposed that takes into consideration cloud effects only (cloudiness and cloud height) and shows an R² of 0.92. © 2011 Elsevier Ltd. All rights reserved.
Abstract:
In this paper, we propose a cure rate survival model by assuming that the number of competing causes of the event of interest follows the geometric distribution and the time to event follows a Birnbaum-Saunders distribution. We consider a frequentist analysis for parameter estimation of the geometric Birnbaum-Saunders model with cure rate. Finally, we analyze a data set from the medical area. © 2011 Elsevier B.V. All rights reserved.
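To show how the two distributional assumptions combine, the sketch below evaluates a population survival function for a cure rate model. It assumes the parameterization P(N = n) = (1 − theta)·theta^n for the number of competing causes and a Birnbaum-Saunders time to event with shape `alpha` and scale `beta`; the paper may use a different parameterization, and the parameter values are hypothetical.

```python
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_survival(t, alpha, beta):
    """Survival function of a Birnbaum-Saunders distribution (shape alpha, scale beta)."""
    if t <= 0:
        return 1.0
    z = (math.sqrt(t / beta) - math.sqrt(beta / t)) / alpha
    return 1.0 - std_normal_cdf(z)

def population_survival(t, theta, alpha, beta):
    """Population survival when the number of causes is geometric (assumed form).

    This is the probability generating function of N evaluated at S(t):
    S_pop(t) = (1 - theta) / (1 - theta * S(t)).
    """
    s = bs_survival(t, alpha, beta)
    return (1.0 - theta) / (1.0 - theta * s)

# Hypothetical parameters; the curve levels off at the cure fraction 1 - theta.
for t in (0.0, 1.0, 2.0, 5.0, 50.0):
    print(t, round(population_survival(t, theta=0.6, alpha=0.5, beta=2.0), 4))
```

Under this parameterization the cured fraction is P(N = 0) = 1 − theta, which is the value the population survival function approaches for large t.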
Abstract:
Estimates of evapotranspiration on a local scale are important information for agricultural and hydrological practices. However, equations that estimate potential evapotranspiration from temperature data alone, although simple to use, are usually less trustworthy than the Food and Agriculture Organization (FAO) Penman-Monteith standard method. The present work describes two correction procedures for temperature-based potential evapotranspiration estimates that make the results more reliable. Initially, the standard FAO Penman-Monteith method was evaluated with a complete climatological data set for the period between 2002 and 2006. Then temperature-based estimates from the Camargo and Jensen-Haise methods were adjusted using the error autocorrelation evaluated over biweekly and monthly periods. In a second adjustment, simple linear regression was applied. The adjusted equations were validated with climatic data available for the year 2001. Both proposed methodologies showed good agreement with the standard method, indicating that they can be used for local potential evapotranspiration estimates.
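The second adjustment mentioned in this abstract, a simple linear regression, could look roughly like the sketch below: the reference (Penman-Monteith) values are regressed on the temperature-based estimates, and the fitted line is then used to correct new estimates. The numbers are invented for illustration, and the abstract does not state which variable the authors treated as the response, so that choice is an assumption.

```python
def fit_simple_linear(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical evapotranspiration values (mm/day), not data from the study.
temperature_based = [2.0, 3.0, 4.0, 5.0]
reference = [2.1, 2.9, 3.7, 4.5]

intercept, slope = fit_simple_linear(temperature_based, reference)

def corrected(estimate):
    """Apply the fitted correction to a new temperature-based estimate."""
    return intercept + slope * estimate

print(intercept, slope, corrected(3.5))
```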
Abstract:
The issue of assessing variance components is essential in deciding on the inclusion of random effects in the context of mixed models. In this work we discuss this problem by supposing nonlinear elliptical models for correlated data and by using the score-type test proposed in Silvapulle and Silvapulle (1995). Being asymptotically equivalent to the likelihood ratio test and only requiring estimation under the null hypothesis, this test provides a fairly easily computable alternative for assessing one-sided hypotheses in the context of the marginal model. Taking into account the possible non-normal distribution, we assume that the joint distribution of the response variable and the random effects lies in the elliptical class, which includes light-tailed and heavy-tailed distributions such as Student-t, power exponential, logistic, generalized Student-t, generalized logistic, contaminated normal, and the normal itself, among others. We compare the sensitivity of the score-type test under normal, Student-t and power exponential models for the kinetics data set discussed in Vonesh and Carter (1992) and fitted using the model presented in Russo et al. (2009). Also, a simulation study is performed to analyze the consequences of kurtosis misspecification.
Abstract:
Background: Tuberculosis (TB) remains a public health issue worldwide. The lack of specific clinical symptoms to diagnose TB makes the correct decision to admit patients to respiratory isolation a difficult task for the clinician. Isolation of patients without the disease is common and increases health costs. Decision models for the diagnosis of TB in patients attending hospitals can increase the quality of care and decrease costs, without the risk of hospital transmission. We present a model for predicting pulmonary TB in hospitalized patients in a high-prevalence area in order to contribute to a more rational use of isolation rooms without increasing the risk of transmission. Methods: Cross-sectional study of patients admitted to CFFH from March 2003 to December 2004. A classification and regression tree (CART) model was generated and validated. The area under the ROC curve (AUC), sensitivity, specificity, and positive and negative predictive values were used to evaluate the performance of the model. Validation of the model was performed with a different sample of patients admitted to the same hospital from January to December 2005. Results: We studied 290 patients admitted with clinical suspicion of TB. Diagnosis was confirmed in 26.5% of them. Pulmonary TB was present in 83.7% of the patients with TB (62.3% with positive sputum smear) and HIV/AIDS was present in 56.9% of patients. The validated CART model showed sensitivity, specificity, positive predictive value and negative predictive value of 60.00%, 76.16%, 33.33%, and 90.55%, respectively. The AUC was 79.70%. Conclusions: The CART model developed for these hospitalized patients with clinical suspicion of TB had fair to good predictive performance for pulmonary TB. The most important variable for prediction of TB diagnosis was chest radiograph results. Prospective validation is still necessary, but our model offers an alternative for decision making on whether to isolate patients with clinical suspicion of TB in tertiary health facilities in countries with limited resources.
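For readers less familiar with the performance measures reported in this abstract, the sketch below computes them from the four cells of a two-by-two table. The counts are hypothetical and are not the study's data; the abstract does not report the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly flagged
        "specificity": tn / (tn + fp),  # non-diseased patients correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who truly have the disease
        "npv": tn / (tn + fn),          # cleared patients who truly do not
    }

# Hypothetical counts for illustration only.
m = diagnostic_metrics(tp=30, fp=20, fn=10, tn=140)
for name, value in m.items():
    print(f"{name}: {value:.2%}")
```

A high negative predictive value alongside a modest positive predictive value, as in the abstract, is the typical pattern when the disease is present in a minority of the patients tested.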
Abstract:
Model diagnostics is an integral part of model determination, and residual analysis is an important part of model diagnostics. We adapt and implement residuals considered in the literature for the probit, logistic and skew-probit links under binary regression. New latent residuals for the skew-probit link are proposed here. We have detected the presence of outliers using the residuals proposed here for different models in a simulated dataset and a real medical dataset.
Abstract:
Changepoint regression models were originally developed in connection with applications in quality control, where a change from the in-control to the out-of-control state has to be detected based on the available random observations. Up to now, various changepoint models have been suggested for different applications such as reliability, econometrics or medicine. In many practical situations the covariate cannot be measured precisely, and errors-in-variables regression models are an alternative. In this paper we study the errors-in-variables regression model with a changepoint from a Bayesian approach. From the simulation study we found that the proposed procedure produces suitable estimates for the changepoint and all other model parameters.
Abstract:
For the first time, we introduce a generalized form of the exponentiated generalized gamma distribution [Cordeiro et al. The exponentiated generalized gamma distribution with application to lifetime data, J. Statist. Comput. Simul. 81 (2011), pp. 827-842.] that is the baseline for the log-exponentiated generalized gamma regression model. The new distribution can accommodate increasing, decreasing, bathtub- and unimodal-shaped hazard functions. A second advantage is that it includes classical distributions reported in the lifetime literature as special cases. We obtain explicit expressions for the moments of the baseline distribution of the new regression model. The proposed model can be applied to censored data since it includes as sub-models several widely known regression models. It therefore can be used more effectively in the analysis of survival data. We obtain maximum likelihood estimates for the model parameters by considering censored data. We show that our extended regression model is very useful by means of two applications to real data.
Abstract:
Background: Several models have been designed to predict survival of patients with heart failure. These, while available and widely used for both stratifying and deciding upon different treatment options on the individual level, have several limitations. Specifically, some clinical variables that may influence prognosis may have an influence that changes over time. Statistical models that include such characteristics may help in evaluating prognosis. The aim of the present study was to analyze and quantify the impact of modeling heart failure survival allowing for covariates with time-varying effects known to be independent predictors of overall mortality in this clinical setting. Methodology: Survival data from an inception cohort of five hundred patients diagnosed with heart failure functional class III and IV between 2002 and 2004 and followed up to 2006 were analyzed using the Cox proportional hazards model, variations of the Cox model, and variations of Aalen's additive model. Principal Findings: One hundred and eighty-eight (188) patients died during follow-up. For patients under study, age, serum sodium, hemoglobin, serum creatinine, and left ventricular ejection fraction were significantly associated with mortality. Evidence of time-varying effect was suggested for the last three. Both high hemoglobin and high LV ejection fraction were associated with a reduced risk of dying, with a stronger initial effect. High creatinine, associated with an increased risk of dying, also presented a stronger initial effect. The impact of age and sodium was constant over time. Conclusions: The current study points to the importance of evaluating covariates with time-varying effects in heart failure models. The analysis performed suggests that variations of Cox and Aalen models constitute a valuable tool for identifying these variables. The implementation of covariates with time-varying effects into heart failure prognostication models may reduce bias and increase the specificity of such models.
Abstract:
In this article, for the first time, we propose the negative binomial-beta Weibull (BW) regression model for studying the recurrence of prostate cancer and predicting the cure fraction for patients with clinically localized prostate cancer treated by open radical prostatectomy. The cure model considers that a fraction of the survivors are cured of the disease. The survival function for the population of patients can be modeled by a parametric cure model using the BW distribution. We derive an explicit expansion for the moments of the recurrence time distribution for the uncured individuals. The proposed distribution can be used to model survival data when the hazard rate function is increasing, decreasing, unimodal or bathtub shaped. Another advantage is that the proposed model includes as special sub-models some of the well-known cure rate models discussed in the literature. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes. We analyze a real data set for localized prostate cancer patients after open radical prostatectomy.