19 resultados para regression discrete models
Resumo:
Over the past few decades, the advantages of the visible-near infra-red (VisNIR) diffuse reflectance spectrometer (DRS) method have enabled prediction of soil organic carbon (SOC). In this study, SOC was predicted using regression models for samples taken from three sites (Gununo, Maybar and Anjeni) in Ethiopia. SOC was characterized in laboratory using conventional wet chemistry and VisNIR-DRS methods. Principal component analysis (PCA), principal component regression (PCR) and partial least square regression (PLS) models were developed using Unscrambler X 10.2. PCA results show that the first two components accounted for a minimum of 96% variation which increased for individual sites and with data treatments. Correlation (r), coefficient of determination (R2) and residual prediction deviation (RPD) were used to rate four models built. PLS model (r, R2, RPD) values for Anjeni were 0.9, 0.9 and 3.6; for Gununo values 0.6, 0.3 and 1.2; for Maybar values 0.6, 0.3 and 0.9, and for the three sites values 0.7, 0.6 and 1.5, respectively. PCR model values (r, R2, RPD) for Anjeni were 0.9, 0.8 and 2.7; for Gununo values 0.5, 0.3 and 1; for Maybar values 0.5, 0.1 and 0.7, and for the three sites values 0.7, 0.5 and 1.2, respectively. Comparison and testing of models shows superior performance of PLS to PCR. Models were rated as very poor (Maybar), poor (Gununo and three sites) and excellent (Anjeni). A robust model, Anjeni, is recommended for prediction of SOC in Ethiopia.
Resumo:
OBJECTIVES: This paper is concerned with checking goodness-of-fit of binary logistic regression models. For the practitioners of data analysis, the broad classes of procedures for checking goodness-of-fit available in the literature are described. The challenges of model checking in the context of binary logistic regression are reviewed. As a viable solution, a simple graphical procedure for checking goodness-of-fit is proposed. METHODS: The graphical procedure proposed relies on pieces of information available from any logistic analysis; the focus is on combining and presenting these in an informative way. RESULTS: The information gained using this approach is presented with three examples. In the discussion, the proposed method is put into context and compared with other graphical procedures for checking goodness-of-fit of binary logistic models available in the literature. CONCLUSION: A simple graphical method can significantly improve the understanding of any logistic regression analysis and help to prevent faulty conclusions.
Resumo:
This paper introduces and analyzes a stochastic search method for parameter estimation in linear regression models in the spirit of Beran and Millar [Ann. Statist. 15(3) (1987) 1131–1154]. The idea is to generate a random finite subset of a parameter space which will automatically contain points which are very close to an unknown true parameter. The motivation for this procedure comes from recent work of Dümbgen et al. [Ann. Statist. 39(2) (2011) 702–730] on regression models with log-concave error distributions.
Resumo:
The counterfactual decomposition technique popularized by Blinder (1973, Journal of Human Resources, 436–455) and Oaxaca (1973, International Economic Review, 693–709) is widely used to study mean outcome differences between groups. For example, the technique is often used to analyze wage gaps by sex or race. This article summarizes the technique and addresses several complications, such as the identification of effects of categorical predictors in the detailed decomposition or the estimation of standard errors. A new command called oaxaca is introduced, and examples illustrating its usage are given.
Resumo:
When considering data from many trials, it is likely that some of them present a markedly different intervention effect or exert an undue influence on the summary results. We develop a forward search algorithm for identifying outlying and influential studies in meta-analysis models. The forward search algorithm starts by fitting the hypothesized model to a small subset of likely outlier-free studies and proceeds by adding studies into the set one-by-one that are determined to be closest to the fitted model of the existing set. As each study is added to the set, plots of estimated parameters and measures of fit are monitored to identify outliers by sharp changes in the forward plots. We apply the proposed outlier detection method to two real data sets; a meta-analysis of 26 studies that examines the effect of writing-to-learn interventions on academic achievement adjusting for three possible effect modifiers, and a meta-analysis of 70 studies that compares a fluoride toothpaste treatment to placebo for preventing dental caries in children. A simple simulated example is used to illustrate the steps of the proposed methodology, and a small-scale simulation study is conducted to evaluate the performance of the proposed method. Copyright © 2016 John Wiley & Sons, Ltd.
Resumo:
Although associated with adverse outcomes in other cardiopulmonary diseases, limited evidence exists on the prognostic value of anaemia in patients with acute pulmonary embolism (PE). We sought to examine the associations between anaemia and mortality and length of hospital stay in patients with PE. We evaluated 14,276 patients with a primary diagnosis of PE from 186 hospitals in Pennsylvania, USA. We used random-intercept logistic regression to assess the association between anaemia at the time of presentation and 30-day mortality and discrete-time logistic hazard models to assess the association between anaemia and time to hospital discharge, adjusting for patient (age, gender, race, insurance type, clinical and laboratory variables) and hospital (region, size, teaching status) factors. Anaemia was present in 38.7% of patients at admission. Patients with anaemia had a higher 30-day mortality (13.7% vs. 6.3%; p <0.001) and a longer length of stay (geometric mean, 6.9 vs. 6.6 days; p <0.001) compared to patients without anaemia. In multivariable analyses, anaemia remained associated with an increased odds of death (OR 1.82, 95% CI: 1.60-2.06) and a decreased odds of discharge (OR 0.85, 95% CI: 0.82-0.89). Anaemia is very common in patients presenting with PE and is independently associated with an increased short-term mortality and length of stay.
Resumo:
Background: Accelerometry has been established as an objective method that can be used to assess physical activity behavior in large groups. The purpose of the current study was to provide a validated equation to translate accelerometer counts of the triaxial GT3X into energy expenditure in young children. Methods: Thirty-two children aged 5–9 years performed locomotor and play activities that are typical for their age group. Children wore a GT3X accelerometer and their energy expenditure was measured with indirect calorimetry. Twenty-one children were randomly selected to serve as development group. A cubic 2-regression model involving separate equations for locomotor and play activities was developed on the basis of model fit. It was then validated using data of the remaining children and compared with a linear 2-regression model and a linear 1-regression model. Results: All 3 regression models produced strong correlations between predicted and measured MET values. Agreement was acceptable for the cubic model and good for both linear regression approaches. Conclusions: The current linear 1-regression model provides valid estimates of energy expenditure for ActiGraph GT3X data for 5- to 9-year-old children and shows equal or better predictive validity than a cubic or a linear 2-regression model.
Resumo:
This paper studied two different regression techniques for pelvic shape prediction, i.e., the partial least square regression (PLSR) and the principal component regression (PCR). Three different predictors such as surface landmarks, morphological parameters, or surface models of neighboring structures were used in a cross-validation study to predict the pelvic shape. Results obtained from applying these two different regression techniques were compared to the population mean model. In almost all the prediction experiments, both regression techniques unanimously generated better results than the population mean model, while the difference on prediction accuracy between these two regression methods is not statistically significant (α=0.01).
Resumo:
Graphical presentation of regression results has become increasingly popular in the scientific literature, as graphs are much easier to read than tables in many cases. In Stata such plots can be produced by the -marginsplot- command. However, while -marginsplot- is very versatile and flexible, it has two major limitations: it can only process results left behind by -margins- and it can only handle one set of results at the time. In this article I introduce a new command called -coefplot- that overcomes these limitations. It plots results from any estimation command and combines results from several models into a single graph. The default behavior of -coefplot- is to plot markers for coefficients and horizontal spikes for confidence intervals. However, -coefplot- can also produce various other types of graphs. The capabilities of -coefplot- are illustrated in this article using a series of examples.
Resumo:
coefplot plots results from estimation commands or Stata matrices. Results from multiple models or matrices can be combined in a single graph. The default behavior of coefplot is to draw markers for coefficients and horizontal spikes for confidence intervals. However, coefplot can also produce various other types of graphs.
Resumo:
Purpose Recently, multiple clinical trials have demonstrated improved outcomes in patients with metastatic colorectal cancer. This study investigated if the improved survival is race dependent. Patients and Methods Overall and cancer-specific survival of 77,490 White and Black patients with metastatic colorectal cancer from the 1988–2008 Surveillance Epidemiology and End Results registry were compared using unadjusted and multivariable adjusted Cox proportional hazard regression as well as competing risk analyses. Results Median age was 69 years, 47.4 % were female and 86.0 % White. Median survival was 11 months overall, with an overall increase from 8 to 14 months between 1988 and 2008. Overall survival increased from 8 to 14 months for White, and from 6 to 13 months for Black patients. After multivariable adjustment, the following parameters were associated with better survival: White, female, younger, better educated and married patients, patients with higher income and living in urban areas, patients with rectosigmoid junction and rectal cancer, undergoing cancer-directed surgery, having well/moderately differentiated, and N0 tumors (p<0.05 for all covariates). Discrepancies in overall survival based on race did not change significantly over time; however, there was a significant decrease of cancer-specific survival discrepancies over time between White and Black patients with a hazard ratio of 0.995 (95 % confidence interval 0.991–1.000) per year (p=0.03). Conclusion A clinically relevant overall survival increase was found from 1988 to 2008 in this population-based analysis for both White and Black patients with metastatic colorectal cancer. Although both White and Black patients benefitted from this improvement, a slight discrepancy between the two groups remained.
Resumo:
Graphical display of regression results has become increasingly popular in presentations and in scientific literature because graphs are often much easier to read than tables. Such plots can be produced in Stata by the marginsplot command (see [R] marginsplot). However, while marginsplot is versatile and flexible, it has two major limitations: it can only process results left behind by margins (see [R] margins), and it can handle only one set of results at a time. In this article, I introduce a new command called coefplot that overcomes these limitations. It plots results from any estimation command and combines results from several models into one graph. The default behavior of coefplot is to plot markers for coefficients and horizontal spikes for confidence intervals. However, coefplot can also produce other types of graphs. I illustrate the capabilities of coefplot by using a series of examples.
Resumo:
Organizing and archiving statistical results and processing a subset of those results for publication are important and often underestimated issues in conducting statistical analyses. Because automation of these tasks is often poor, processing results produced by statistical packages is quite laborious and vulnerable to error. I will therefore present a new package called estout that facilitates and automates some of these tasks. This new command can be used to produce regression tables for use with spreadsheets, LaTeX, HTML, or word processors. For example, the results for multiple models can be organized in spreadsheets and can thus be archived in an orderly manner. Alternatively, the results can be directly saved as a publication-ready table for inclusion in, for example, a LaTeX document. estout is implemented as a wrapper for estimates table but has many additional features, such as support for mfx. However, despite its flexibility, estout is—I believe—still very straightforward and easy to use. Furthermore, estout can be customized via so-called defaults files. A tool to make available supplementary statistics called estadd is also provided.
Resumo:
robreg provides a number of robust estimators for linear regression models. Among them are the high breakdown-point and high efficiency MM-estimator, the Huber and bisquare M-estimator, and the S-estimator, each supporting classic or robust standard errors. Furthermore, basic versions of the LMS/LQS (least median of squares) and LTS (least trimmed squares) estimators are provided. Note that the moremata package, also available from SSC, is required.