15 results for Models performance
in DigitalCommons@The Texas Medical Center
Abstract:
System dynamics determine the function of cells, tissues, and organisms. Developing mathematical models and estimating their parameters are essential tasks in studying the dynamic behavior of biological systems, including metabolic networks, genetic regulatory networks, and signal transduction pathways, under perturbation by external stimuli. In general, biological dynamic systems are only partially observed; a natural way to model them is therefore with nonlinear state-space equations. Although statistical methods for parameter estimation in linear models of biological dynamic systems have been developed intensively in recent years, estimating both the states and the parameters of nonlinear dynamic systems remains a challenging task. In this report, we apply the extended Kalman filter (EKF) to the estimation of both states and parameters of nonlinear state-space models. To evaluate the performance of the EKF for parameter estimation, we apply it to a simulated dataset and two real datasets from the JAK-STAT and Ras/Raf/MEK/ERK signal transduction pathways. The preliminary results show that the EKF can accurately estimate the parameters and predict the states of nonlinear state-space equations modeling dynamic biochemical networks.
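The joint state/parameter estimation described above can be sketched with the standard state-augmentation trick: the unknown parameter is appended to the state vector, and the EKF linearizes the augmented transition at each step. This is a minimal toy sketch with invented dynamics dx/dt = -k*x, not the JAK-STAT or Ras/Raf/MEK/ERK models from the report.

```python
import numpy as np

# Minimal sketch: joint state/parameter estimation with an extended Kalman
# filter (EKF) via state augmentation. The toy model dx/dt = -k*x (a
# hypothetical stand-in for a biochemical decay reaction) is discretized
# with Euler step dt, and the unknown rate k is appended to the state so
# the filter estimates both simultaneously.

def ekf_joint(ys, dt=0.1, q=1e-5, r=0.05**2):
    # Augmented state z = [x, k]; transition: x' = x - k*x*dt, k' = k
    z = np.array([ys[0], 0.5])           # initial guesses for x and k
    P = np.diag([r, 1.0])                # initial covariance (k very uncertain)
    Q = np.diag([q, q])                  # small process noise keeps k adaptable
    for y in ys[1:]:
        x, k = z
        # Predict step with Jacobian F of the transition function
        z = np.array([x - k * x * dt, k])
        F = np.array([[1.0 - k * dt, -x * dt],
                      [0.0,           1.0  ]])
        P = F @ P @ F.T + Q
        # Update step: we observe x directly, so H = [1, 0]
        H = np.array([[1.0, 0.0]])
        S = H @ P @ H.T + r
        K = P @ H.T / S                  # Kalman gain, shape (2, 1)
        z = z + (K * (y - z[0])).ravel()
        P = (np.eye(2) - K @ H) @ P
    return z                             # final [state, parameter] estimate

# Simulated observations from the true system with k = 1.2
rng = np.random.default_rng(0)
t = np.arange(0, 10, 0.1)
x_true = 5.0 * np.exp(-1.2 * t)
ys = x_true + rng.normal(0, 0.05, size=t.size)
x_hat, k_hat = ekf_joint(ys)
```

With informative early observations the parameter estimate converges near the true rate; a small Euler-discretization bias remains, which is one reason real applications use finer integration between measurements.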
Abstract:
This paper reports a comparison of three modeling strategies for the analysis of hospital mortality in a sample of general medicine inpatients at a Department of Veterans Affairs medical center. Logistic regression, a Markov chain model, and longitudinal logistic regression were evaluated on predictive performance, as measured by the c-index, and on the accuracy of expected numbers of deaths compared to those observed. The logistic regression used patient information collected at admission; the Markov model comprised two absorbing states, for discharge and death, and three transient states reflecting increasing severity of illness as measured by laboratory data collected during the hospital stay; the longitudinal regression employed generalized estimating equations (GEE) to model the covariance structure of the repeated binary outcome. Results showed that logistic regression predicted hospital mortality as well as the alternative methods but was limited in its scope of application. The Markov chain provides insight into how day-to-day changes in illness severity lead to discharge or death. The longitudinal logistic regression showed that an increasing illness trajectory is associated with hospital mortality. The conclusion is reached that for standard applications in modeling hospital mortality, logistic regression is adequate, but for the new challenges facing health services research today, the alternative methods are equally predictive, practical, and can provide new insights.
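The c-index used above to compare predictive performance has a simple direct form for binary outcomes: among all (death, survivor) pairs, the fraction in which the model assigns the death the higher predicted risk, with ties counting one half. A small sketch with invented data:

```python
import numpy as np

# Sketch of the c-index (concordance statistic): the probability that a
# randomly chosen death received a higher predicted risk than a randomly
# chosen survivor. The data below are invented for illustration.

def c_index(y, p):
    y, p = np.asarray(y), np.asarray(p)
    cases, controls = p[y == 1], p[y == 0]
    # Compare every (case, control) pair at once
    diff = cases[:, None] - controls[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

y = [1, 1, 0, 0, 0]            # 1 = died in hospital
p = [0.9, 0.4, 0.5, 0.2, 0.1]  # predicted mortality risks
c = c_index(y, p)              # 5 of 6 pairs are concordant
```

A value of 0.5 means the model discriminates no better than chance; 1.0 means perfect separation of deaths from survivors.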
Abstract:
Random Forests™ is reported to be one of the most accurate classification algorithms in complex data analysis. It shows excellent performance even when most predictors are noisy and the number of variables is much larger than the number of observations. In this thesis, Random Forests was applied to a large-scale lung cancer case-control study. A novel way of automatically selecting prognostic factors was proposed, and synthetic positive controls were used to validate the Random Forests method. Throughout this study we showed that Random Forests can handle a large number of weak input variables without overfitting and can account for non-additive interactions between these input variables. Random Forests can also be used for variable selection without being adversely affected by collinearities.

Random Forests can handle large-scale data sets without rigorous data preprocessing and has a robust variable-importance ranking measure. We propose a novel variable selection method in the context of Random Forests that uses the data noise level as the cut-off value to determine the subset of important predictors. This new approach enhances the ability of the Random Forests algorithm to automatically identify important predictors in complex data. The cut-off value can also be adjusted based on the results of the synthetic positive control experiments.

When the data set had a high variables-to-observations ratio, Random Forests complemented the established logistic regression. This study suggests that Random Forests is recommended for such high-dimensional data: one can use Random Forests to select the important variables and then use logistic regression, or Random Forests itself, to estimate the effect sizes of the predictors and to classify new observations.

We also found that the mean decrease in accuracy is a more reliable variable-ranking measure than the mean decrease in Gini.
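The "mean decrease in accuracy" measure favored above is permutation importance: shuffle one predictor at a time and record how much predictive accuracy drops. This sketch illustrates the idea with a trivial hand-rolled threshold classifier standing in for a trained forest (synthetic data, one informative feature and one pure-noise feature); the mechanics are the same when the model is a Random Forest.

```python
import numpy as np

# Sketch of the 'mean decrease in accuracy' variable-importance idea:
# permute one predictor at a time and measure the resulting accuracy drop.
# A toy threshold classifier stands in for the forest; feature 0 is
# informative, feature 1 is noise.

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 2))
y = (X[:, 0] > 0).astype(int)            # outcome depends only on feature 0

def predict(X):
    return (X[:, 0] > 0).astype(int)     # stand-in for a trained model

def permutation_importance(X, y, n_rep=20):
    base = np.mean(predict(X) == y)
    drops = []
    for j in range(X.shape[1]):
        acc = []
        for _ in range(n_rep):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break feature j's link to y
            acc.append(np.mean(predict(Xp) == y))
        drops.append(base - np.mean(acc))
    return drops

imp = permutation_importance(X, y)
# The informative feature shows a large accuracy drop; the noise feature none.
```

Because the accuracy drop is measured on the model's actual predictions, this measure is less distorted by cardinality and scale differences among predictors than impurity-based (Gini) importance, consistent with the thesis's conclusion.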
Abstract:
With the recognition of the importance of evidence-based medicine, there is an emerging need for methods to systematically synthesize available data. Specifically, methods that provide accurate estimates of test characteristics for diagnostic tests are needed to help physicians make better clinical decisions. To provide more flexible approaches for meta-analysis of diagnostic tests, we developed three Bayesian generalized linear models. Two of these models, a bivariate normal and a binomial model, analyze pairs of sensitivity and specificity values while incorporating the correlation between the two outcome variables. Noninformative independent uniform priors were used for the variances of sensitivity and specificity and for the correlation; we also applied an inverse Wishart prior to check the sensitivity of the results. The third model is a multinomial model in which the test results are modeled as multinomial random variables. All three models can include specific imaging techniques as covariates in order to compare performance, with vague normal priors assigned to the coefficients of the covariates. The computations were carried out using the BUGS ('Bayesian inference Using Gibbs Sampling') implementation of Markov chain Monte Carlo techniques. We investigated the properties of the three proposed models through extensive simulation studies, and applied them to a previously published meta-analysis dataset on cervical cancer as well as to an unpublished melanoma dataset. In general, our findings show that the point estimates of sensitivity and specificity were consistent between the Bayesian and frequentist bivariate normal and binomial models. However, in the simulation studies, the estimates of the correlation coefficient from the Bayesian bivariate models were not as good as those obtained from frequentist estimation, regardless of which prior distribution was used for the covariance matrix. The Bayesian multinomial model consistently underestimated sensitivity and specificity regardless of sample size and correlation coefficient. In conclusion, the Bayesian bivariate binomial model provides the most flexible framework for future applications because of the following strengths: (1) it facilitates direct comparison between different tests; (2) it captures the variability in both sensitivity and specificity simultaneously, as well as the intercorrelation between the two; and (3) it can be applied directly to sparse data without ad hoc correction.
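As a point of reference for the models above, the simplest frequentist summary of diagnostic accuracy across studies pools study-level sensitivities on the logit scale by inverse-variance weighting; the bivariate models discussed in the abstract generalize this by modeling sensitivity and specificity jointly with their correlation. This is only a sketch of the univariate pooling step, with invented counts, not the authors' Bayesian models.

```python
import numpy as np

# Sketch (not the authors' Bayesian model): inverse-variance pooling of
# study sensitivities on the logit scale, the kind of univariate summary
# that bivariate meta-analysis models generalize. tp = true positives,
# n = diseased subjects per study; the counts are invented.

tp = np.array([45, 30, 80, 22])
n  = np.array([50, 40, 90, 30])
logit = np.log((tp + 0.5) / (n - tp + 0.5))      # continuity-corrected logit
var   = 1.0 / (tp + 0.5) + 1.0 / (n - tp + 0.5)  # approximate logit variance
w = 1.0 / var                                    # inverse-variance weights
pooled_logit = np.sum(w * logit) / np.sum(w)
pooled_sens = 1.0 / (1.0 + np.exp(-pooled_logit))
```

Pooling sensitivity and specificity separately like this ignores their negative correlation across threshold choices, which is exactly the limitation the bivariate models address.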
Abstract:
Objective. The study reviewed one year of Texas hospital discharge data and Trauma Registry data for the 22 trauma service regions in Texas to identify regional variations in capacity, process of care, and clinical outcomes for trauma patients, and to analyze the statistical associations among capacity, process of care, and outcomes.

Methods. A cross-sectional study design covering one year of statewide Texas data was used. Indicators of trauma capacity, trauma care processes, and clinical outcomes were defined, and data were collected on each indicator. Descriptive analyses were conducted of regional variations in trauma capacity, process of care, and clinical outcomes at all trauma centers, at Level I and II trauma centers, and at Level III and IV trauma centers. Multilevel regression models were used to test the relations among trauma capacity, process of care, and outcome measures at all trauma centers, at Level I and II trauma centers, and at Level III and IV trauma centers, while controlling for confounders such as age, gender, race/ethnicity, injury severity, level of trauma center, and urbanization.

Results. Significant regional variation was found among the 22 trauma service regions across Texas in trauma capacity, process of care, and clinical outcomes. The regional trauma bed rate (the average number of staffed beds per 100,000 population) varied significantly by trauma service region. Pre-hospital trauma care processes (EMS time, transfer time, and triage) also varied significantly by region, as did clinical outcomes including mortality, hospital and intensive care unit length of stay, and hospital charges. In multilevel regression analysis, the average trauma bed rate was significantly related to trauma care processes, including ambulance delivery time, transfer time, and triage, after controlling for age, gender, race/ethnicity, injury severity, level of trauma center, and urbanization at all trauma centers. Among processes of care, only transfer time was significantly associated with the regional average trauma bed rate at Level III and IV trauma centers. Among outcome measures, only trauma mortality was significantly associated with the regional average trauma bed rate at all trauma centers, and only hospital charges were statistically related to the trauma bed rate at Level I and II trauma centers. The effects of confounders such as age, gender, race/ethnicity, injury severity, and urbanization on processes and outcomes varied significantly by level of trauma center.

Conclusions. Regional variation in trauma capacity, process, and outcomes in Texas was extensive. Trauma capacity, age, gender, race/ethnicity, injury severity, level of trauma center, and urbanization were significantly associated with trauma process and clinical outcomes, depending on the level of trauma center.

Key words: regionalized trauma systems, trauma capacity, pre-hospital trauma care, process, trauma outcomes, trauma performance, evaluation measures, regional variations
Abstract:
Although the area under the receiver operating characteristic curve (AUC) is the most popular measure of the performance of prediction models, it has limitations, especially when it is used to evaluate the added discrimination of a new biomarker in the model. Pencina et al. (2008) proposed two indices, the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI), to supplement the improvement in the AUC (IAUC). Their NRI and IDI are based on binary outcomes in case-control settings and do not involve time-to-event outcomes. However, many disease outcomes are time-dependent, and the onset time can be censored; measuring the discrimination potential of a prognostic marker without considering time to event can lead to biased estimates. In this dissertation, we extend the NRI and IDI to survival analysis settings and derive the corresponding sample estimators and asymptotic tests. Simulation studies were conducted to compare the performance of the time-dependent NRI and IDI with Pencina's NRI and IDI. For illustration, we applied the proposed method to a breast cancer study.

Key words: prognostic model, discrimination, time-dependent NRI and IDI
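For reference, the original binary-outcome NRI and IDI that the dissertation extends can be computed in a few lines. The NRI here uses a single risk cutoff (a two-category reclassification); the invented probabilities illustrate adding a biomarker to a baseline model.

```python
import numpy as np

# Sketch of Pencina's original (binary-outcome) NRI and IDI. p_old and
# p_new are predicted risks from models without and with the new
# biomarker; the data below are invented.

def nri_idi(y, p_old, p_new, cut=0.5):
    y, p_old, p_new = map(np.asarray, (y, p_old, p_new))
    ev, ne = y == 1, y == 0
    # Single-cutoff NRI: net upward reclassification among events plus
    # net downward reclassification among non-events
    up_c   = (p_new >= cut) & (p_old < cut)
    down_c = (p_new < cut) & (p_old >= cut)
    nri = (np.mean(up_c[ev]) - np.mean(down_c[ev])) + \
          (np.mean(down_c[ne]) - np.mean(up_c[ne]))
    # IDI: change in the mean-risk gap between events and non-events
    idi = (np.mean(p_new[ev]) - np.mean(p_new[ne])) - \
          (np.mean(p_old[ev]) - np.mean(p_old[ne]))
    return nri, idi

y     = [1, 1, 1, 0, 0, 0]
p_old = [0.45, 0.6, 0.3, 0.55, 0.2, 0.4]
p_new = [0.7, 0.8, 0.35, 0.45, 0.1, 0.3]
nri, idi = nri_idi(y, p_old, p_new)
```

The time-dependent versions replace the event/non-event indicator with the (possibly censored) event status by a given time, which is what the dissertation's estimators handle.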
Abstract:
The sensitivity of interferon-γ release assays for the detection of Mycobacterium tuberculosis (MTB) infection or disease is affected by conditions that depress host immunity (such as HIV). It is critical to determine whether these assays are affected by diabetes and related conditions (i.e. hyperglycemia, chronic hyperglycemia, or being overweight/obese), given that immune impairment is thought to underlie susceptibility to tuberculosis (TB) in people with diabetes. This is important for tuberculosis control because of the millions of type 2 diabetes patients at risk for tuberculosis worldwide.

The objective of this study was to identify host characteristics, including diabetes, that may affect the sensitivity of two commercially available interferon-γ (IFN-γ) release assays (IGRAs), the QuantiFERON®-TB Gold (QFT-G) and the T-SPOT®.TB, in active TB patients. We further explored whether IFN-γ secretion in response to MTB antigens (ESAT-6 and CFP-10) is associated with diabetes and its defining characteristics (high blood glucose, high HbA1c, high BMI). To achieve these objectives, the sensitivity of the QFT-G and T-SPOT.TB assays was evaluated in newly diagnosed adults with confirmed tuberculosis (positive smear for acid-fast bacilli and/or positive culture for MTB) enrolled at Texas and Mexico study sites between March 2006 and April 2009. Univariate and multivariate models were constructed to identify host characteristics associated with IGRA result and level of IFN-γ secretion.

QFT-G was positive in 68% of tuberculosis patients. Those with diabetes, chronic hyperglycemia, or obesity were more likely to have a positive QFT-G result and to secrete higher levels of IFN-γ in response to the mycobacterial antigens (p<0.05). Previous BCG vaccination was the only other host characteristic associated with QFT-G result, with a higher proportion of non-BCG-vaccinated persons being QFT-G positive compared to vaccinated persons. In a separate group of patients, the T-SPOT.TB was 94% sensitive, with similar performance in all tuberculosis patients regardless of host characteristics.

In summary, we have demonstrated the validity of QFT-G and T-SPOT.TB to support the diagnosis of TB in patients with a range of host characteristics, most notably in patients with diabetes. We also confirmed that TB patients with diabetes and its associated characteristics (chronic hyperglycemia or high BMI) secreted higher titers of IFN-γ when stimulated with MTB-specific antigens, in comparison to patients without these characteristics. Together, these findings suggest that the mechanism by which diabetes increases the risk of TB may not be explained by an inability to secrete IFN-γ, a key cytokine for TB control.
Abstract:
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples from 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, Cp and Sp, each combined with an 'all possible subsets' or 'forward selection' search over variables. The estimators of performance include parametric (MSEPm) and non-parametric (PRESS) assessments in the entire sample, and two data-splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half-sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.

The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using forward selection result in slightly smaller prediction errors and less biased estimators of predictive accuracy than all-possible-subsets approaches, but no differences are detected between the performances of Cp and Sp. In every case, the prediction errors of models obtained by subset selection in either of the half-splits exceed those obtained using all predictors and the entire sample.

Only the random-split estimator is conditionally (on β) unbiased; however, MSEPm is unbiased on average and PRESS is nearly so in unselected (fixed-form) models. When subset selection techniques are used, MSEPm and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random-split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value in the context of stochastic regressor variables.

To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development and that a leave-one-out statistic (e.g. PRESS) be used for assessment.
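The leave-one-out PRESS statistic recommended above does not require refitting the model n times: for least squares, the deleted residual equals the ordinary residual divided by (1 - h_ii), where h_ii is the i-th diagonal of the hat matrix. A sketch on simulated data, with a brute-force leave-one-out check:

```python
import numpy as np

# Sketch of the PRESS statistic: leave-one-out squared prediction error
# computed in one pass from the hat matrix, since the deleted residual is
# e_i / (1 - h_ii) for ordinary least squares. Data are simulated.

rng = np.random.default_rng(2)
n, p = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta + rng.normal(0, 0.3, size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T          # hat matrix
e = y - H @ y                                 # ordinary residuals
press = np.sum((e / (1 - np.diag(H))) ** 2)   # PRESS in one pass

# Brute-force leave-one-out check of the shortcut
loo = 0.0
for i in range(n):
    mask = np.arange(n) != i
    b = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
    loo += (y[i] - X[i] @ b) ** 2
assert np.isclose(press, loo)
```

Because each observation is predicted from a model that never saw it, PRESS avoids the optimism of in-sample error estimates while using the entire sample for model development, which is exactly the recommendation above.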
Abstract:
Much of the current healthcare financial literature addresses the concern of government officials, the public, and healthcare providers regarding the need to control health care costs. The literature suggests that the attitudes of hospital department managers toward their role in financial management affect their ability to achieve favorable financial results.

The dissertation had several objectives: (1) to identify whether a relationship exists between the attitude/role perception of hospital managers and the financial performance of their departments; (2) to compile a descriptive survey database of key factors identified in the financial literature from individual hospitals; (3) to compile a brief descriptive survey of hospital managers' financial management background and training (both formal and informal); (4) to conduct an attitude assessment/role perception survey regarding the importance or relevance of a suggested financial management role set (i.e., issues discussed in the current literature) as viewed by the selected hospital managers and their matched administrators; and (5) to propose plausible theoretical models and statistical tests of seven proposed hypotheses.

For the sample population, the statistical results of a variety of methods generally suggested that the null hypothesis should not be rejected concerning the relationships between a department manager's financial attitudes and role perceptions and the resulting financial performance. The fact that this study did not find a significant relationship between role perception and financial performance does not necessarily indicate that the theories supporting such a relationship in the literature are false, nor that such a relationship does not exist. Several alternative theories were postulated to explain the apparent lack of a statistical relationship, and suggestions for refinement and/or improvement of further research were discussed.
Abstract:
Complex diseases, such as cancer, are caused by various genetic and environmental factors and their interactions. Joint analysis of these factors and their interactions would increase the power to detect risk factors but is statistically challenging. Bayesian generalized linear modeling with Student-t prior distributions on the coefficients is a novel method for simultaneously analyzing genetic factors, environmental factors, and their interactions. I performed simulation studies using three different disease models and demonstrated that the variable selection performance of Bayesian generalized linear models is comparable to that of Bayesian stochastic search variable selection, an improved method for variable selection compared to standard methods. I further evaluated the variable selection performance of Bayesian generalized linear models using different numbers of candidate covariates and different sample sizes, and provide a guideline for the sample size required to achieve high variable selection power with Bayesian generalized linear models across different numbers of candidate covariates.

Polymorphisms in folate metabolism genes and nutritional factors have previously been associated with lung cancer risk. In this study, I simultaneously analyzed 115 tag SNPs in folate metabolism genes, 14 nutritional factors, and all possible gene-nutrient interactions from 1239 lung cancer cases and 1692 controls using Bayesian generalized linear models stratified by never, former, and current smoking status. SNPs in MTRR were significantly associated with lung cancer risk across never, former, and current smokers. In never smokers, three SNPs in TYMS and three gene-nutrient interactions (between SHMT1 and vitamin B12, between MTRR and total fat intake, and between MTR and alcohol use) were also identified as associated with lung cancer risk. These lung cancer risk factors are worthy of further investigation.
Abstract:
In light of the new healthcare regulations, hospitals are increasingly reevaluating their IT integration strategies to meet expanded healthcare information exchange requirements. Nevertheless, hospital executives do not have all the information they need to differentiate between the available strategies and recognize what may better fit their organizational needs.

To provide that information, this study explored the relationships between hospital financial performance, integration strategy selection, and strategy change. The integration strategies examined, applied as binary logistic regression dependent variables and ordered from most to least integrated, were Single-Vendor (SV), Best-of-Suite (BoS), and Best-of-Breed (BoB). The financial measurements adopted as independent variables for the models were two administrative labor efficiency measures and six industry-standard financial ratios, designed to provide a broad proxy of hospital financial performance. In addition, descriptive statistical analyses were carried out to evaluate recent trends in hospital integration strategy change. Overall, six research questions were proposed. The first asked whether financial performance was related to the selection of integration strategies. The next questions explored whether hospitals were more likely to change strategies or remain the same when there was no external stimulus to change, and whether, if they did change, they preferred strategies closer to the existing ones. These were followed by a question asking whether financial performance was also related to strategy change. The last two questions probed whether the new Health Information Technology for Economic and Clinical Health (HITECH) Act had any impact on the frequency and direction of strategy change.

The results confirmed that financial performance is related to both IT integration strategy selection and strategy change, and concurred with prior studies suggesting that hospital and environmental characteristics are associated factors as well. Specifically, this study noted that the most integrated SV strategy is related to increased administrative labor efficiency, and the hybrid BoS strategy is associated with improved financial health (based on operating margin and equity financing ratios). On the other hand, no financial indicators were found to be related to the least integrated BoB strategy, except for short-term liquidity (current ratio) when strategy change was involved.

Ultimately, this study concluded that when making IT integration strategy decisions, hospitals closely follow the resource dependence view of minimizing uncertainty. As each integration strategy may favor certain organizational characteristics, hospitals have traditionally preferred not to make strategy changes, and when they did, they selected strategies more closely related to the existing ones. However, as new regulations further heighten revenue uncertainty while requiring increased information integration, and as evidence already suggests a growing shift toward more integrated strategies, hospitals may find their strategy selection choices more limited moving forward.
Abstract:
Objective. In 2009, the International Expert Committee recommended the use of the HbA1c test for the diagnosis of diabetes. Although recommended for diagnosis, its precise test performance among Mexican Americans is uncertain. A strong 'gold standard' would rely on repeated blood glucose measurements on different days, which is the recommended method for diagnosing diabetes in clinical practice. Our objective was to assess the test performance of HbA1c in detecting diabetes and pre-diabetes against repeated fasting blood glucose measurement in the Mexican American population living on the United States-Mexico border. Moreover, we sought a specific and precise HbA1c threshold value for diabetes mellitus (DM) and pre-diabetes in this high-risk population, which might assist in better diagnosis and better management of diabetes.

Research design and methods. We used the CCHC dataset for our study. In 2004, the Cameron County Hispanic Cohort (CCHC), now numbering 2,574, was established, drawn from randomly selected households on the basis of 2000 Census tract data. The CCHC study randomly selected a subset of people (aged 18-64 years) in CCHC cohort households to determine the influence of SES on diabetes and obesity. Among the participants in Cohort-2000, 67.15% are female; all are Hispanic. Individuals were defined as having diabetes mellitus (fasting plasma glucose [FPG] ≥ 126 mg/dL) or pre-diabetes (100 ≤ FPG < 126 mg/dL). HbA1c test performance was evaluated using receiver operating characteristic (ROC) curves. Moreover, change-point models were used to determine HbA1c thresholds compatible with the FPG thresholds for diabetes and pre-diabetes.

Results. When FPG was used to detect diabetes, the sensitivity and specificity of HbA1c ≥ 6.5% were 75% and 87%, respectively (area under the curve 0.895). When FPG was used to detect pre-diabetes, the sensitivity and specificity of HbA1c ≥ 6.0% (the ADA-recommended threshold) were 18% and 90%, respectively, and those of HbA1c ≥ 5.7% (the International Expert Committee-recommended threshold) were 31% and 78%, respectively. ROC analyses suggest HbA1c is a sound predictor of diabetes mellitus (area under the curve 0.895) but a poorer predictor of pre-diabetes (area under the curve 0.632).

Conclusions. Our data support the current recommendations for the use of HbA1c in the diagnosis of diabetes in the Mexican American population, as it showed reasonable sensitivity, specificity, and accuracy against repeated FPG measures. However, use of HbA1c may be premature for detecting pre-diabetes in this population because of its poor sensitivity relative to FPG. It may be that HbA1c more effectively identifies those at risk of developing diabetes; following these pre-diabetic individuals over the longer term for the detection of incident diabetes may lead to more confirmatory results.
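The sensitivity/specificity and AUC calculations reported above are straightforward to reproduce in miniature. This sketch uses invented HbA1c values and an FPG-defined diabetes status, not the CCHC data; the AUC is computed via the Mann-Whitney (rank) formulation.

```python
import numpy as np

# Sketch of test-performance calculations: sensitivity and specificity of
# an HbA1c cutoff against a gold-standard diabetes label, plus AUC as
# P(marker in a random case > marker in a random control). Invented data.

hba1c    = np.array([5.2, 5.8, 6.1, 6.4, 6.8, 7.2, 5.5, 6.6, 7.9, 5.9])
diabetes = np.array([0,   0,   0,   1,   1,   1,   0,   0,   1,   0  ])

pred = hba1c >= 6.5                       # test positive at the 6.5% cutoff
sens = np.mean(pred[diabetes == 1])       # true-positive rate
spec = np.mean(~pred[diabetes == 0])      # true-negative rate

cases, controls = hba1c[diabetes == 1], hba1c[diabetes == 0]
diff = cases[:, None] - controls[None, :]
auc = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size
```

Lowering the cutoff raises sensitivity at the cost of specificity; the AUC summarizes performance across all possible cutoffs, which is why the abstract reports both per-threshold operating characteristics and the AUC.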
Abstract:
My dissertation focuses on developing methods for detecting gene-gene/environment interactions and imprinting effects for human complex diseases and quantitative traits. It includes three sections: (1) generalizing the coding technique of the Natural and Orthogonal Interaction (NOIA) model to gene-gene (GxG) interaction and to reduced models; (2) developing a novel statistical approach that allows for modeling gene-environment (GxE) interactions influencing disease risk; and (3) developing a statistical approach for modeling genetic variants displaying parent-of-origin effects (POEs), such as imprinting.

In the past decade, genetic researchers have identified a large number of causal variants for human genetic diseases and traits by single-locus analysis, and interaction has become a central topic in the search for the complex networks of multiple genes or environmental exposures contributing to an outcome. Epistasis, also known as gene-gene interaction, is the departure from additive genetic effects of several genes on a trait, meaning that the same alleles of one gene can display different genetic effects under different genetic backgrounds. In this study, we propose to implement the NOIA model for association studies with interaction for human complex traits and diseases. We compare the performance of the new statistical models we developed with the usual functional model in both simulation studies and real data analysis. Both revealed higher power of the NOIA GxG interaction model for detecting both main genetic effects and interaction effects. Through application to a melanoma dataset, we confirmed the previously identified significant regions for melanoma risk at 15q13.1, 16q24.3, and 9p21.3, and identified potential interactions with these regions that contribute to melanoma risk.

Based on the NOIA model, we developed a novel statistical approach that models the effects of a genetic factor and a binary environmental exposure jointly influencing disease risk. Both simulation and real data analyses revealed higher power of the NOIA model for detecting both main genetic effects and interaction effects for both quantitative and binary traits. We also found that estimates of the parameters from logistic regression for binary traits are no longer statistically uncorrelated under the alternative model when there is an association. Applying our approach to a lung cancer dataset, we confirmed four SNPs in the 5p15 and 15q25 regions to be significantly associated with lung cancer risk in the Caucasian population: rs2736100, rs402710, rs16969968, and rs8034191. We also validated that rs16969968 and rs8034191 in the 15q25 region interact significantly with smoking in the Caucasian population, and our approach identified a potential interaction of SNP rs2256543 in 6p21 with smoking in contributing to lung cancer risk.

Genetic imprinting is the best-known cause of parent-of-origin effects (POEs), whereby a gene is differentially expressed depending on the parental origin of the same alleles. Genetic imprinting affects several human disorders, including diabetes, breast cancer, alcoholism, and obesity, and has been shown to be important for normal embryonic development in mammals. Traditional association approaches ignore this important genetic phenomenon. In this study, we propose a NOIA framework for single-locus association studies that estimates both main allelic effects and POEs. We develop statistical (Stat-POE) and functional (Func-POE) models and demonstrate the conditions for orthogonality of the Stat-POE model. We conducted simulations for both quantitative and qualitative traits to evaluate the performance of the statistical and functional models with different levels of POEs. Our results showed that the Stat-POE model, which ensures orthogonality of the variance components if Hardy-Weinberg equilibrium (HWE) holds or the minor and major allele frequencies are equal, had greater power for detecting the main allelic additive effect than the Func-POE model, which codes according to allelic substitutions, for both quantitative and qualitative traits. The power for detecting the POE was the same for the Stat-POE and Func-POE models under HWE for quantitative traits.
Abstract:
Prevalent sampling is an efficient and focused approach to studying the natural history of disease. Right-censored time-to-event data observed in prospective prevalent cohort studies are often subject to left-truncated sampling. Left-truncated samples are not randomly selected from the population of interest and carry a selection bias. Extensive studies have focused on estimating the unbiased distribution from left-truncated samples. In many applications, however, the exact date of disease onset is not observed. For example, in an HIV infection study, the exact time of HIV infection is not observable, although it is known to fall between two observable dates. Meeting these challenges motivated our study. We propose parametric models to estimate the unbiased distribution of left-truncated, right-censored time-to-event data with uncertain onset times. We first consider data from length-biased sampling, a special case of left-truncated sampling, and then extend the proposed method to general left-truncated sampling. With a parametric model, we construct the full likelihood given a biased sample with unobservable onset of disease, and estimate the parameters by maximizing the constructed likelihood while adjusting for the selection bias and the unobservable exact onset. Simulations are conducted to evaluate the finite-sample performance of the proposed methods. We apply the proposed method to an HIV infection study, estimating the unbiased survival function and covariate coefficients.
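The length-biased special case mentioned above has a clean closed form for one toy distribution: if the population event time is exponential with rate λ, the length-biased density is t·λ²·exp(-λt), i.e. Gamma(shape=2, rate=λ), and the maximum-likelihood estimate from a biased sample is 2/mean(t) rather than the naive 1/mean(t). This toy simulation (not the HIV study, and far simpler than the dissertation's likelihood with uncertain onsets) illustrates the bias adjustment:

```python
import numpy as np

# Toy illustration of length-biased sampling: for an exponential(rate lam)
# event time, sampled prevalent cases follow Gamma(shape=2, rate=lam), so
# the MLE from a biased sample is 2 / mean(t). Naively treating the sample
# as unbiased (1 / mean(t)) underestimates the rate by a factor of two.

rng = np.random.default_rng(3)
lam_true = 0.5
# Draw directly from the length-biased distribution Gamma(2, scale=1/lam)
t = rng.gamma(shape=2.0, scale=1.0 / lam_true, size=20000)

lam_naive = 1.0 / np.mean(t)   # ignores the sampling bias -> about lam/2
lam_mle   = 2.0 / np.mean(t)   # adjusts for length bias -> about lam
```

The factor of two comes from the likelihood: maximizing Σ[2·log λ - λ·tᵢ] gives λ̂ = 2/mean(t). General left truncation and interval-censored onset require the full constructed likelihood described in the abstract.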
Abstract:
The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The factors varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit.

The study found that the Ĉ statistic was adequate in tests of significance for most situations. However, when testing data that deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping the estimated probabilities into from 8 to 30 quantiles was studied, the deciles-of-risk approach was generally sufficient; subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons suggesting otherwise. Because it does not follow a χ² distribution in that setting, the statistic is not recommended for models containing only categorical variables with a limited number of covariate patterns.

The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations. Careful examination of the parameter estimates is also essential, since the statistic did not perform as desired when there was moderate to severe collinearity among covariates.

Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches that create equal-sized groups by separating ties should be avoided.
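For reference, the deciles-of-risk form of the Hosmer-Lemeshow statistic studied above compares observed and expected event counts across g groups of fitted probabilities, C = Σ (O_g - E_g)² / [E_g(1 - E_g/n_g)], referred to a χ² distribution with g - 2 degrees of freedom. A sketch on simulated data where the fitted probabilities are correct (so a small statistic is expected):

```python
import numpy as np

# Sketch of the Hosmer-Lemeshow goodness-of-fit statistic with the
# standard deciles-of-risk grouping. Outcomes are generated from the
# fitted probabilities themselves, so the model fits by construction.

rng = np.random.default_rng(4)
n = 500
p = rng.uniform(0.05, 0.95, size=n)       # fitted probabilities
y = rng.binomial(1, p)                    # outcomes drawn from the model

def hosmer_lemeshow(y, p, g=10):
    order = np.argsort(p)
    groups = np.array_split(order, g)     # deciles of risk
    stat = 0.0
    for idx in groups:
        o = y[idx].sum()                  # observed events in the group
        e = p[idx].sum()                  # expected events in the group
        m = len(idx)
        stat += (o - e) ** 2 / (e * (1 - e / m))
    return stat                           # ~ chi-squared(g - 2) under good fit

C = hosmer_lemeshow(y, p)
```

Note the grouping here never splits tied probabilities across groups beyond what `array_split` forces by position, consistent with the study's warning against creating equal-sized groups by separating ties.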