6 resultados para PREDICTIVE PERFORMANCE

em DigitalCommons@The Texas Medical Center


Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper reports a comparison of three modeling strategies for the analysis of hospital mortality in a sample of general medicine inpatients in a Department of Veterans Affairs medical center. Logistic regression, a Markov chain model, and longitudinal logistic regression were evaluated on predictive performance as measured by the c-index and on accuracy of expected numbers of deaths compared to observed. The logistic regression used patient information collected at admission; the Markov model was comprised of two absorbing states for discharge and death and three transient states reflecting increasing severity of illness as measured by laboratory data collected during the hospital stay; longitudinal regression employed Generalized Estimating Equations (GEE) to model covariance structure for the repeated binary outcome. Results showed that the logistic regression predicted hospital mortality as well as the alternative methods but was limited in scope of application. The Markov chain provides insights into how day to day changes of illness severity lead to discharge or death. The longitudinal logistic regression showed that increasing illness trajectory is associated with hospital mortality. The conclusion is reached that for standard applications in modeling hospital mortality, logistic regression is adequate, but for new challenges facing health services research today, alternative methods are equally predictive, practical, and can provide new insights. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, C$\sb{\rm p}$ and S$\sb{\rm p}$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP$\sb{\rm m}$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.^ The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches but no differences are detected between the performances of C$\sb{\rm p}$ and S$\sb{\rm p}$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.^ Only the random split estimator is conditionally (on $\\beta$) unbiased, however MSEP$\sb{\rm m}$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP$\sb{\rm m}$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.^ To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

OBJECTIVE: We sought to evaluate the performance of the human papillomavirus high-risk DNA test in patients 30 years and older. MATERIALS AND METHODS: Screening (n=835) and diagnosis (n=518) groups were defined based on prior Papanicolaou smear results as part of a clinical trial for cervical cancer detection. We compared the Hybrid Capture II (HCII) test result with the worst histologic report. We used cervical intraepithelial neoplasia (CIN) 2/3 or worse as the reference of disease. We calculated sensitivities, specificities, positive and negative likelihood ratios (LR+ and LR-), receiver operating characteristic (ROC) curves, and areas under the ROC curves for the HCII test. We also considered alternative strategies, including Papanicolaou smear, a combination of Papanicolaou smear and the HCII test, a sequence of Papanicolaou smear followed by the HCII test, and a sequence of the HCII test followed by Papanicolaou smear. RESULTS: For the screening group, the sensitivity was 0.69 and the specificity was 0.93; the area under the ROC curve was 0.81. The LR+ and LR- were 10.24 and 0.34, respectively. For the diagnosis group, the sensitivity was 0.88 and the specificity was 0.78; the area under the ROC curve was 0.83. The LR+ and LR- were 4.06 and 0.14, respectively. Sequential testing showed little or no improvement over the combination testing. CONCLUSIONS: The HCII test in the screening group had a greater LR+ for the detection of CIN 2/3 or worse. HCII testing may be an additional screening tool for cervical cancer in women 30 years and older.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The research project is an extension of a series of administrative science and health care research projects evaluating the influence of external context, organizational strategy, and organizational structure upon organizational success or performance. The research will rely on the assumption that there is not one single best approach to the management of organizations (the contingency theory). As organizational effectiveness is dependent on an appropriate mix of factors, organizations may be equally effective based on differing combinations of factors. The external context of the organization is expected to influence internal organizational strategy and structure and in turn the internal measures affect performance (discriminant theory). The research considers the relationship of external context and organization performance.^ The unit of study for the research will be the health maintenance organization (HMO); an organization the accepts in exchange for a fixed, advance capitation payment, contractual responsibility to assure the delivery of a stated range of health sevices to a voluntary enrolled population. With the current Federal resurgence of interest in the Health Maintenance Organization (HMO) as a major component in the health care system, attention must be directed at maximizing development of HMOs from the limited resources available. Increased skills are needed in both Federal and private evaluation of HMO feasibility in order to prevent resource investment and in projects that will fail while concurrently identifying potentially successful projects that will not be considered using current standards.^ The research considers 192 factors measuring contextual milieu (social, educational, economic, legal, demographic, health and technological factors). Through intercorrelation and principle components data reduction techniques this was reduced to 12 variables. Two measures of HMO performance were identified, they are (1) HMO status (operational or defunct), and (2) a principle components factor score considering eight measures of performance. The relationship between HMO context and performance was analysed using correlation and stepwise multiple regression methods. In each case it has been concluded that the external contextual variables are not predictive of success or failure of study Health Maintenance Organizations. This suggests that performance of an HMO may rely on internal organizational factors. These findings have policy implications as contextual measures are used as a major determinant in HMO feasibility analysis, and as a factor in the allocation of limited Federal funds. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU" lays out the theoretical background for the project. There are several core concepts presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrests, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model, and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is in defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data is represented by the standard one value per variable paradigm and is widely employed in a host of clinical models and tools. These are often represented by a number present in a given cell of a table. Clinical latent features derived, rather than directly measured, data elements that more accurately represent a particular clinical phenomenon than any of the directly measured data elements in isolation. The second two classes are unique to the time series data elements. The first of these is the raw data elements. These are represented by multiple values per variable, and constitute the measured observations that are typically available to end users when they review time series data. These are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend that a variety of analyses be performed in order to maximize the likelihood that a representation of the time series data elements is produced that is able to distinguish between two or more classes of outcomes. The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU" provides a detailed description, start to finish, of the methods required to prepare the data, build, and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time series based models are unfeasible due to the relatively large number of data elements and the complexity of preprocessing that must occur before data can be presented to the model. Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies of each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances. The final manuscript, entitled: "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit" presents the results that were obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that utilizing the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis. In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy as compared to the baseline multivariate model, but diminished classification accuracy as compared to when just the trend analysis features were added (ie, without adding the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve the performance beyond that which was achieved by exclusion of the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Objective::Describe and understand regional differences and associated multilevel factors (patient, provider and regional) to inappropriate utilization of advance imaging tests in the privately insured population of Texas. Methods: We analyzed Blue Cross Blue Shield of Texas claims dataset to study the advance imaging utilization during 2008-2010 in the PPO/PPO+ plans. We used three of CMS "Hospital Outpatient Quality Reporting" imaging efficiency measures. These included ordering MRI for low back pain without prior conservative management (OP-8) and utilization of combined with and without contrast abdominal CT (OP-10) and thorax CT (OP-11). Means and variation by hospital referral regions (HRR) in Texas were measured and a multilevel logistic regression for being a provider with high values for any the three OP measures was used in the analysis. We also analyzed OP-8 at the individual level. A multilevel logistic regression was used to identify predictive factors for having an inappropriate MRI for low back pain. Results: Mean OP-8 for Texas providers was 37.89%, OP-10 was 29.94% and OP-11 was 9.24%. Variation was higher for CT measure. And certain HRRs were consistently above the mean. Hospital providers had higher odds of high OP-8 values (OP-8: OR, 1.34; CI, 1.12-1.60) but had smaller odds of having high OP-10 and OP-11 values (OP-10: OR, 0.15; CI, 0.12-0.18; OP-11: OR, 0.43; CI, 0.34-0.53). Providers with the highest volume of imaging studies performed, were less likely to have high OP-8 measures (OP-8: OR, 0.58; CI, 0.48-0.70) but more likely to perform combined thoracic CT scans (OP-11: OR, 1.62; CI, 1.34-1.95). Males had higher odds of inappropriate MRI (OR, 1.21; CI, 1.16-1.26). Pattern of care in the six months prior to the MRI event was significantly associated with having an inappropriate MRI. Conclusion::We identified a significant variation in advance imaging utilization across Texas. Type of facility was associated with measure performance, but the associations differ according to the type of study. Last, certain individual characteristics such as gender, age and pattern of care were found to be predictors of inappropriate MRIs.^