5 results for Future value prediction

in DigitalCommons@The Texas Medical Center


Relevance:

30.00%

Publisher:

Abstract:

Brain tumors are among the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of patients surviving more than 5 years after diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. To evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e., short- vs. long-term survival), and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time to death directly to the gene expression profile. We propose a Bayesian ensemble model for survival prediction that is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy by using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting. Also, our proposed models provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares the results obtained by the Random Forest and Bayesian ensemble methods from the biological and clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.

Relevance:

30.00%

Publisher:

Abstract:

Objectives. Cardiovascular disease (CVD), including CVD secondary to type II diabetes, is a significant health problem among Mexican American populations and originates in early childhood. This study seeks to determine risk factors available to the health practitioner that can identify the child at potential risk of developing CVD, thereby enabling early intervention. Design. This is a secondary analysis of cross-sectional data on matched Mexican American parents and children selected from the HHANES, 1982–1984. Methods. Parents at high risk for CVD were identified based on medical history and on clinical and physical findings. Factor analysis was performed on children's skinfold thicknesses, height, weight, and systolic and diastolic blood pressures in order to produce a limited number of uncorrelated child CVD risk factors. Multiple regression analyses were then performed to determine other CVD markers associated with these factors, independently for mothers and fathers. Results. Factor analysis of the children's measurements revealed three uncorrelated latent variables summarizing the children's CVD risk: Factor 1, ‘Fatness’; Factor 2, ‘Size and Maturity’; and Factor 3, ‘Blood Pressure’, together accounting for the bulk of the variation in the children's measurements (86–89%). Univariate analyses showed that children from high CVD risk families did not differ from children of low risk families in occurrence of high blood pressure, overweight, biological maturity, acculturation score, or social and economic indicators. However, multiple regression using the factor scores (from the factor analysis) as dependent variables revealed that higher CVD risk in parents was significantly associated with increased fatness and increased blood pressure in the children. A father's CVD risk status was associated with higher levels of body fat in his children and higher levels of blood pressure in his sons. A mother's CVD risk status was associated with higher blood pressure levels in her children, and obesity in the mother was associated with higher fatness levels in her children. Conclusion. The occurrence of cardiovascular disease and its risk factors in parents of Mexican American children may be used to identify children at potentially higher risk of developing CVD in the future. Obesity in mothers appears to be an important marker for the development of higher levels of body fatness in children.

Relevance:

30.00%

Publisher:

Abstract:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, $C_p$ and $S_p$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric ($\mathrm{MSEP}_m$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures. The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of $C_p$ and $S_p$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample. Only the random split estimator is conditionally (on $\beta$) unbiased; however, $\mathrm{MSEP}_m$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, $\mathrm{MSEP}_m$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables. To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and that a leave-one-out statistic (e.g., PRESS) be used for assessment.
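As an illustration of the leave-one-out statistic recommended above, the following is a minimal sketch of PRESS for ordinary least squares, simplified to a single predictor (the abstract concerns multiple regression). The function names and the data are hypothetical and are not taken from the study. The sketch computes PRESS two ways: from one fit, using the standard identity that the leave-one-out residual equals e_i / (1 - h_ii), and by explicitly refitting with each observation removed.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def press_shortcut(xs, ys):
    """PRESS from a single fit, using leave-one-out residual e_i / (1 - h_ii)."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a, b = fit_line(xs, ys)
    total = 0.0
    for x, y in zip(xs, ys):
        # Leverage h_ii of observation i in simple linear regression.
        leverage = 1.0 / n + (x - x_bar) ** 2 / sxx
        residual = y - (a + b * x)
        total += (residual / (1.0 - leverage)) ** 2
    return total


def press_refit(xs, ys):
    """PRESS by refitting the model n times, each time leaving one point out."""
    total = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total


# Made-up illustrative data (an assumption, not from the study).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]

if __name__ == "__main__":
    print(press_shortcut(xs, ys), press_refit(xs, ys))
```

The two functions should agree up to floating-point error; the single-fit form is what makes PRESS cheap enough to use on the entire sample instead of sacrificing half of it to a validation split.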

Relevance:

30.00%

Publisher:

Abstract:

In the current climate of escalating health care costs, defining value and accurately measuring it are two critical issues affecting not only the future of cancer care in particular but also the future of health care in general. Specifically, measuring and improving value in cancer-related health care are critical for continued advancements in research, management, and overall delivery of care. However, in oncology, most of this research has focused on value as it relates to the insurance industry and payment reform, with little attention paid to value as the output of clinical interventions that encompass integrated clinical teams focusing on the entire cycle of care and measuring the objective outcomes that are most relevant to patients. In this study, patient-centered value was defined as health outcomes achieved per dollar spent and was calculated using objective functional outcomes and total care costs. The analytic sample comprised patients diagnosed with three common head and neck cancers (cancer of the larynx, oral cavity, and oropharynx) who were treated in an integrated tertiary care center over an approximately 10-year period. The results of this study provide initial empirical data that can be used to assess and ultimately help improve the quality and value of head and neck cancer care. More importantly, they can be used by patients and clinicians to make better-informed decisions about care, particularly about which therapeutic services and outcomes matter most to patients.

Relevance:

30.00%

Publisher:

Abstract:

Head and Neck Squamous Cell Carcinoma (HNSCC) is the sixth most common malignancy in the world, with high rates of second primary malignancy (SPM) and moderately low survival rates. This disease has become an enormous challenge in cancer research and treatment. For HNSCC patients, a highly significant cause of post-treatment mortality and morbidity is the development of SPM. Hence, predicting the risk of developing SPM would help patients, clinicians, and policy makers estimate the survival of patients with HNSCC. In this study, we built a prognostic model to predict the risk of developing SPM in patients with newly diagnosed HNSCC. The dataset used in this research was obtained from The University of Texas MD Anderson Cancer Center. For the first aim, we used stepwise logistic regression to identify the prognostic factors for the development of SPM. Our final model contained cancer site and overall cancer stage as risk factors for SPM. The Hosmer-Lemeshow test (p-value = 0.15 > 0.05) showed no evidence of lack of fit for the final prognostic model. The area under the ROC curve was 0.72, which suggested that the discrimination ability of our model was acceptable. The internal validation confirmed that the prognostic model was a good fit and that it would not overoptimistically predict the risk of SPM. This model needs external validation using a large sample before it can be generalized to predict SPM risk for other HNSCC patients. For the second aim, we utilized a multistate survival analysis approach to estimate the probability of death for HNSCC patients, taking into account the possibility of SPM. Patients without SPM were associated with longer survival. These findings suggest that the development of SPM could be a predictor of survival rates among patients with HNSCC.
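To make the reported discrimination measure concrete, here is a minimal sketch of how an area under the ROC curve can be computed from predicted risks and observed outcomes. It uses the rank-based (Mann-Whitney) interpretation: the AUC is the probability that a randomly chosen patient who developed the outcome received a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The function name and the example values are hypothetical and are not the study's data or code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for patients with the outcome, 0 otherwise.
    scores: predicted risks, higher meaning more likely to have the outcome.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # positive case ranked above negative case
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up illustrative values (an assumption, not from the study).
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]

if __name__ == "__main__":
    print(roc_auc(example_labels, example_scores))  # 3 of 4 pairs ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.72 is read as acceptable.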