2 results for data reports

in DigitalCommons@The Texas Medical Center


Relevance: 30.00%

Abstract:

Brain tumors are among the most aggressive cancers in humans, with an estimated median survival time of 12 months and only 4% of patients surviving more than 5 years after diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but reports indicate that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. To evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e., short- vs. long-term survival), and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time to death directly to the gene expression profile. We propose a Bayesian ensemble model for survival prediction that is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the Cox proportional hazards (CPH), Weibull, and accelerated failure time (AFT) survival models. We overcome the lack of conjugacy by using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting.
Our proposed models also provide a model-free way to select important predictive prognostic markers by controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares the results obtained by the Random Forest and Bayesian ensemble methods from biological and clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.
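
As a rough illustration of the "sum-of-trees" idea mentioned in this abstract, the sketch below adds up the outputs of a few small regression trees for one patient. It is a minimal sketch under stated assumptions, not the thesis's model: the gene names, thresholds, and leaf values are hypothetical, no Bayesian priors or fitting are shown, and reading the sum as a predicted log survival time is only one possible (AFT-style) interpretation.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Leaf:
    value: float  # this leaf's contribution to the ensemble sum


@dataclass
class Split:
    gene: str          # hypothetical gene identifier
    threshold: float   # expression cut-point (hypothetical)
    left: "Node"       # branch taken when expression <= threshold
    right: "Node"      # branch taken when expression > threshold


Node = Union[Leaf, Split]


def tree_predict(node: Node, expression: Dict[str, float]) -> float:
    """Walk one tree from the root down to a leaf and return the leaf value."""
    while isinstance(node, Split):
        go_left = expression[node.gene] <= node.threshold
        node = node.left if go_left else node.right
    return node.value


def sum_of_trees(trees: List[Node], expression: Dict[str, float]) -> float:
    """Ensemble prediction: the sum of every tree's leaf value."""
    return sum(tree_predict(tree, expression) for tree in trees)


# A one-split tree gives an additive effect of a single gene; a two-level
# tree gives an interaction (gene_c only matters when gene_b is high).
example_trees = [
    Split("gene_a", 0.0, Leaf(-0.2), Leaf(0.3)),
    Split("gene_b", 1.0, Leaf(0.1),
          Split("gene_c", 0.5, Leaf(0.0), Leaf(-0.4))),
]
example_patient = {"gene_a": 0.7, "gene_b": 1.5, "gene_c": 0.9}
example_score = sum_of_trees(example_trees, example_patient)
```

In a Bayesian ensemble, many such shallow trees would be sampled rather than fixed by hand, so each tree contributes only a small part of the overall prediction.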

Relevance: 30.00%

Abstract:

Few recent estimates of childhood asthma incidence exist in the literature, although the importance of incidence surveillance for understanding asthma risk factors has been recognized. Asthma prevalence, morbidity, and mortality reports have repeatedly shown that low-income children are disproportionately affected by the disease. The aim of this study was to demonstrate the utility of Medicaid claims data for providing statewide estimates of asthma incidence. Medicaid Analytic Extract (MAX) data for Texas children aged 0-17 enrolled in Medicaid between 2004 and 2007 were used to estimate incidence overall and by age group, gender, race, and county of residence. A period of at least 13 months of continuous enrollment was required to distinguish incident from prevalent cases identified in the claims data. Age-adjusted incidence of asthma was 4.26/100 person-years during 2005-2007, higher than reported in other populations. Incidence rates decreased with age, were higher for males than females, differed by race, and tended to be higher in rural than urban areas. With this study, we were able to demonstrate the utility of MAX data for estimating asthma incidence and to create a dataset of incident cases for use in further analysis.

In subsequent analyses, we investigated a possible association between ambient air pollutants and incident asthma among Medicaid-enrolled children in Harris County, Texas, between 2005 and 2007. This population is at high risk for asthma and lives in an area with historically poor air quality. We used a time-stratified case-crossover design and conditional logistic regression to calculate odds ratios, adjusted for weather variables and aeroallergens, to assess the effect of increases in ozone, NO2, and PM2.5 concentrations on the risk of developing asthma.
Our results show that a 10 ppb increase in ozone was significantly associated with asthma during the warm season (May-October), with the strongest effect seen when a 6-day cumulative lag period was used to compute the exposure metric (OR=1.05, 95% CI, 1.02–1.08). Similar results were seen for NO2 and PM2.5 (OR=1.07, 95% CI, 1.03–1.11 and OR=1.12, 95% CI, 1.03–1.22, respectively). PM2.5 also had significant effects in the cold season (November-April) with a 5-day cumulative lag (OR=1.11, 95% CI, 1.00–1.22). Compared with children in the lowest quartile of O3 exposure, the risk for children in the highest quartile was 20% higher. This study indicates that these pollutants are associated with newly diagnosed childhood asthma in this low-income urban population, particularly during the summer months.
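
The quantities reported in this abstract can be sketched in a few lines. The example below is a minimal illustration, not the study's code: it computes a crude (not age-adjusted) incidence rate per 100 person-years, picks referent days for one case under the common time-stratified convention of same weekday within the same calendar month (the abstract does not state which strata were used, so this is an assumption), and rescales a log-odds slope to an odds ratio for a chosen exposure increase. All function names and input values are hypothetical.

```python
import math
from datetime import date, timedelta


def incidence_per_100_person_years(new_cases, person_years):
    """Crude incidence rate expressed per 100 person-years at risk."""
    return 100.0 * new_cases / person_years


def referent_days(event_day):
    """Control days for one case: same weekday, same calendar month and year.

    Assumed stratification; the event day itself is excluded.
    """
    day = event_day.replace(day=1)
    referents = []
    while day.month == event_day.month:
        if day.weekday() == event_day.weekday() and day != event_day:
            referents.append(day)
        day += timedelta(days=1)
    return referents


def odds_ratio_per_increase(beta, se, increase, z=1.96):
    """Odds ratio with an approximate 95% CI for a given exposure increase.

    beta and se are the per-unit log-odds slope and its standard error,
    e.g. from a conditional logistic regression fitted elsewhere.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    return tuple(
        math.exp(increase * b) for b in (beta, beta - z * se, beta + z * se)
    )


# Hypothetical inputs, chosen only to exercise the functions.
rate = incidence_per_100_person_years(new_cases=50, person_years=1250)
controls = referent_days(date(2006, 7, 12))
or_10ppb, ci_low, ci_high = odds_ratio_per_increase(beta=0.005, se=0.001, increase=10)
```

Matching each case day to referent days in the same month and on the same weekday lets each child serve as their own control, so characteristics that do not change within a month cancel out of the comparison.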