5 results for US Minimum Data Set
in DigitalCommons@The Texas Medical Center
Abstract:
Brain tumors are among the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of patients surviving more than 5 years after diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but reports indicate that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. To evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e., short- vs. long-term survival), and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time to death directly to the gene expression profile. We propose a Bayesian ensemble model for survival prediction that is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the Cox proportional hazards (CPH), Weibull, and accelerated failure time (AFT) survival models. We overcome the lack of conjugacy by using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting.
Also, our proposed models provide a model-free way to select important predictive prognostic markers by controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares the results obtained by the Random Forest and Bayesian ensemble methods from biological and clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.
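To make the "sum-of-trees" idea concrete, the following is a minimal sketch of how an ensemble of small trees could be combined under an AFT-style model, where the summed leaf values shift the log of survival time. It is not the authors' model: the gene names, thresholds, leaf values, and the 12-month baseline are all hypothetical, and no Bayesian fitting is shown. Reading the result as a median assumes an error term with median zero on the log scale.

```python
import math

# Hypothetical toy trees (not fitted to any data). Each tree is a nested dict
# that splits on one gene's expression value; leaves hold a small contribution
# to log survival time.
def leaf(value):
    return {"leaf": value}

def split(gene, threshold, left, right):
    return {"gene": gene, "threshold": threshold, "left": left, "right": right}

def tree_predict(tree, x):
    """Walk one tree down to a leaf and return that leaf's contribution."""
    while "leaf" not in tree:
        go_left = x[tree["gene"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]

def sum_of_trees(trees, x):
    """Ensemble output: the sum of every tree's leaf contribution."""
    return sum(tree_predict(t, x) for t in trees)

TREES = [
    # A single split captures an additive effect of geneA.
    split("geneA", 5.0, leaf(0.4), leaf(-0.3)),
    # A nested split captures an interaction between geneB and geneA.
    split("geneB", 7.0, leaf(0.2),
          split("geneA", 5.0, leaf(0.1), leaf(-0.5))),
]

# Assumed baseline: 12 months on the log scale (illustrative only).
BASELINE_LOG_MONTHS = math.log(12.0)

def predicted_median_months(x):
    """AFT-style prediction: exp(baseline + sum of tree contributions)."""
    return math.exp(BASELINE_LOG_MONTHS + sum_of_trees(TREES, x))

low_risk = {"geneA": 4.0, "geneB": 6.0}
high_risk = {"geneA": 6.0, "geneB": 8.0}
print(predicted_median_months(low_risk), predicted_median_months(high_risk))
```

Because each tree contributes only a small piece, single-split trees act as additive effects while deeper trees encode interactions, which is the flexibility the abstract refers to.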
Abstract:
BACKGROUND: Physician advice is an important motivator for attempting to stop smoking. However, physicians' lack of intervention with smokers has only modestly improved in the last decade. Although the literature includes extensive research on the smoking intervention practices of clinicians, few studies have focused on Hispanic physicians. The purpose of this study was to explore the correlates of tobacco cessation counseling practices among Hispanic physicians in the US. METHODS: Data were collected through a validated survey instrument from a cross-sectional sample of self-reported Hispanic physicians who were practicing in New Mexico and were members of the New Mexico Hispanic Medical Society in 2001. Domains of interest included counseling practices, self-efficacy, attitudes/responsibility, and knowledge/skills. Returned surveys were analyzed to obtain frequencies and descriptive statistics for each survey item. Other analyses included bivariate Pearson's correlations, factorial ANOVAs, and multiple linear regressions. RESULTS: Respondents (n = 45) reported a low level of compliance with tobacco control guidelines and recommendations. Results indicate that physicians' familiarity with standard cessation protocols has a significant effect on their tobacco-related practices (r = .35, variance shared = 12%). Self-efficacy and gender were both significantly correlated with tobacco-related practices (r = .42, variance shared = 17%). A significant correlation was also found between self-efficacy and knowledge/skills (r = .60, variance shared = 36%). Attitudes/responsibility was not significantly correlated with any of the other measures. CONCLUSION: More resources should be dedicated to training Hispanic physicians in tobacco intervention. Training may facilitate practice by increasing knowledge, developing skills and, ultimately, enhancing feelings of self-efficacy.
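The "variance shared" figures reported alongside each correlation are the squared Pearson coefficient expressed as a percentage. The short sketch below shows that arithmetic using the rounded r values from the abstract. The function name is illustrative, and because the inputs are rounded, the computed percentages may differ slightly from the reported ones.

```python
def shared_variance_percent(r):
    """Percent of variance shared by two variables with Pearson correlation r."""
    return 100.0 * r * r

# Rounded correlations as reported in the abstract.
reported = [
    ("familiarity with protocols vs. practices", 0.35),
    ("self-efficacy vs. practices", 0.42),
    ("self-efficacy vs. knowledge/skills", 0.60),
]

for label, r in reported:
    pct = shared_variance_percent(r)
    print(f"{label}: r = {r:.2f}, shared variance = {pct:.1f}%")
```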
Abstract:
The motion of lung tumors during respiration makes the accurate delivery of radiation therapy to the thorax difficult because it increases the uncertainty of target position. The adoption of four-dimensional computed tomography (4D-CT) has allowed us to determine how a tumor moves with respiration for each individual patient. Using information acquired during a 4D-CT scan, we can define the target, visualize motion, and calculate dose during the planning phase of the radiotherapy process. One image data set that can be created from the 4D-CT acquisition is the maximum-intensity projection (MIP). The MIP can be used as a starting point to define the volume that encompasses the motion envelope of the moving gross target volume (GTV). Because of the close relationship between the MIP and the final target volume, we investigated four MIP data sets created with different methodologies (three using various 4D-CT sorting implementations and one using all available cine CT images) to compare target delineation. It has been observed that changing the 4D-CT sorting method leads to the selection of a different collection of images; however, the clinical implications of changing the constituent images for the resultant MIP data set are not clear. There has not been a comprehensive study comparing target delineation based on different 4D-CT sorting methodologies in a patient population. We selected a collection of patients who had previously undergone thoracic 4D-CT scans at our institution and who had lung tumors that moved at least 1 cm. We then generated the four MIP data sets and automatically contoured the target volumes. In doing so, we identified cases in which the MIP generated from a 4D-CT sorting process under-represented the motion envelope of the target volume by more than 10% relative to the MIP generated from all of the cine CT images. The 4D-CT methods suffered from duplicate image selection and might not choose the images showing the maximum extent of motion.
Based on our results, we suggest utilization of a MIP generated from the full cine CT data set to ensure a representative inclusive tumor extent, and to avoid geometric miss.
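As an illustration of why a MIP built from a sorted subset can under-represent the motion envelope, here is a minimal sketch using toy one-row "images" in which a bright two-voxel target slides across the field. The image values, the threshold, and the particular subset (with one duplicate and one missing extreme position) are all hypothetical and are not taken from the study.

```python
def mip(images):
    """Voxel-wise maximum over equally sized 2D images (lists of rows)."""
    rows, cols = len(images[0]), len(images[0][0])
    return [[max(img[r][c] for img in images) for c in range(cols)]
            for r in range(rows)]

def envelope_size(image, threshold):
    """Count voxels at or above an intensity threshold (a stand-in for contouring)."""
    return sum(1 for row in image for value in row if value >= threshold)

# Toy cine acquisition: a 2-voxel bright target moving along x.
cine = [
    [[0, 9, 9, 0, 0, 0]],
    [[0, 0, 9, 9, 0, 0]],
    [[0, 0, 0, 9, 9, 0]],
    [[0, 0, 0, 0, 9, 9]],
]

# Hypothetical sorted subset: one image is selected twice and the
# extreme position (the last cine image) is never selected.
sorted_subset = [cine[0], cine[1], cine[1], cine[2]]

full = envelope_size(mip(cine), threshold=5)
subset = envelope_size(mip(sorted_subset), threshold=5)
shortfall_percent = 100.0 * (full - subset) / full
print(full, subset, shortfall_percent)
```

Since the maximum over a subset can never exceed the maximum over the full set, the subset MIP is always contained in the full cine MIP; the only question is how much of the envelope it misses.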
Abstract:
Academic and industrial research in the late 1990s has brought about an explosion of DNA sequence data. Automated expert systems are being created to help biologists extract patterns, trends, and links from this ever-deepening ocean of information. Two such systems, aimed at retrieving and subsequently utilizing phylogenetically relevant information, have been developed in this dissertation, the major objective of which was to automate the often difficult and confusing phylogenetic reconstruction process. Popular phylogenetic reconstruction methods, such as distance-based methods, attempt to find an optimal tree topology (one that reflects the relationships among related sequences and their evolutionary history) by searching through the topology space. Various compromises between fast (but incomplete) and exhaustive (but computationally prohibitive) search heuristics have been suggested. The second chapter of this dissertation describes an intelligent compromise algorithm that relies on a flexible “beam” search principle from the Artificial Intelligence domain and uses pre-computed local topology reliability information to adjust the beam search space continuously. However, sometimes even a (virtually) complete distance-based method is inferior to the significantly more elaborate (and computationally expensive) maximum likelihood (ML) method. In fact, depending on the nature of the sequence data in question, either method might prove to be superior. Therefore, it is difficult (even for an expert) to tell a priori which phylogenetic reconstruction method, whether distance-based, ML, or perhaps maximum parsimony (MP), should be chosen for any particular data set. A number of factors, often hidden, influence the performance of a method. For example, it is generally understood that for a phylogenetically “difficult” data set, more sophisticated methods (e.g., ML) tend to be more effective and thus should be chosen.
However, it is the interplay of many factors that one needs to consider in order to avoid choosing an inferior method (potentially a costly mistake, in terms of both computational expense and reconstruction accuracy). Chapter III of this dissertation details a phylogenetic reconstruction expert system that automatically selects the most appropriate method. It uses a classifier (a Decision Tree-inducing algorithm) to map a new data set to the proper phylogenetic reconstruction method.
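The beam search principle mentioned above can be sketched generically: at each step only the best few partial solutions are kept, and the number kept may vary from step to step. The code below is a toy illustration, not the dissertation's algorithm. The cost table, the `RELIABILITY` scores, and the 0.8 cutoff are hypothetical stand-ins for tree-topology scores and pre-computed local reliability information.

```python
def beam_search(start, expand, score, is_complete, beam_width):
    """Keep only the beam_width(depth) lowest-scoring candidates at each step."""
    beam = [start]
    depth = 0
    while not all(is_complete(s) for s in beam):
        candidates = []
        for state in beam:
            # Completed states are carried forward unchanged.
            candidates.extend([state] if is_complete(state) else expand(state))
        candidates.sort(key=score)
        beam = candidates[:beam_width(depth)]
        depth += 1
    return min(beam, key=score)

# Hypothetical toy problem: pick two binary choices; lower cost is better.
# The locally cheapest first choice (0) leads to a worse final cost.
COSTS = {(0,): 1, (1,): 2, (0, 0): 6, (0, 1): 6, (1, 0): 3, (1, 1): 3}

def expand(state):
    return [state + (0,), state + (1,)]

def score(state):
    return COSTS[state]

def is_complete(state):
    return len(state) == 2

# Hypothetical per-step reliability: widen the beam where reliability is low.
RELIABILITY = [0.4, 0.9]

def adaptive_width(depth):
    return 1 if RELIABILITY[depth] >= 0.8 else 2

greedy = beam_search((), expand, score, is_complete, lambda depth: 1)
adaptive = beam_search((), expand, score, is_complete, adaptive_width)
print(greedy, score(greedy), adaptive, score(adaptive))
```

With a fixed width of 1 the search is greedy and commits to the locally cheapest first step; widening the beam only at the unreliable step is enough to recover the better solution while still pruning elsewhere.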
Abstract:
Phthalates are industrial chemicals used primarily as plasticizers, and they are found in a myriad of consumer goods such as children's toys, food packaging, dental sealants, cosmetics, pharmaceuticals, perfumes, and building materials. US biomonitoring data show that more than 75% of the population has exposure to mono-n-butyl phthalate (MBP), mono-ethyl phthalate (MEP), mono-(2-ethylhexyl) phthalate (MEHP), and mono-benzyl phthalate (MBzP). Reproductive toxicity from phthalate exposure in animal models has raised concerns about similar effects on fertility in humans. This dissertation research focuses on phthalate exposures in the US population and investigates the plausibility of an exposure-response relationship between phthalates and endocrine hormones essential for ovulation among US women. The objective of this research is to determine the relationship between levels of the gonadotropins follicle stimulating hormone (FSH) and luteinizing hormone (LH) and the urinary phthalate monoester metabolites MBP, MEP, MEHP, and MBzP among National Health and Nutrition Examination Survey (NHANES) 1999-2002 women aged 35 to 60 years. Using biomarker data from a one-third sub-sample of NHANES participants, log-transformed serum FSH and serum LH were each regressed on phthalates, controlling for age, body mass index, smoking, and creatinine and taking into consideration the complex survey design (n=385). Models were stratified by reproductive status: reproductive (n=185), menopause transition (n=49), and post-menopausal (n=125). A decrease in FSH associated with increasing MBzP (beta=-0.094, p<0.05) was observed for all participants, but no statistical association between log FSH and MBP, MEP, or MEHP was seen. A decrease in LH (beta=-0.125, p<0.05) was also observed with increasing MBzP for all participants, though there was no relationship between levels of LH and MBP, MEP, or MEHP.
The observed associations between FSH, LH, and MBzP did not persist when stratified by reproductive status. Thus, the present study shows a change in endocrine hormones related to ovulation with increasing urinary MBzP among a representative sample of US women from 1999-2002, though this observed exposure-response relationship does not remain after stratification by reproductive status.
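Because the hormone outcomes were log-transformed, the reported coefficients can be read on a multiplicative scale. The sketch below converts a coefficient into an approximate percent change in the outcome per one-unit increase in the predictor. It assumes a natural-log transform of the outcome and an untransformed predictor; the abstract states neither the log base nor the scale of the MBzP term, so the numbers are illustrative only.

```python
import math

def percent_change(beta):
    """Percent change in the outcome per one-unit increase in the predictor,
    assuming the outcome was natural-log transformed (an assumption here)."""
    return 100.0 * (math.exp(beta) - 1.0)

# Coefficients as reported in the abstract for MBzP.
for hormone, beta in [("FSH", -0.094), ("LH", -0.125)]:
    print(f"{hormone}: beta = {beta}, change = {percent_change(beta):.2f}%")
```

If the phthalate term was itself log-transformed, the coefficient would instead be interpreted as an elasticity, and this conversion would not apply directly.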