7 results for data-types

in DigitalCommons@The Texas Medical Center


Relevance:

70.00%

Publisher:

Abstract:

The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process.

For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power.
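The abstract does not say how the regression coefficients from the separate imputed data sets were combined. A common choice is Rubin's rules, and the sketch below assumes that approach purely for illustration. The function name `pool_rubin` and the example numbers are hypothetical and are not taken from the study.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets (Rubin's rules).

    estimates: the coefficient estimated on each imputed data set
    variances: the squared standard error of that coefficient on each data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical coefficients from m = 3 imputed data sets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The total variance grows with the spread of the estimates across imputations, which is one route by which heavy missingness can make tests more conservative.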

Relevance:

30.00%

Publisher:

Abstract:

Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e., short- vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death to the gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting. Also, our proposed models provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods under the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.

Relevance:

30.00%

Publisher:

Abstract:

The current state of health and biomedicine includes an enormous number of heterogeneous data "silos", collected for different purposes and represented differently, that are presently impossible to share or analyze in toto. The greatest challenge for large-scale and meaningful analyses of health-related data is to achieve a uniform data representation for data extracted from heterogeneous source representations. Based upon an analysis and categorization of heterogeneities, a process for achieving comparable data content by using a uniform terminological representation is developed. This process addresses the types of representational heterogeneities that commonly arise in healthcare data integration problems. Specifically, this process uses a reference terminology and associated "maps" to transform heterogeneous data to a standard representation for comparability and secondary use. The capture of quality and precision of the "maps" between local terms and reference terminology concepts enhances the meaning of the aggregated data, empowering end users with better-informed queries for subsequent analyses. A data integration case study in the domain of pediatric asthma illustrates the development and use of a reference terminology for creating comparable data from heterogeneous source representations. The contribution of this research is a generalized process for the integration of data from heterogeneous source representations, and this process can be applied and extended to other problems where heterogeneous data needs to be merged.
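The mapping step described in this abstract can be pictured with a small sketch. Everything in it is an assumption made for illustration: the source names (`clinic_a`, `clinic_b`), the local terms, the `REF:` concept codes and the quality labels are hypothetical, and the abstract does not describe the actual data structures used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermMap:
    """A map from one local term to a reference terminology concept."""
    reference_concept: str
    quality: str  # hypothetical labels, e.g. "exact" or "broader"


# Hypothetical maps, one dictionary per source system.
MAPS = {
    "clinic_a": {
        "wheezing": TermMap("REF:Wheeze", "exact"),
        "SOB": TermMap("REF:Dyspnea", "exact"),
    },
    "clinic_b": {
        "breathing trouble": TermMap("REF:Dyspnea", "broader"),
    },
}


def to_reference(source, local_term):
    """Return a standardized row, or None when no map exists for the term."""
    mapping = MAPS.get(source, {}).get(local_term)
    if mapping is None:
        return None
    return {
        "source": source,
        "local_term": local_term,
        "concept": mapping.reference_concept,
        "map_quality": mapping.quality,
    }


def integrate(records):
    """Split (source, local_term) pairs into mapped rows and unmapped pairs."""
    mapped, unmapped = [], []
    for source, term in records:
        row = to_reference(source, term)
        if row is None:
            unmapped.append((source, term))
        else:
            mapped.append(row)
    return mapped, unmapped


mapped, unmapped = integrate([
    ("clinic_a", "SOB"),
    ("clinic_b", "breathing trouble"),
    ("clinic_b", "chest noise"),
])
print(mapped)
print(unmapped)
```

Keeping the map quality on every row lets a later query include or exclude inexact matches, which is the kind of better-informed query the abstract mentions.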

Relevance:

30.00%

Publisher:

Abstract:

People often use tools to search for information. In order to improve the quality of an information search, it is important to understand how internal information, which is stored in the user's mind, and external information, represented by the interface of tools, interact with each other. How information is distributed between internal and external representations significantly affects information search performance. However, few studies have examined the relationship between types of interface and types of search task in the context of information search. For a distributed information search task, how data are distributed, represented, and formatted significantly affects the user's search performance in terms of response time and accuracy. Guided by UFuRT (User, Function, Representation, Task), a human-centered process, I propose a search model and a task taxonomy. The model defines its relationship with other existing information models. The taxonomy clarifies the legitimate operations for each type of search task of relational data. Based on the model and taxonomy, I have also developed prototypes of interface for the search tasks of relational data. These prototypes were used for experiments. The experiments described in this study are of a within-subject design with a sample of 24 participants recruited from the graduate schools located in the Texas Medical Center. Participants performed one-dimensional nominal search tasks over nominal, ordinal, and ratio displays, and performed one-dimensional nominal, ordinal, interval, and ratio tasks over table and graph displays. Participants also performed the same task and display combinations for two-dimensional searches. Distributed cognition theory has been adopted as a theoretical framework for analyzing and predicting the search performance of relational data. It has been shown that the representation dimensions and data scales, as well as the search task types, are the main factors in determining search efficiency and effectiveness. In particular, the more external representations used, the better the search task performance, and the results suggest that ideal search performance occurs when the question type and the corresponding data scale representation match. The implications of the study lie in contributing to the effective design of search interfaces for relational data, especially laboratory results, which are often used in healthcare activities.

Relevance:

30.00%

Publisher:

Abstract:

The hypothesis to be tested is that there are two distinct types of chronic responses in irradiated normal tissues, each resulting from damage to different cell populations in the tissue. The first is a sequela of chronic epithelial depletion in which the tissue's integrity cannot be maintained, i.e., a "consequential" chronic response. The other response is due to cell loss in the connective tissue and/or vascular stroma, i.e., a "primary" chronic response. The purpose of this study was to test the hypothesis in the murine colon by first establishing a model of each chronic response and then determining whether the responses differed in timing of expression, histology, and expression of specific collagen types. The model of late damage used was colonic obstructions/strictures induced by a single dose of 27 Gy ("consequential" response) and two equal doses of 14.75 Gy (t = 10 days) ("primary" response). "Consequential" lesions appeared as early as 5 weeks after 27 Gy and were characterized by a deep mucosal ulceration and a thickened fibrotic serosa containing excessive accumulations of collagen types I and III. Both types were commingled in the scar at the base of the ulcer. Fibroblasts were synthesizing pro-collagen types I and III mRNA 10 weeks prior to measurable increases in collagen. A significant decrease in the ratio of collagen types I:III was associated with the "consequential" response at 4-5 months post-irradiation. The "primary" response, on the other hand, did not appear until 40 weeks after the split dose even though the total dose delivered was approximately the same as that for the "consequential" response. The "primary" response was characterized by an intact mucosa and a thickened fibrotic submucosa which contained excessive amounts of only collagen type I. An increased number of fibroblasts were synthesizing pro-collagen type I mRNA nearly 25 weeks before collagen type I levels were increased. The "primary" response lesion had a significantly elevated collagen type I:III ratio at 10-13 months post-irradiation. These data show a clear difference between the two chronic responses and suggest that not all chronic responses share a common pathogenesis, but depend on the cell population in the tissue that is damaged.

Relevance:

30.00%

Publisher:

Abstract:

Approximately 6,600 people die from acute myelogenous leukemia (AML) on an annual basis. During the past 10 to 15 years, there have been gradual overall improvements in the therapy of this disease, yet the majority of patients with AML succumb to this disease. In an attempt to improve current therapeutic strategies for AML, we became interested in a commercially available drug, dexrazoxane, which protects against anthracycline-induced cardiotoxicity. We have investigated the effects of dexrazoxane (DEX) on different tissue types in an effort to determine its unique mechanism of action. Colony forming assays were used to evaluate stem-cell renewal of myeloid cells in vitro and median effect analysis was used to evaluate antagonism, synergism, or additivity. The anthracyclines doxorubicin, daunorubicin, and idarubicin were individually combined with DEX in leukemic myeloid models to determine if the combination of the two drugs resulted in a synergistic, additive or antagonistic effect. Etoposide and cytosine arabinoside were also evaluated in combination with DEX using the same in vitro model and evaluated similarly.

Dexrazoxane in combination with any of the anthracyclines was schedule dependent. The combination of DEX and anthracycline resulted in a greater antitumor effect than anthracycline alone except for DEX administered 24 hours before doxorubicin or daunorubicin. These data were corroborated through median effect analysis. Etoposide in combination with dexrazoxane was synergistic for all combinations, and the combination of cytosine arabinoside and DEX was schedule dependent. In contrast, using an in vivo gastrointestinal model, DEX in combination with doxorubicin was antagonistic for almost all of the ratios used, except for the highest. A Withers' assay was used to evaluate toxicity on jejunal crypt cells. No effect was apparent for the combination of idarubicin and DEX. However, as seen with RZ, DEX in addition to radiation greatly potentiated the cytotoxic effects of radiation on crypts. These paradoxical effects of dexrazoxane were initially enigmatic, but after additional investigation, we propose a model that explains our findings. We conclude that DEX in combination with anthracyclines produces an additive to synergistic antileukemic response and may have therapeutic potential clinically. Additionally, DEX protects the gastrointestinal tract from doxorubicin toxicity, which could have clinical implications for the administration of greater doses of doxorubicin.
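Median effect analysis is usually associated with the Chou-Talalay median-effect equation and its combination index (CI); the abstract does not give formulas, so the sketch below assumes that standard formulation. The parameter values and the tolerance band around CI = 1 are hypothetical choices made for illustration, not values from the study.

```python
def dose_for_effect(fa, dm, m):
    """Single-agent dose producing fraction affected fa.

    Median-effect equation solved for dose: D = Dm * (fa / (1 - fa)) ** (1 / m),
    where Dm is the median-effect dose and m is the slope of the median-effect plot.
    """
    if not 0.0 < fa < 1.0:
        raise ValueError("fa must lie strictly between 0 and 1")
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(d1, d2, fa, dm1, m1, dm2, m2):
    """CI = d1 / Dx1 + d2 / Dx2 for combined doses d1 and d2 that together give effect fa."""
    return d1 / dose_for_effect(fa, dm1, m1) + d2 / dose_for_effect(fa, dm2, m2)


def classify(ci, tol=0.1):
    """Label a CI value; the tolerance band around 1 is an illustrative assumption."""
    if ci < 1.0 - tol:
        return "synergistic"
    if ci > 1.0 + tol:
        return "antagonistic"
    return "additive"


# Hypothetical example: half of each drug's median-effect dose gives a 50% effect.
ci = combination_index(0.5, 1.0, 0.5, dm1=1.0, m1=1.0, dm2=2.0, m2=1.0)
print(ci, classify(ci))
```

A CI below 1 indicates that the combination reaches the effect with less drug than additivity predicts, and a CI above 1 indicates the opposite.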

Relevance:

30.00%

Publisher:

Abstract:

Cancer is the second leading cause of death in the United States. With the advent of new technologies, changes in health care delivery, and multiplicity of provider types that patients must see, cancer care management has become increasingly complex. The availability of cancer health information has been shown to help cancer patients cope with the management and effects of their cancers. As a result, more cancer patients are using the internet to find resources that can aid in decision-making and recovery.

The Health Information National Trends Survey (HINTS) is a nationally representative survey designed to collect information about the experiences of cancer and non-cancer adults with health information sources. The HINTS survey focused on both conventional sources and newer technologies, particularly the internet. This study is a descriptive analysis of the HINTS 2003 and HINTS 2005 survey data. The purpose of the research is to explore the general trends in health information seeking and use by US adults, and especially by cancer patients.

From 2003 to 2005, internet use for various health-related activities appears to have increased among adults with and without cancer. Differences were found between the groups in the general trust in information media, particularly the internet. Non-cancer respondents tended to have greater trust in information media than cancer respondents.

The latter portion of this work examined characteristics of HINTS respondents that were thought to be relevant to how much trust individuals placed in the internet as a source of health information. Trust in health information from the internet was significantly greater among younger adults, higher-earning households, internet users, online seekers of health or cancer information, and those who found online cancer information useful.