4 results for Eclipse modeling framework (EMF)
in DigitalCommons@The Texas Medical Center
Abstract:
Hodgkin's disease (HD) is a cancer of the lymphatic system. Survivors of HD face a variety of adverse effects, of which secondary primary tumors (SPT) are among the most serious. This dissertation aims to model time-to-SPT in the presence of death and HD relapses during follow-up. The model is designed to handle a mixture phenomenon of SPT and the influence of death. Relapses of HD are adjusted for as a covariate. A proportional hazards framework is used to define the SPT intensity function, which includes an exponential term for the effects of the explanatory variables. Death is treated as a competing risk under different scenarios, depending on which terminal event comes first. The Newton-Raphson method is used to obtain the parameter estimates. The proposed method is applied to a real data set containing a group of HD patients. Several risk factors for the development of SPT are identified, and the findings are noteworthy for the development of healthcare guidelines that may lead to the early detection or prevention of SPT.
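As a rough illustration of the estimation step only, the sketch below runs Newton-Raphson on a much simpler model than the one in the dissertation: a constant hazard with an exponential covariate term, exp(b0 + b1 * x), and right-censored follow-up. The function name, the single covariate, and the toy data are assumptions made for this example; competing risks and the mixture component are not modeled.

```python
import math


def fit_exponential_hazard(times, events, x, max_iter=50, tol=1e-12):
    """Newton-Raphson for the hazard exp(b0 + b1 * x) with right censoring.

    times:  follow-up time per subject
    events: 1 if the event was observed, 0 if censored
    Log-likelihood: sum_i [ d_i * (b0 + b1 * x_i) - t_i * exp(b0 + b1 * x_i) ]
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0            # gradient (score) components
        h00 = h01 = h11 = 0.0    # Hessian components
        for t, d, xi in zip(times, events, x):
            mu = t * math.exp(b0 + b1 * xi)
            g0 += d - mu
            g1 += xi * (d - mu)
            h00 -= mu
            h01 -= xi * mu
            h11 -= xi * xi * mu
        # Newton step: beta_new = beta - H^{-1} g, with the 2x2 inverse written out.
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 -= step0
        b1 -= step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


# Hypothetical toy data: x = 0 (no relapse) or 1 (relapse).
toy_times = [2.0, 4.0, 6.0, 8.0, 1.0, 2.0, 3.0, 4.0]
toy_events = [1, 1, 0, 1, 1, 1, 1, 0]
toy_x = [0, 0, 0, 0, 1, 1, 1, 1]
b0_hat, b1_hat = fit_exponential_hazard(toy_times, toy_events, toy_x)
```

With a binary covariate this simplified model has a closed-form maximum (events divided by total follow-up time in each group), so the iterates can be checked against log(3/20) for b0 and log(2) for b1 on the toy data.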
Abstract:
Mixture modeling is commonly used to model categorical latent variables that represent subpopulations in which population membership is unknown but can be inferred from the data. In recent years, finite mixture models have been applied to time-to-event data. However, the commonly used survival mixture model assumes that the effects of the covariates on failure times differ across latent classes but that the covariate distribution is homogeneous. The aim of this dissertation is to develop a method to examine time-to-event data in the presence of unobserved heterogeneity under a framework of mixture modeling. A joint model is developed to incorporate the latent survival trajectory along with the observed information for the joint analysis of a time-to-event variable, its discrete and continuous covariates, and a latent class variable. It is assumed that both the effects of covariates on survival times and the distribution of covariates vary across latent classes. The unobservable survival trajectories are identified by estimating the probability that a subject belongs to a particular class based on observed information. We applied this method to a Hodgkin lymphoma study with long-term follow-up and observed four distinct latent classes in terms of long-term survival and distributions of prognostic factors. Our results from simulation studies and from the Hodgkin lymphoma study demonstrated the superiority of our joint model compared with the conventional survival model. This flexible inference method provides more accurate estimation and accommodates unobservable heterogeneity among individuals while taking the interactions between covariates into consideration.
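The class-membership step can be illustrated with Bayes' rule. The sketch below is not the joint model described in the abstract: it assumes, purely for illustration, that each latent class has a constant (exponential) hazard and it ignores covariates. The function name and all numbers are hypothetical.

```python
import math


def posterior_class_probs(time, event, class_priors, class_rates):
    """Posterior probability of each latent class given one subject's outcome.

    Assumes an exponential survival model per class (an illustrative choice):
    likelihood = rate**event * exp(-rate * time), where event is 1 for an
    observed failure and 0 for a censored observation.
    """
    weighted = []
    for prior, rate in zip(class_priors, class_rates):
        likelihood = (rate ** event) * math.exp(-rate * time)
        weighted.append(prior * likelihood)
    total = sum(weighted)
    # Normalize so the posterior probabilities sum to one.
    return [w / total for w in weighted]


# Hypothetical two-class example: class 0 is low risk, class 1 is high risk.
long_survivor = posterior_class_probs(10.0, 0, [0.5, 0.5], [0.05, 0.5])
early_failure = posterior_class_probs(0.5, 1, [0.5, 0.5], [0.05, 0.5])
```

In this toy setting a subject still event-free at time 10 is assigned mostly to the low-risk class, while a subject who fails at time 0.5 is assigned mostly to the high-risk class.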
Abstract:
The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU" lays out the theoretical background for the project. There are several core concepts presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrests, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model, and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is in defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. 
The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data is represented by the standard one value per variable paradigm and is widely employed in a host of clinical models and tools. These are often represented by a number present in a given cell of a table. Clinical latent features are derived, rather than directly measured, data elements that more accurately represent a particular clinical phenomenon than any of the directly measured data elements in isolation. The second two classes are unique to the time series data elements. The first of these comprises the raw data elements. These are represented by multiple values per variable, and constitute the measured observations that are typically available to end users when they review time series data. These are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend that a variety of analyses be performed in order to maximize the likelihood that a representation of the time series data elements is produced that is able to distinguish between two or more classes of outcomes. The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU" provides a detailed description, start to finish, of the methods required to prepare the data, build, and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time series based models are unfeasible due to the relatively large number of data elements and the complexity of preprocessing that must occur before data can be presented to the model.
Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies of each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances. The final manuscript, entitled "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit" presents the results that were obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that utilizing the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis.
In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy as compared to the baseline multivariate model, but diminished classification accuracy as compared to when just the trend analysis features were added (i.e., without adding the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve the performance beyond that which was achieved by exclusion of the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.
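Two of the ideas above, a trend-analysis feature and normalization by variable family, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: the trend is taken to be an ordinary least-squares slope over equally spaced readings, and "normalizing a variable family" is interpreted as z-scoring every window of one variable with a single pooled mean and standard deviation. Function names and sample values are hypothetical.

```python
import math


def trend_slope(values, step_minutes=1.0):
    """Least-squares slope (units per minute) of equally spaced readings."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings to estimate a trend")
    times = [i * step_minutes for i in range(n)]
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    return sxy / sxx


def normalize_family(windows):
    """Z-score all windows of one variable using pooled mean and std dev."""
    pooled = [v for window in windows for v in window]
    mean = sum(pooled) / len(pooled)
    variance = sum((v - mean) ** 2 for v in pooled) / len(pooled)
    sd = math.sqrt(variance)
    if sd == 0:
        # A constant family carries no information; map everything to zero.
        return [[0.0 for _ in window] for window in windows]
    return [[(v - mean) / sd for v in window] for window in windows]


# Hypothetical pulse oximetry readings, one per minute, falling steadily.
spo2_window = [98.0, 97.0, 96.0, 95.0]
spo2_trend = trend_slope(spo2_window)

# Two hypothetical windows of the same variable, normalized together.
normalized = normalize_family([[1.0, 2.0], [3.0, 4.0]])
```

The slope would then enter the model as one derived feature per window, in place of (or alongside) the raw readings; pooling the normalization keeps windows of the same variable on a common scale.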
Abstract:
Even the best school health education programs will be unsuccessful if they are not disseminated effectively in a manner that encourages classroom adoption and implementation. This study involved two components: (1) the development of a videotape intervention to be used in the dissemination phase of a 4-year, NCI-funded diffusion study and (2) the evaluation of that videotape intervention strategy in comparison with a print (information transfer) strategy. Conceptualization has been guided by Social Learning Theory, Diffusion Theory, and communication theory. Additionally, the PRECEDE Framework has been used. Seventh and 8th grade classroom teachers from Spring Branch Independent School District in west Houston participated in the evaluation of the videotape and print interventions using a 57-item preadoption survey instrument developed by the UT Center for Health Promotion Research and Development. Two-way ANOVA was used to study individual score differences for five outcome variables: Total Scale Score (comprised of 57 predisposing, enabling, and reinforcing items), Adoption Characteristics Subscale, Attitude Toward Innovation Subscale, Receptivity Toward Innovation, and Reinforcement Subscale. The aim of the study is to compare the effect upon score differences of video and print interventions alone and in combination. Seventy-three 7th and 8th grade classroom teachers completed the study providing baseline and post-intervention measures on factors related to the adoption and implementation of tobacco-use prevention programs. Two-way ANOVA, in relation to the study questions, found significant scoring differences for those exposed to the videotape intervention alone for both the Attitude Toward Innovation Subscale and the Receptivity to Adopt Subscale. No significant results were found to suggest that print alone influences favorable scoring differences between baseline and post-intervention testing. 
One interaction effect was found, suggesting that video and print combined are more effective for influencing favorable scoring differences for the Reinforcement for the Adoption Subscale. This research is unique in that it represents a newly emerging field in health promotion communications research with implications for Social Learning Theory, Diffusion Theory, and communication science that are applicable to the development of improved school health interventions.
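For readers unfamiliar with how a two-way ANOVA separates main effects from an interaction, the following minimal sketch computes the three F statistics for a balanced design. It assumes equal cell sizes and uses made-up score differences; it does not reproduce the study's data or software, and the function name and factor labels are hypothetical.

```python
def two_way_anova_f(cells):
    """F statistics for a balanced two-way design.

    cells[i][j] is the list of scores for level i of factor A (e.g. video
    no/yes) and level j of factor B (e.g. print no/yes); every cell must
    hold the same number of scores.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    cell_means = [[sum(c) / n for c in row] for row in cells]
    grand = sum(sum(row) for row in cell_means) / (a * b)
    a_means = [sum(row) / b for row in cell_means]
    b_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]

    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_cells = n * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_ab = ss_cells - ss_a - ss_b  # interaction is what the main effects miss
    ss_within = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    ms_within = ss_within / (a * b * (n - 1))
    return {
        "factor_a": (ss_a / (a - 1)) / ms_within,
        "factor_b": (ss_b / (b - 1)) / ms_within,
        "interaction": (ss_ab / ((a - 1) * (b - 1))) / ms_within,
    }


# Hypothetical score differences (post minus baseline), three teachers per cell.
toy_cells = [
    [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]],    # no video: no print, print
    [[4.0, 5.0, 6.0], [8.0, 9.0, 10.0]],   # video:    no print, print
]
f_stats = two_way_anova_f(toy_cells)
```

Each F statistic would be compared against an F distribution with the matching degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch dependency-free.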