2 results for Background Traffic Modeling
in DigitalCommons@The Texas Medical Center
Abstract:
Colorectal cancer is a complex disease that is thought to arise when cells accumulate mutations that allow for uncontrolled growth. There are several recognized mechanisms for generating such mutations in sporadic colon cancer, one of which is chromosomal instability (CIN). One hypothesized driver of CIN in cancer is the improper repair of dysfunctional telomeres. Telomeres comprise the linear ends of chromosomes and play a dual role in cancer. Their length is maintained by the ribonucleoprotein telomerase, which is not normally expressed in somatic cells, so telomeres continuously shorten as cells divide. Critically shortened telomeres are considered dysfunctional because they are recognized as sites of DNA damage, and cells respond by entering replicative senescence or apoptosis, a p53-dependent process that is the mechanism for telomere-induced tumor suppression. Loss of this checkpoint and improper repair of dysfunctional telomeres can initiate a cycle of fusion, bridge and breakage that can lead to chromosomal changes and genomic instability, a process that can transform normal cells into cancer cells. Mouse models of telomere dysfunction are currently based on knocking out the telomerase protein or RNA component; however, the naturally long telomeres of mice require multiple generational crosses of telomerase-null mice to achieve critically short telomeres. Shelterin is a complex of six core proteins that bind specifically to telomeres. Pot1a is a highly conserved member of this complex that specifically binds to the telomeric single-stranded 3’ G-rich overhang. Previous work in our lab has shown that Pot1a is essential for chromosomal end protection, as deletion of Pot1a in murine embryonic fibroblasts (MEFs) leads to open telomere ends that initiate a DNA damage response mediated by ATR, resulting in p53-dependent cellular senescence.
Loss of Pot1a in the background of p53 deficiency results in increased aberrant homologous recombination at telomeres and elevated genomic instability, which allows Pot1a-/-, p53-/- MEFs to form tumors when injected into SCID mice. These phenotypes are similar to those seen in cells with critically shortened telomeres. In this work, we created a mouse model of telomere dysfunction in the gastrointestinal tract through the conditional deletion of Pot1a that recapitulates the microscopic features seen in severe telomere attrition. Combined intestinal loss of Pot1a and p53 led to the formation of invasive adenocarcinomas in the small and large intestines. The tumors formed with long latency and low multiplicity, and had complex genomes due to chromosomal instability, features similar to those seen in sporadic human colorectal cancers. Taken together, we have developed a novel mouse model of intestinal tumorigenesis based on genomic instability driven by telomere dysfunction.
Abstract:
The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU" lays out the theoretical background for the project. There are several core concepts presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrests, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model, and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is in defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. 
The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data follow the standard one-value-per-variable paradigm and are widely employed in a host of clinical models and tools. They are often represented by a number in a given cell of a table. Clinical latent features are derived, rather than directly measured, data elements that represent a particular clinical phenomenon more accurately than any of the directly measured data elements in isolation. The remaining two classes are unique to the time series data elements. The first of these is the raw data elements. These are represented by multiple values per variable, and constitute the measured observations that are typically available to end users when they review time series data. They are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend performing a variety of analyses in order to maximize the likelihood of producing a representation of the time series data elements that can distinguish between two or more classes of outcomes. The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU", provides a detailed description, start to finish, of the methods required to prepare the data and to build and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time series based models are unfeasible due to the relatively large number of data elements and the complexity of the preprocessing that must occur before data can be presented to the model.
Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies of each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances. The final manuscript, entitled: "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit" presents the results that were obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that utilizing the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis. 
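The three preprocessing issues named above (defining a reference time, imputing and reducing time series data to a predefined structure, and normalizing variable families) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the bin-mean reduction, and the carry-forward imputation are assumptions chosen for illustration, and "normalizing a variable family" is read here as scaling every time point of one variable with shared statistics rather than scaling each time point separately.

```python
from statistics import mean, pstdev


def align_to_reference(samples, reference_time, duration, resolution):
    """Reduce raw (timestamp, value) samples to a fixed-length series.

    The window spans `duration` time units ending at `reference_time` and is
    split into bins of `resolution` units. Each bin holds the mean of its
    samples (assumed reduction rule). Empty bins are imputed by carrying the
    last observed value forward; leading empty bins are back-filled with the
    first observed value (assumed imputation rule).
    """
    if duration % resolution != 0:
        raise ValueError("duration must be a multiple of resolution")
    n_bins = duration // resolution
    start = reference_time - duration
    bins = [[] for _ in range(n_bins)]
    for t, v in samples:
        if start <= t < reference_time:
            bins[int((t - start) // resolution)].append(v)
    series = [mean(b) if b else None for b in bins]

    # Forward-fill gaps with the most recent observed bin value.
    last = None
    for i, v in enumerate(series):
        if v is None:
            series[i] = last
        else:
            last = v

    # Back-fill any leading gaps with the first observed bin value.
    first = next((v for v in series if v is not None), None)
    return [first if v is None else v for v in series]


def normalize_family(series_by_patient):
    """Z-score all series of one variable family with pooled statistics."""
    pooled = [v for s in series_by_patient for v in s]
    mu, sigma = mean(pooled), pstdev(pooled)
    if sigma == 0:
        return [[0.0 for _ in s] for s in series_by_patient]
    return [[(v - mu) / sigma for v in s] for s in series_by_patient]
```

Pooling the statistics keeps the shape of each patient's series intact after scaling, which matters if a trend is later computed from the normalized values.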
In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy compared with the baseline multivariate model, but diminished classification accuracy compared with adding only the trend analysis features (i.e., without the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve performance beyond that achieved by excluding the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.
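As an illustration of how a trend analysis result can serve as a latent candidate feature alongside single-value variables, the following Python sketch computes a least-squares slope over a fixed window and appends it to a point-in-time snapshot. The abstract does not state which trend statistic was used, so the slope, the function names, and the feature naming are assumptions, not the authors' method.

```python
def trend_slope(values, resolution=1.0):
    """Ordinary least-squares slope of evenly spaced values, per time unit.

    `resolution` is the spacing between consecutive values.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to estimate a trend")
    times = [i * resolution for i in range(n)]
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    num = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den


def build_features(snapshot, series_by_variable, resolution=1.0):
    """Combine single-value variables with one slope feature per series.

    The raw series values are deliberately left out of the feature set,
    so only the derived trend is added to the snapshot.
    """
    features = dict(snapshot)
    for name, values in series_by_variable.items():
        features[f"{name}_slope"] = trend_slope(values, resolution)
    return features
```

Adding one slope per variable, rather than every raw sample, keeps the ratio of candidate features to class examples low, which is the concern the abstract raises about overfitting.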