Digital Library

16 results for Data-Driven Behavior Modeling

in DigitalCommons@The Texas Medical Center

Pathway Semantics: An Algebraic Data Driven Algorithm to Generate Hypotheses about Molecular Patterns Underlying Disease Progression

Relevance: 100.00%

Abstract:

The overarching goal of the Pathway Semantics Algorithm (PSA) is to improve the in silico identification of clinically useful hypotheses about molecular patterns in disease progression. By framing biomedical questions within a variety of matrix representations, PSA has the flexibility to analyze combined quantitative and qualitative data over a wide range of stratifications. The resulting hypothetical answers can then move to in vitro and in vivo verification, research assay optimization, clinical validation, and commercialization. Herein PSA is shown to generate novel hypotheses about the significant biological pathways in two disease domains: shock / trauma and hemophilia A, and validated experimentally in the latter. The PSA matrix algebra approach identified differential molecular patterns in biological networks over time and outcome that would not be easily found through direct assays, literature or database searches. In this dissertation, Chapter 1 provides a broad overview of the background and motivation for the study, followed by Chapter 2 with a literature review of relevant computational methods. Chapters 3 and 4 describe PSA for node and edge analysis respectively, and apply the method to disease progression in shock / trauma. Chapter 5 demonstrates the application of PSA to hemophilia A and the validation with experimental results. The work is summarized in Chapter 6, followed by extensive references and an Appendix with additional material.

The Open Door Mission: Measuring and predicting outcomes of one community-based substance abuse treatment program

Relevance: 100.00%

Abstract:

The objectives of this study were to identify and measure the average outcomes of the Open Door Mission's nine-month community-based substance abuse treatment program, identify predictors of successful outcomes, and make recommendations to the Open Door Mission for improving its treatment program. The Mission's program is exclusive to adult men who have limited financial resources, most of whom were homeless or dependent on parents or other family members for basic living needs. Many, but not all, of these men are either chemically dependent or have a history of substance abuse. This study tracked a cohort of the Mission's graduates over one year and identified various indicators of success at short-term intervals, which may be predictive of longer-term outcomes. We tracked various levels of 12-step program involvement, as well as other social and spiritual activities, such as church affiliation and recovery support. Twenty-four of the 66 subjects, or 36%, met the Mission's requirements for success. Specific to these success criteria: fifty-four, or 82%, reported affiliation with a home church; twenty-six, or 39%, reported full-time employment; sixty-one, or 92%, did not report or were not identified as having any post-treatment arrests or incarceration; and forty, or 61%, reported continuous abstinence from both drugs and alcohol. Five research-based hypotheses were developed and tested. The primary analysis tool was the web-based non-parametric dependency modeling tool, B-Course, which revealed some strong associations with certain variables, and helped the researchers generate and test several data-driven hypotheses. Full-time employment is the greatest predictor of abstinence: 95% of those who reported full-time employment also reported continuous post-treatment abstinence, while 50% of those working part-time were abstinent and 29% of those with no employment were abstinent.
Working with a 12-step sponsor, attending aftercare, and service with others were identified as predictors of abstinence. This study demonstrates that associations with abstinence and the ODM success criteria are not simply based on one social or behavioral factor. Rather, these relationships are interdependent, and show that abstinence is achieved and maintained through a combination of several 12-step recovery activities. This study used a simple assessment methodology that demonstrated strong associations across variables and outcomes, and these associations have practical applicability to the Open Door Mission for improving its treatment program. By leveraging the predictive capability of the various success determination methodologies discussed and developed throughout this study, we can identify accurate outcomes with both validity and reliability. This assessment instrument can also be used as an intervention that, if put into practice with the Mission’s clients during the primary treatment program, may measurably improve the effectiveness and outcomes of the Open Door Mission.
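The percentages reported in this abstract can be checked against the cohort size of 66. The following is a minimal sketch of that arithmetic; the dictionary keys are illustrative labels chosen here, not terms defined by the study.

```python
# Counts reported in the abstract, out of a cohort of 66 graduates.
COHORT_SIZE = 66

# Key names are hypothetical labels for the reported outcomes.
reported_counts = {
    "met_success_criteria": 24,
    "home_church": 54,
    "full_time_employment": 26,
    "no_arrests_or_incarceration": 61,
    "continuous_abstinence": 40,
}


def percent(count, total=COHORT_SIZE):
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


percentages = {name: percent(count) for name, count in reported_counts.items()}

for name, value in percentages.items():
    print(f"{name}: {value}%")
```

Each rounded value agrees with the figure quoted in the abstract (36%, 82%, 39%, 92%, and 61%).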

TIME SERIES ANALYSIS AS INPUT FOR PREDICTIVE MODELING: PREDICTING CARDIAC ARREST IN A PEDIATRIC INTENSIVE CARE UNIT

Relevance: 100.00%

Abstract:

The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU" lays out the theoretical background for the project. There are several core concepts presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrests, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model, and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is in defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. 
The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data is represented by the standard one-value-per-variable paradigm and is widely employed in a host of clinical models and tools. These are often represented by a number present in a given cell of a table. Clinical latent features are derived, rather than directly measured, data elements that more accurately represent a particular clinical phenomenon than any of the directly measured data elements in isolation. The second two classes are unique to the time series data elements. The first of these is the raw data elements. These are represented by multiple values per variable, and constitute the measured observations that are typically available to end users when they review time series data. These are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend that a variety of analyses be performed in order to maximize the likelihood that a representation of the time series data elements is produced that is able to distinguish between two or more classes of outcomes. The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU," provides a detailed description, start to finish, of the methods required to prepare the data, build, and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time series based models are infeasible due to the relatively large number of data elements and the complexity of preprocessing that must occur before data can be presented to the model. 
Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies of each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances. The final manuscript, entitled: "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit" presents the results that were obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that utilizing the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis. 
In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy as compared to the baseline multivariate model, but diminished classification accuracy as compared to when just the trend analysis features were added (i.e., without adding the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve the performance beyond that which was achieved by exclusion of the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.
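The idea of turning a window of raw observations into a single trend feature can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which trend analysis was used, so a least-squares slope stands in for it here, and the function name, window length, sampling interval, and readings are all hypothetical.

```python
def trend_slope(values, minutes_per_sample=1.0):
    """Least-squares slope of a window of observations, in units per minute."""
    n = len(values)
    times = [i * minutes_per_sample for i in range(n)]
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


# Synthetic 10-minute windows of pulse oximetry readings (not study data).
stable = [98, 97, 98, 98, 97, 98, 98, 97, 98, 98]
deteriorating = [98, 97, 96, 95, 94, 93, 92, 91, 90, 89]

# A single point-in-time value next to a derived trend feature.
features = {
    "spo2_last": deteriorating[-1],             # traditional one-value feature
    "spo2_slope": trend_slope(deteriorating),   # time series analysis feature
}
print(features)
```

The last reading alone does not say whether the patient arrived at that value gradually or has been sitting there; the slope carries that information, which is the kind of deterioration signal the abstract argues a snapshot cannot capture.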

BAYESIAN STATISTICAL METHODS IN GENE-ENVIRONMENT AND GENE-GENE INTERACTION STUDIES

Relevance: 100.00%

Abstract:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to more accurately assess causal effects of many genetic and environmental factors. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies explain only a small portion of the variation in the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics/genomics research and have demonstrated superiority over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert their functionalities and advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for the hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data. 
We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions (epistasis) and gene by environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that the irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both of the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects between interacting factors must be present for the interactions to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identify the predisposing main effects and interactions in the studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model that does not impose this hierarchical constraint and observe their superior performances in most of the considered situations. The proposed models are implemented in the real data analysis of gene and environment interactions in the cases of lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models enjoy the properties of being allowed to incorporate useful prior information in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of the performances on the parameter estimation and variable selection in most cases. 
Our proposed models impose the hierarchical constraints, which further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies investigating the causal factors among a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework, by which the estimates of effects for a quantitative trait are statistically orthogonal regardless of the existence of Hardy-Weinberg Equilibrium (HWE) within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and showed the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects with higher marginal posterior probabilities. Also, we review two Bayesian statistical models (Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they successfully balance statistical efficiency and bias in a unified model. 
Through extensive simulation studies, we compare the operating characteristics of the proposed models with the existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
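The 'strong hierarchical', 'weak hierarchical', and 'independent' constraints described in this abstract can be illustrated with a small inclusion rule. This is a sketch of the constraint logic only, not the Bayesian mixture model itself; the function and argument names are hypothetical.

```python
def interaction_allowed(main_a_included, main_b_included, hierarchy="strong"):
    """Decide whether an A-by-B interaction term may enter the model.

    strong:      both main effects must be in the model
    weak:        at least one main effect must be in the model
    independent: no constraint is placed on the interaction
    """
    if hierarchy == "strong":
        return main_a_included and main_b_included
    if hierarchy == "weak":
        return main_a_included or main_b_included
    if hierarchy == "independent":
        return True
    raise ValueError(f"unknown hierarchy: {hierarchy!r}")


# Example: a gene main effect is included, the environmental one is not.
for rule in ("strong", "weak", "independent"):
    print(rule, interaction_allowed(True, False, rule))
```

With only one main effect present, the interaction is excluded under the strong rule and admitted under the weak and independent rules, which is why the constrained models consider fewer candidate interactions.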

CHARACTERIZING, ASSESSING AND IMPROVING HEALTHCARE REFERRAL COMMUNICATION

Relevance: 100.00%

Abstract:

Similar to other health care processes, referrals are susceptible to breakdowns. These breakdowns in the referral process can lead to poor continuity of care, slow diagnostic processes, delays and repetition of tests, patient and provider dissatisfaction, and a loss of confidence in providers. These facts and the necessity for a deeper understanding of referrals in healthcare served as the motivation to conduct a comprehensive study of referrals. The research began with the real problem and need to understand referral communication as a means to improve patient care. Despite previous efforts to explain referrals and the dynamics and interrelations of the variables that influence referrals, there is not a common, contemporary, and accepted definition of what a referral is in the health care context. The research agenda was guided by the need to explore referrals as an abstract concept by 1) developing a conceptual definition of referrals, 2) developing a model of referrals, and finally 3) proposing a comprehensive research framework. This dissertation has resulted in a standard conceptual definition of referrals and a model of referrals. In addition, a mixed-method framework to evaluate referrals was proposed, and finally a data-driven model was developed to predict whether a referral would be approved or denied by a specialty service. The three manuscripts included in this dissertation present the basis for studying and assessing referrals using a common framework that should allow an easier comparative research agenda to improve referrals, taking into account the context where referrals occur.

The Comal County Needs Assessment Youth Survey

Relevance: 100.00%

Abstract:

A crucial link in preserving and protecting the future of our communities resides in maintaining the health and well-being of our youth. While every member of the community owns an opinion regarding where to best utilize monies for prevention and intervention, the data to support such opinions are often scarce. In an effort to generate data-driven indices for community planning and action, the United Way of Comal County, Texas partnered with the University of Texas - Houston Health Science Center, School of Public Health to accomplish a county-specific needs assessment. A community-based participatory research emphasis utilizing the Mobilizing for Action through Planning and Partnerships (MAPP) format developed by the National Association of County and City Health Officials (NACCHO) was implemented to engage community members in identifying and addressing community priorities. The single greatest area of consensus and concern identified by community members was the health and well-being of the youth population. Thus, a youth survey, targeting these specific areas of community concern, was designed, coordinated and administered to all 9-11th grade students in the county. Twenty percent of the 3,698 completed surveys (72% response rate) were randomly selected for analysis. These 740 surveys were coded and scanned into an electronic survey database. Statistical analysis provided youth-reported data on the status of the multiple issues affecting the health and well-being of the community's youth. These data will be reported back to the community stakeholders, as part of the larger Comal County Needs Assessment, for the purposes of community planning and action. Survey data will provide community planners with an awareness of the high-risk behaviors and habit patterns amongst their youth. This knowledge will permit more effective targeting of the means for encouraging healthy behaviors and preventing the spread of disease. 
Further, the community-oriented, population-based nature of this effort will provide answers to questions raised by the community and will provide an effective launching pad for the development and implementation of targeted, preventive health strategies.
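The sampling step in this abstract (a random 20% of 3,698 completed surveys, giving 740) can be sketched in a few lines. The survey identifiers and the fixed seed are assumptions made for illustration; the abstract does not describe how the random selection was actually carried out.

```python
import random

COMPLETED_SURVEYS = 3698
SAMPLE_FRACTION = 0.20

# 20% of 3,698 is 739.6, which rounds to the 740 surveys reported.
sample_size = round(COMPLETED_SURVEYS * SAMPLE_FRACTION)

# Hypothetical survey identifiers 1..3698; a seeded generator keeps the
# selection reproducible for this illustration.
rng = random.Random(0)
survey_ids = range(1, COMPLETED_SURVEYS + 1)
selected = rng.sample(survey_ids, sample_size)

print(sample_size, len(set(selected)))
```

`random.sample` draws without replacement, so no survey can be selected twice.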

Respiratory rhythm generation: A modeling study of the respiratory central pattern generator and the phrenic motor neuron

Relevance: 50.00%

Abstract:

The respiratory central pattern generator is a collection of medullary neurons that generates the rhythm of respiration. The respiratory central pattern generator feeds phrenic motor neurons, which, in turn, drive the main muscle of respiration, the diaphragm. The purpose of this thesis is to understand the neural control of respiration through mathematical models of the respiratory central pattern generator and phrenic motor neurons. We first designed and validated a Hodgkin-Huxley type model that mimics the behavior of phrenic motor neurons under a wide range of electrical and pharmacological perturbations. This model was constrained by physiological data from the literature. Next, we designed and validated a model of the respiratory central pattern generator by connecting four Hodgkin-Huxley type models of medullary respiratory neurons in a mutually inhibitory network. This network was in turn driven by a simple model of an endogenously bursting neuron, which acted as the pacemaker for the respiratory central pattern generator. Finally, the respiratory central pattern generator and phrenic motor neuron models were connected and their interactions studied. Our study of the models has provided a number of insights into the behavior of the respiratory central pattern generator and phrenic motor neurons. These include the suggestion of a role for the T-type and N-type calcium channels during single spikes and repetitive firing in phrenic motor neurons, as well as a better understanding of network properties underlying respiratory rhythm generation. We also utilized an existing model of lung mechanics to study the interactions between the respiratory central pattern generator and ventilation.
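For readers unfamiliar with Hodgkin-Huxley type models, the sketch below integrates the classic single-compartment formulation with the standard squid-axon parameters. It is not the phrenic motor neuron model from this thesis, which is described as including T-type and N-type calcium channels; the forward-Euler scheme, step size, stimulus values, and function names are choices made here for illustration.

```python
import math

# Classic squid-axon constants (mV, ms, mS/cm^2, uF/cm^2, uA/cm^2).
C_M = 1.0
G_NA, G_K, G_L = 120.0, 36.0, 0.3
E_NA, E_K, E_L = 50.0, -77.0, -54.387


def _ratio(x, scale):
    """x / (1 - exp(-x / scale)), using its limit (scale) near x = 0."""
    if abs(x) < 1e-7:
        return scale
    return x / (1.0 - math.exp(-x / scale))


def gating_rates(v):
    """Opening (alpha) and closing (beta) rates for the m, h and n gates."""
    a_m = 0.1 * _ratio(v + 40.0, 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * _ratio(v + 55.0, 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return (a_m, b_m), (a_h, b_h), (a_n, b_n)


def simulate(i_ext, t_max_ms=50.0, dt_ms=0.01, v0=-65.0):
    """Forward-Euler integration; returns the membrane potential trace in mV."""
    v = v0
    # Start each gate at its steady-state value for the initial voltage.
    m, h, n = (a / (a + b) for a, b in gating_rates(v))
    trace = [v]
    for _ in range(int(round(t_max_ms / dt_ms))):
        (a_m, b_m), (a_h, b_h), (a_n, b_n) = gating_rates(v)
        i_ion = (G_NA * m ** 3 * h * (v - E_NA)
                 + G_K * n ** 4 * (v - E_K)
                 + G_L * (v - E_L))
        v_next = v + dt_ms * (i_ext - i_ion) / C_M
        m += dt_ms * (a_m * (1.0 - m) - b_m * m)
        h += dt_ms * (a_h * (1.0 - h) - b_h * h)
        n += dt_ms * (a_n * (1.0 - n) - b_n * n)
        v = v_next
        trace.append(v)
    return trace


def count_spikes(trace, threshold_mv=0.0):
    """Count upward crossings of the threshold voltage."""
    return sum(1 for a, b in zip(trace, trace[1:]) if a < threshold_mv <= b)


print(count_spikes(simulate(10.0)), count_spikes(simulate(0.0)))
```

A sustained injected current produces repetitive firing, while the unstimulated cell stays near rest. Models of the kind described in the abstract extend this template with additional conductances and with synaptic coupling between neurons.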

Predictive oncology: a review of multidisciplinary, multiscale in silico modeling linking phenotype, morphology and growth.

Relevance: 40.00%

Abstract:

Empirical evidence and theoretical studies suggest that the phenotype, i.e., cellular- and molecular-scale dynamics, including proliferation rate and adhesiveness due to microenvironmental factors and gene expression that govern tumor growth and invasiveness, also determine gross tumor-scale morphology. It has been difficult to quantify the relative effect of these links on disease progression and prognosis using conventional clinical and experimental methods and observables. As a result, successful individualized treatment of highly malignant and invasive cancers, such as glioblastoma, via surgical resection and chemotherapy cannot be offered and outcomes are generally poor. What is needed is a deterministic, quantifiable method to enable understanding of the connections between phenotype and tumor morphology. Here, we critically assess advantages and disadvantages of recent computational modeling efforts (e.g., continuum, discrete, and cellular automata models) that have pursued this understanding. Based on this assessment, we review a multiscale, i.e., from the molecular to the gross tumor scale, mathematical and computational "first-principle" approach based on mass conservation and other physical laws, such as employed in reaction-diffusion systems. Model variables describe known characteristics of tumor behavior, and parameters and functional relationships across scales are informed from in vitro, in vivo and ex vivo biology. We review the feasibility of this methodology that, once coupled to tumor imaging and tumor biopsy or cell culture data, should enable prediction of tumor growth and therapy outcome through quantification of the relation between the underlying dynamics and morphological characteristics. 
In particular, morphologic stability analysis of this mathematical model reveals that tumor cell patterning at the tumor-host interface is regulated by cell proliferation, adhesion and other phenotypic characteristics: histopathology information of tumor boundary can be inputted to the mathematical model and used as a phenotype-diagnostic tool to predict collective and individual tumor cell invasion of surrounding tissue. This approach further provides a means to deterministically test effects of novel and hypothetical therapy strategies on tumor behavior.

MODELING SPORADIC TUMOR FORMATION DRIVEN BY TELOMERE DYSFUNCTION IN THE GASTROINTESTINAL TRACT

Relevance: 40.00%

Abstract:

Colorectal cancer is a complex disease that is thought to arise when cells accumulate mutations that allow for uncontrolled growth. There are several recognized mechanisms for generating such mutations in sporadic colon cancer, one of which is chromosomal instability (CIN). One hypothesized driver of CIN in cancer is the improper repair of dysfunctional telomeres. Telomeres comprise the linear ends of chromosomes and play a dual role in cancer. Their length is maintained by the ribonucleoprotein telomerase, which is not normally expressed in somatic cells, and as cells divide, telomeres continuously shorten. Critically shortened telomeres are considered dysfunctional as they are recognized as sites of DNA damage, and cells respond by entering into replicative senescence or apoptosis, a process that is p53-dependent and the mechanism for telomere-induced tumor suppression. Loss of this checkpoint and improper repair of dysfunctional telomeres can initiate a cycle of fusion, bridge and breakage that can lead to chromosomal changes and genomic instability, a process that can lead to transformation of normal cells to cancer cells. Mouse models of telomere dysfunction are currently based on knocking out the telomerase protein or RNA component; however, the naturally long telomeres of mice require multiple generational crosses of telomerase null mice to achieve critically short telomeres. Shelterin is a complex of six core proteins that bind to telomeres specifically. Pot1a is a highly conserved member of this complex that specifically binds to the telomeric single-stranded 3’ G-rich overhang. Previous work in our lab has shown that Pot1a is essential for chromosomal end protection, as deletion of Pot1a in murine embryonic fibroblasts (MEFs) leads to open telomere ends that initiate a DNA damage response mediated by ATR, resulting in p53-dependent cellular senescence. 
Loss of Pot1a in the background of p53 deficiency results in increased aberrant homologous recombination at telomeres and elevated genomic instability, which allows Pot1a-/-, p53-/- MEFs to form tumors when injected into SCID mice. These phenotypes are similar to those seen in cells with critically shortened telomeres. In this work, we created a mouse model of telomere dysfunction in the gastrointestinal tract through the conditional deletion of Pot1a that recapitulates the microscopic features seen in severe telomere attrition. Combined intestinal loss of Pot1a and p53 led to formation of invasive adenocarcinomas in the small and large intestines. The tumors formed with long latency and low multiplicity, and had complex genomes due to chromosomal instability, features similar to those seen in sporadic human colorectal cancers. Taken together, we have developed a novel mouse model of intestinal tumorigenesis based on genomic instability driven by telomere dysfunction.

Adolescent Sexual Behavior: Examining Data from Texas and the US

Relevance: 40.00%

Abstract:

Background: The US has higher rates of teen births and sexually transmitted infections (STI) than other developed countries. Texas youth are disproportionately impacted. Purpose: To review local, state, and national data on teens’ engagement in sexual risk behaviors to inform policy and practice related to teen sexual health. Methods: 2009 middle school and high school Youth Risk Behavior Survey (YRBS) data, and data from All About Youth, a middle school study conducted in a large urban school district in Texas, were analyzed to assess the prevalence of sexual initiation, including the initiation of non-coital sex, and the prevalence of sexual risk behaviors among Texas and US youth. Results: A substantial proportion of middle and high school students are having sex. Sexual initiation begins as early as 6th grade and increases steadily through 12th grade with almost two-thirds of high school seniors being sexually experienced. Many teens are not protecting themselves from unintended pregnancy or STIs – nationally, 80% and 39% of high school students did not use birth control pills or a condom respectively the last time they had sex. Many middle and high school students are engaging in oral and anal sex, two behaviors which increase the risk of contracting an STI and HIV. In Texas, an estimated 689,512 out of 1,327,815 public high school students are sexually experienced – over half (52%) of the total high school population. Texas students surpass their US peers in several sexual risk behaviors including number of lifetime sexual partners, being currently sexually active, and not using effective methods of birth control or dual protection when having sex. They are also less likely to receive HIV/AIDS education in school. 
Conclusion: Changes in policy and practice, including implementation of evidence-based sex education programs in middle and high schools and increased access to integrated, teen-friendly sexual and reproductive health services, are urgently needed at the state and national levels to address these issues effectively.

A LAPLACE TRANSFORM PAIR MODEL TO DETERMINE BREMSSTRAHLUNG SPECTRA FROM ATTENUATION DATA (RADIATION, X-RAY, SPECTRAL MODELING)

Relevance: 40.00%

Abstract:

Every x-ray attenuation curve inherently contains all the information necessary to extract the complete energy spectrum of a beam. To date, attempts to obtain accurate spectral information from attenuation data have been inadequate. This investigation presents a mathematical pair model, grounded in physical reality by the Laplace Transformation, to describe the attenuation of a photon beam and the corresponding bremsstrahlung spectral distribution. In addition, the Laplace model has been mathematically extended to include characteristic radiation in a physically meaningful way. A method to determine the fraction of characteristic radiation in any diagnostic x-ray beam was introduced for use with the extended model. This work has examined the reconstructive capability of the Laplace pair model for photon beams ranging from 50 kVp to 25 MV, using both theoretical and experimental methods. In the diagnostic region, excellent agreement between a wide variety of experimental spectra and those reconstructed with the Laplace model was obtained when the atomic composition of the attenuators was accurately known. The model successfully reproduced a 2 MV spectrum but demonstrated difficulty in accurately reconstructing orthovoltage and 6 MV spectra. The 25 MV spectrum was successfully reconstructed, although poor agreement with the spectrum obtained by Levy was found. The analysis of errors, performed with diagnostic energy data, demonstrated the relative insensitivity of the model to typical experimental errors and confirmed that the model can be successfully used to theoretically derive accurate spectral information from experimental attenuation data.
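The link between an attenuation curve and a spectrum can be illustrated with a discrete sketch: for a beam made up of components with different attenuation coefficients, the transmitted fraction is a weighted sum of exponentials in attenuator thickness, which is a discrete analogue of a Laplace transform of the spectral weights. This is not the dissertation's specific pair model; the weights and attenuation coefficients below are made-up illustrative values.

```python
import math


def transmission(thickness_cm, weights, mu_per_cm):
    """Transmitted fraction of a polyenergetic beam through an attenuator.

    Each spectral component i carries a fraction weights[i] of the beam and
    is attenuated exponentially with its own coefficient mu_per_cm[i].
    """
    return sum(w * math.exp(-mu * thickness_cm)
               for w, mu in zip(weights, mu_per_cm))


# Hypothetical three-component spectrum; weights sum to 1.
weights = [0.2, 0.5, 0.3]
mu_per_cm = [1.2, 0.6, 0.3]   # softer components attenuate faster

thicknesses = [0.0, 0.5, 1.0, 2.0, 4.0]
curve = [transmission(x, weights, mu_per_cm) for x in thicknesses]

for x, t in zip(thicknesses, curve):
    print(f"{x:4.1f} cm -> {t:.4f}")
```

Recovering the weights from a measured curve is the inverse problem the abstract addresses; the forward calculation above only shows why the curve encodes the spectrum.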

Bayesian joint modeling of longitudinal and survival data

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The joint modeling of longitudinal and survival data is a new approach to many applications such as HIV, cancer vaccine trials and quality of life studies. There have been recent methodological developments for each component of the joint model as well as for the statistical processes that link them together. Among these, second-order polynomial random effect models and linear mixed effects models are the most commonly used for the longitudinal trajectory function. In this study, we first relax the parametric constraints for polynomial random effect models by using Dirichlet process priors, and consider three longitudinal markers rather than only one in a single joint model. Second, we use a linear mixed effect model for the longitudinal process in a joint model analyzing the three markers. These methods were applied to the Primary Biliary Cirrhosis sequential data, which were collected from a clinical trial of primary biliary cirrhosis (PBC) of the liver conducted between 1974 and 1984 at the Mayo Clinic. The effects of three longitudinal markers on patients' survival were investigated: (1) total serum bilirubin, (2) serum albumin and (3) serum glutamic-oxaloacetic transaminase (SGOT). The proportion of treatment effect was also studied using the proposed joint modeling approaches. Based on the results, we conclude that the proposed modeling approaches yield a better fit to the data and give less biased parameter estimates for these trajectory functions than previous methods. Model fit is also improved by considering three longitudinal markers instead of only one. The analysis of the proportion of treatment effect from these joint models leads to the same conclusion as the final model of Fleming and Harrington (1991), namely that bilirubin and albumin together have a stronger impact in predicting patients' survival and serve as surrogate endpoints for treatment.
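As a rough illustration of how a longitudinal trajectory can be linked to survival, the sketch below uses a common shared-parameter form in which the hazard depends on the current marker value. The constant baseline hazard `h0`, the association parameter `alpha`, and the coefficient values are hypothetical; the dissertation's Dirichlet process priors and three-marker structure are not reproduced here.

```python
import math

def trajectory(t, b):
    """Second-order polynomial marker trajectory m(t) = b0 + b1*t + b2*t^2."""
    return b[0] + b[1] * t + b[2] * t * t

def survival(t, b, h0, alpha, steps=2000):
    """S(t) = exp(-integral_0^t h0 * exp(alpha * m(s)) ds), trapezoidal rule.

    Assumes a constant baseline hazard h0 (an illustrative simplification).
    """
    if t <= 0:
        return 1.0
    h = t / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * h0 * math.exp(alpha * trajectory(i * h, b))
    return math.exp(-total * h)

if __name__ == "__main__":
    # Hypothetical subject-specific coefficients (fixed plus random effects).
    coeffs = (1.0, 0.2, 0.01)
    for years in (1.0, 5.0, 10.0):
        print(years, survival(years, coeffs, h0=0.05, alpha=0.5))
```

With several markers, one natural extension is to sum one `alpha * m(t)` term per marker inside the exponent.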

Integrating sequence information in microarray data analysis by free energy modeling

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Microarray technology is a high-throughput method for genotyping and gene expression profiling. Limited sensitivity and specificity are among the essential problems of this technology. Most existing methods of microarray data analysis have an apparent limitation: they deal only with the numerical part of microarray data and make little use of gene sequence information. Because it is the gene sequences that precisely define the physical objects being measured by a microarray, it is natural to make the gene sequences an essential part of the data analysis. This dissertation focused on the development of free energy models to integrate sequence information in microarray data analysis. The models were used to characterize the mechanism of hybridization on microarrays and to enhance the sensitivity and specificity of microarray measurements. Cross-hybridization is a major obstacle to the sensitivity and specificity of microarray measurements. In this dissertation, we evaluated the scope of the cross-hybridization problem on short-oligo microarrays. The results showed that cross-hybridization on arrays is mostly caused by oligo fragments with a run of 10 to 16 nucleotides complementary to the probes. Furthermore, a free-energy based model was proposed to quantify the amount of cross-hybridization signal on each probe. This model treats cross-hybridization as an integral effect of the interactions between a probe and various off-target oligo fragments. Using public spike-in datasets, the model showed high accuracy in predicting the cross-hybridization signals on those probes whose intended targets are absent in the sample. Several prospective models were proposed to improve the Positional Dependent Nearest-Neighbor (PDNN) model for better quantification of gene expression and cross-hybridization. The problem addressed in this dissertation is fundamental to microarray technology.
We expect that this study will help us to understand the detailed mechanism that determines sensitivity and specificity on microarrays. Consequently, this research will have a wide impact on how microarrays are designed and how the data are interpreted.
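The finding about complementary runs of 10 to 16 nucleotides can be made concrete with a small sketch that measures the longest stretch of a fragment that is perfectly complementary to a probe. This is only a sequence-matching illustration with made-up sequences; it does not implement the free-energy model or PDNN, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def longest_complementary_run(probe, fragment):
    """Length of the longest contiguous stretch of `fragment` that pairs with `probe`.

    Computed as the longest common substring between the probe and the
    reverse complement of the fragment (dynamic programming, O(len*len)).
    """
    target = reverse_complement(fragment)
    best = 0
    prev = [0] * (len(target) + 1)
    for p in probe:
        curr = [0] * (len(target) + 1)
        for j, t in enumerate(target, start=1):
            if p == t:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

if __name__ == "__main__":
    probe = "ACGTTGCAAGGCTTACGGATCCATG"  # made-up 25-mer
    fragment = "TT" + reverse_complement(probe[5:17]) + "TT"  # made-up off-target
    run = longest_complementary_run(probe, fragment)
    print(run, "candidate cross-hybridizer" if 10 <= run <= 16 else "unlikely")
```

A screen like this could flag candidate off-target fragments before a thermodynamic model is used to weigh how strongly each one binds.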

Modeling the survival of T-cell lymphocytes using compound branching processes

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Radiotherapy has been a method of choice in cancer treatment for a number of years. Mathematical modeling is an important tool in studying the survival behavior of any cell as well as its radiosensitivity. One particular cell under investigation is the normal T-cell, the radiosensitivity of which may be indicative of the patient's tolerance to radiation doses. The model derived is a compound branching process with a random initial population of T-cells that is assumed to have a compound distribution. T-cells in any generation are assumed to double or die at random lengths of time. This population is assumed to undergo a random number of generations within a period of time. The model is then used to obtain an estimate of the survival probability of T-cells for the data under investigation. This estimate is derived iteratively by applying the likelihood principle. The validity of the model is further assessed by simulating a number of subjects under this model. This study shows that there is a great deal of variation in T-cell survival from one individual to another. These variations can be observed under normal conditions as well as under radiotherapy. The findings are in agreement with a recent study and show that genetic diversity plays a role in determining the survival of T-cells.
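A stripped-down version of the double-or-die mechanism can be sketched with a discrete-generation branching process. The sketch assumes each cell independently doubles with probability `p_double` or dies otherwise, and that the initial population is Poisson with mean `mean_initial`; both the Poisson choice and the fixed number of generations are simplifying assumptions, not the compound distributions or random lifetimes used in the dissertation.

```python
import math

def extinction_by_generation(p_double, generations):
    """Probability that the line of one cell has died out by a given generation.

    Iterates the offspring generating function f(s) = (1 - p) + p * s**2,
    starting from q_0 = 0.
    """
    q = 0.0
    for _ in range(generations):
        q = (1.0 - p_double) + p_double * q * q
    return q

def population_survival(p_double, generations, mean_initial):
    """Probability that at least one cell line survives.

    Assumes a Poisson(mean_initial) number of founder cells, so the
    population extinction probability is exp(-mean_initial * (1 - q)).
    """
    q = extinction_by_generation(p_double, generations)
    return 1.0 - math.exp(-mean_initial * (1.0 - q))

if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for n in (1, 5, 20):
        print(n, population_survival(p_double=0.6, generations=n, mean_initial=3.0))
```

For `p_double` above 0.5 the single-line extinction probability converges to `(1 - p_double) / p_double`, which gives a quick sanity check on the iteration.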

Mixture modeling for joint analysis of survival, discrete, and continuous data

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Mixture modeling is commonly used to model categorical latent variables that represent subpopulations in which population membership is unknown but can be inferred from the data. In recent years, finite mixture models have been applied to time-to-event data. However, the commonly used survival mixture model assumes that the effects of the covariates involved in failure times differ across latent classes, but that the covariate distribution is homogeneous. The aim of this dissertation is to develop a method to examine time-to-event data in the presence of unobserved heterogeneity under a framework of mixture modeling. A joint model is developed to incorporate the latent survival trajectory along with the observed information for the joint analysis of a time-to-event variable, its discrete and continuous covariates, and a latent class variable. It is assumed that both the effects of covariates on survival times and the distribution of covariates vary across latent classes. The unobservable survival trajectories are identified by estimating the probability that a subject belongs to a particular class based on observed information. We applied this method to a Hodgkin lymphoma study with long-term follow-up and observed four distinct latent classes in terms of long-term survival and distributions of prognostic factors. Our results from simulation studies and from the Hodgkin lymphoma study demonstrated the superiority of our joint model over the conventional survival model. This flexible inference method provides more accurate estimation and accommodates unobservable heterogeneity among individuals while taking complex interactions between covariates into consideration.
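The step of estimating the probability that a subject belongs to a particular class can be illustrated with Bayes' rule. The sketch below assumes, purely for illustration, exponential survival times, one normally distributed continuous covariate, and one binary covariate, all with class-specific parameters; the class definitions and parameter names are hypothetical and are not the model fitted to the Hodgkin lymphoma data.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def class_likelihood(time, event, x, z, cls):
    """Class weight times the joint likelihood of the observed data.

    time/event: follow-up time and event indicator (False means censored).
    x: continuous covariate, z: binary covariate (0 or 1).
    """
    rate = cls["rate"]
    surv_part = rate * math.exp(-rate * time) if event else math.exp(-rate * time)
    cont_part = normal_pdf(x, cls["mean"], cls["sd"])
    disc_part = cls["p"] if z else 1.0 - cls["p"]
    return cls["weight"] * surv_part * cont_part * disc_part

def posterior_membership(time, event, x, z, classes):
    """Posterior probability of each latent class for one subject."""
    liks = [class_likelihood(time, event, x, z, c) for c in classes]
    total = sum(liks)
    return [lik / total for lik in liks]

if __name__ == "__main__":
    # Two hypothetical classes: short survivors and long survivors.
    classes = [
        {"weight": 0.5, "rate": 0.5, "mean": 0.0, "sd": 1.0, "p": 0.8},
        {"weight": 0.5, "rate": 0.05, "mean": 2.0, "sd": 1.0, "p": 0.2},
    ]
    print(posterior_membership(time=10.0, event=False, x=2.0, z=0, classes=classes))
```

In a full analysis these posterior probabilities would typically be recomputed inside an iterative fitting procedure as the class-specific parameters are updated.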
