11 results for markov chain model
in DigitalCommons@The Texas Medical Center
Abstract:
In this dissertation, we propose a continuous-time Markov chain model to examine longitudinal data with a three-category outcome variable. The advantage of this model is that it permits a different number of measurements for each subject, and the interval between two consecutive measurements can be irregular. Using the maximum likelihood principle, we can estimate the transition probability between two time points. By using the information provided by the independent variables, this model can also estimate the transition probability for each subject. The Monte Carlo simulation method will be used to investigate the goodness of model fit compared with that obtained from other models. A public health example will be used to demonstrate the application of this method.
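As a minimal sketch of how transition probabilities over irregular intervals follow from a continuous-time Markov chain, the snippet below computes P(t) = exp(Qt) for a three-state generator matrix Q. The rates in Q and all function names are illustrative assumptions, not values or code from the dissertation.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def transition_matrix(q, t, squarings=10, terms=20):
    """Return P(t) = exp(Q t) using scaling and squaring with a Taylor series."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        # term holds (Q * scale)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical generator for a three-category outcome: off-diagonal entries
# are transition rates and each row sums to zero.
Q = [[-0.30, 0.20, 0.10],
     [0.10, -0.25, 0.15],
     [0.05, 0.05, -0.10]]

# Two subjects measured at different gaps simply use different values of t.
P_short = transition_matrix(Q, 0.5)
P_long = transition_matrix(Q, 2.0)
```

Because P(t) depends only on the elapsed time t, unequal gaps between visits need no special handling, which is the property the abstract highlights.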
Abstract:
This paper reports a comparison of three modeling strategies for the analysis of hospital mortality in a sample of general medicine inpatients in a Department of Veterans Affairs medical center. Logistic regression, a Markov chain model, and longitudinal logistic regression were evaluated on predictive performance, as measured by the c-index, and on the accuracy of expected numbers of deaths compared with observed deaths. The logistic regression used patient information collected at admission; the Markov model comprised two absorbing states for discharge and death and three transient states reflecting increasing severity of illness, as measured by laboratory data collected during the hospital stay; the longitudinal regression employed Generalized Estimating Equations (GEE) to model the covariance structure for the repeated binary outcome. Results showed that the logistic regression predicted hospital mortality as well as the alternative methods but was limited in scope of application. The Markov chain provides insights into how day-to-day changes in illness severity lead to discharge or death. The longitudinal logistic regression showed that an increasing illness trajectory is associated with hospital mortality. We conclude that for standard applications in modeling hospital mortality, logistic regression is adequate, but for new challenges facing health services research today, alternative methods are equally predictive, practical, and can provide new insights.
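The structure described above (three transient severity states, two absorbing states) can be sketched with the standard absorbing-chain result B = (I - Q)^(-1) R, which gives the probability of ending in each absorbing state. The daily transition probabilities below are hypothetical placeholders, not estimates from the paper, and `solve` is a helper written for this illustration.

```python
def solve(a, b):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]


# Hypothetical daily transition probabilities (assumed for illustration).
# Transient states: mild, moderate, severe illness.
Q = [[0.60, 0.15, 0.05],
     [0.20, 0.55, 0.15],
     [0.05, 0.25, 0.55]]
# Absorbing states: discharge, death.
R = [[0.19, 0.01],
     [0.07, 0.03],
     [0.02, 0.13]]

n = len(Q)
I_minus_Q = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)]
             for i in range(n)]

# Row i: probability of eventual discharge or death when starting in state i.
absorption = solve(I_minus_Q, R)
# Expected number of days spent in hospital before absorption.
expected_days = [row[0] for row in solve(I_minus_Q, [[1.0] for _ in range(n)])]
```

Each row of `absorption` sums to one, since every stay eventually ends in discharge or death.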
Abstract:
This study investigates a theoretical model in which a longitudinal process, namely a stationary Markov chain, and a Weibull survival process share a bivariate random effect. Furthermore, a quality-of-life adjusted survival is calculated as the weighted sum of survival time. Theoretical values of the population mean adjusted survival of the described model are computed numerically. The parameters of the bivariate random effect significantly affect the theoretical values of the population mean. Maximum-likelihood and Bayesian methods are applied to simulated data to estimate the model parameters. Based on the parameter estimates, the predicted population mean adjusted survival can then be calculated numerically and compared with the theoretical values. The Bayesian and maximum-likelihood methods provide parameter estimates and population mean predictions with comparable accuracy; however, the Bayesian method suffers from poor convergence due to autocorrelation and inter-variable correlation.
Abstract:
The discrete-time Markov chain is commonly used to describe changes of health states for chronic diseases in a longitudinal study. Statistical inferences on comparing treatment effects or on finding determinants of disease progression usually require estimation of transition probabilities. In many situations, when the outcome data have some missing observations or the variable of interest (called a latent variable) cannot be measured directly, the estimation of transition probabilities becomes more complicated. In the latter case, a surrogate variable that is easier to access and can gauge the characteristics of the latent one is usually used for data analysis.
This dissertation research proposes methods to analyze longitudinal data (1) that have a categorical outcome with missing observations or (2) that use complete or incomplete surrogate observations to analyze the categorical latent outcome. For (1), different missing-data mechanisms were considered for empirical studies using methods that include the EM algorithm, Monte Carlo EM, and a procedure that is not a data augmentation method. For (2), the hidden Markov model with the forward-backward procedure was applied for parameter estimation. This method was also extended to cover the computation of standard errors. The proposed methods were demonstrated using a schizophrenia example. The relevance to public health, the strengths and limitations, and possible future research were also discussed.
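To illustrate the forward-backward procedure mentioned for case (2), here is a minimal sketch for a discrete hidden Markov model with a latent health state and a surrogate observation. The two-state setup, all probabilities, and the observation sequence are assumptions made for the example; the dissertation's actual model and data are not reproduced here.

```python
def forward_backward(init, trans, emit, obs):
    """Scaled forward-backward pass for a discrete hidden Markov model.

    Returns the posterior probability of each latent state at each time
    point and the likelihood of the observed surrogate sequence.
    """
    n, T = len(init), len(obs)
    alpha, scales = [], []
    # Forward pass, rescaling at each step to avoid numerical underflow.
    cur = [init[i] * emit[i][obs[0]] for i in range(n)]
    for t in range(T):
        if t > 0:
            cur = [emit[j][obs[t]] *
                   sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
                   for j in range(n)]
        c = sum(cur)
        scales.append(c)
        alpha.append([x / c for x in cur])
    # Backward pass, using the same scaling constants.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(n)) / scales[t + 1]
                   for i in range(n)]
    posterior = [[alpha[t][i] * beta[t][i] for i in range(n)]
                 for t in range(T)]
    likelihood = 1.0
    for c in scales:
        likelihood *= c
    return posterior, likelihood


# Hypothetical parameters: two latent states, three surrogate categories.
INIT = [0.7, 0.3]
TRANS = [[0.9, 0.1],
         [0.2, 0.8]]
EMIT = [[0.6, 0.3, 0.1],
        [0.1, 0.3, 0.6]]
OBS = [0, 1, 2, 2]  # surrogate categories observed at four visits

posterior, likelihood = forward_backward(INIT, TRANS, EMIT, OBS)
```

In an EM fit, these posteriors would feed the update of the transition and emission probabilities; that step is omitted to keep the sketch short.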
Abstract:
A multivariate frailty hazard model is developed for joint modeling of three correlated time-to-event outcomes: (1) local recurrence, (2) distant recurrence, and (3) overall survival. The term frailty is introduced to model population heterogeneity. The dependence is modeled by conditioning on a shared frailty that is included in the three hazard functions. Independent variables can be included in the model as covariates. Markov chain Monte Carlo methods are used to estimate the posterior distributions of model parameters. The algorithm used in the present application is the hybrid Metropolis-Hastings algorithm, which simultaneously updates all parameters using evaluations of the gradient of the log posterior density. The performance of this approach is examined in simulation studies using Exponential and Weibull distributions. We apply the proposed methods to a study of patients with soft tissue sarcoma, which motivated this research. Our results indicate that patients who received chemotherapy had better overall survival, with a hazard ratio of 0.242 (95% CI: 0.094 - 0.564), and a lower risk of distant recurrence, with a hazard ratio of 0.636 (95% CI: 0.487 - 0.860), but no significantly lower risk of local recurrence, with a hazard ratio of 0.799 (95% CI: 0.575 - 1.054). The advantages and limitations of the proposed models and future research directions are discussed.
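The idea of updating all parameters at once with help from the gradient of the log posterior can be sketched as follows. This example uses a Metropolis-adjusted Langevin step, one gradient-based Metropolis-Hastings variant, as a stand-in; it is not necessarily the hybrid algorithm used in the dissertation. The target is a toy two-parameter Gaussian posterior, and `MU`, `SD`, and the step size are assumptions chosen for illustration.

```python
import math
import random


def log_post(x, mu, sd):
    """Log density (up to a constant) of a toy independent Gaussian posterior."""
    return -0.5 * sum(((xi - m) / s) ** 2 for xi, m, s in zip(x, mu, sd))


def grad_log_post(x, mu, sd):
    return [-(xi - m) / s ** 2 for xi, m, s in zip(x, mu, sd)]


def log_q(to, frm, eps, mu, sd):
    """Log proposal density (up to a constant) of moving from `frm` to `to`."""
    g = grad_log_post(frm, mu, sd)
    return -sum((t - f - 0.5 * eps ** 2 * gi) ** 2
                for t, f, gi in zip(to, frm, g)) / (2 * eps ** 2)


def mala(n_iter, eps, mu, sd, seed=0):
    """Metropolis-adjusted Langevin sampler; all parameters move together."""
    rng = random.Random(seed)
    x = [0.0] * len(mu)
    samples, accepted = [], 0
    for _ in range(n_iter):
        g = grad_log_post(x, mu, sd)
        # Drift along the gradient, then add Gaussian noise.
        prop = [xi + 0.5 * eps ** 2 * gi + eps * rng.gauss(0.0, 1.0)
                for xi, gi in zip(x, g)]
        log_ratio = (log_post(prop, mu, sd) - log_post(x, mu, sd)
                     + log_q(x, prop, eps, mu, sd)
                     - log_q(prop, x, eps, mu, sd))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x, accepted = prop, accepted + 1
        samples.append(x[:])
    return samples, accepted / n_iter


MU, SD = [1.0, -2.0], [1.0, 0.5]  # hypothetical posterior means and scales
samples, acc_rate = mala(20000, 0.4, MU, SD)
kept = samples[2000:]  # discard burn-in
means = [sum(s[d] for s in kept) / len(kept) for d in range(2)]
```

The Metropolis-Hastings correction in `log_ratio` accounts for the asymmetric, gradient-shifted proposal, so the chain still targets the intended posterior.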
Abstract:
Breast cancer is the most common non-skin cancer and the second leading cause of cancer-related death in women in the United States. Studies on ipsilateral breast tumor relapse (IBTR) status and disease-specific survival will help guide clinical treatment and predict patient prognosis.
After breast conservation therapy, patients with breast cancer may experience breast tumor relapse. This relapse is classified into two distinct types: true local recurrence (TR) and new ipsilateral primary tumor (NP). However, the methods used to classify the relapse types are imperfect and are prone to misclassification. In addition, some observed survival data (e.g., time to relapse and time from relapse to death) are strongly correlated with relapse types. The first part of this dissertation presents a Bayesian approach to (1) modeling the potentially misclassified relapse status and the correlated survival information, (2) estimating the sensitivity and specificity of the diagnostic methods, and (3) quantifying the covariate effects on event probabilities. A shared frailty was used to account for the within-subject correlation between survival times. The inference was conducted using a Bayesian framework via Markov chain Monte Carlo simulation implemented in the software WinBUGS. Simulation was used to validate the Bayesian method and assess its frequentist properties. The new model has two important innovations: (1) it utilizes the additional survival times correlated with the relapse status to improve the parameter estimation, and (2) it provides tools to address the correlation between the two diagnostic methods conditional on the true relapse types.
Prediction of patients at highest risk for IBTR after local excision of ductal carcinoma in situ (DCIS) remains a clinical concern. The goals of the second part of this dissertation were to evaluate a published nomogram from Memorial Sloan-Kettering Cancer Center, to determine the risk of IBTR in patients with DCIS treated with local excision, and to determine whether there is a subset of patients at low risk of IBTR. Patients who had undergone local excision from 1990 through 2007 at MD Anderson Cancer Center with a final diagnosis of DCIS (n = 794) were included in this part. Clinicopathologic factors and the performance of the Memorial Sloan-Kettering Cancer Center nomogram for prediction of IBTR were assessed for 734 patients with complete data. The nomogram's predictions of 5- and 10-year IBTR probabilities were found to demonstrate imperfect calibration and discrimination, with an area under the receiver operating characteristic curve of 0.63 and a concordance index of 0.63. In conclusion, predictive models for IBTR in DCIS patients treated with local excision are imperfect. Our current ability to accurately predict recurrence based on clinical parameters is limited.
The American Joint Committee on Cancer (AJCC) staging of breast cancer is widely used to determine prognosis, yet survival within each AJCC stage shows wide variation and remains unpredictable. For the third part of this dissertation, biologic markers were hypothesized to be responsible for some of this variation, and the addition of biologic markers to current AJCC staging was examined for its potential to improve prognostication. The initial cohort included patients treated with surgery as the first intervention at MDACC from 1997 to 2006. Cox proportional hazards models were used to create prognostic scoring systems. AJCC pathologic staging parameters and biologic tumor markers were investigated to devise the scoring systems. Surveillance Epidemiology and End Results (SEER) data were used as the external cohort to validate the scoring systems. Binary indicators for pathologic stage (PS), estrogen receptor status (E), and tumor grade (G) were summed to create PS+EG scoring systems devised to predict 5-year patient outcomes. These scoring systems facilitated separation of the study population into more refined subgroups than the current AJCC staging system. The ability of the PS+EG score to stratify outcomes was confirmed in both internal and external validation cohorts. The current study proposes and validates a new staging system by incorporating tumor grade and ER status into current AJCC staging. We recommend that biologic markers be incorporated into revised versions of the AJCC staging system for patients receiving surgery as the first intervention.
Chapter 1 focuses on developing a Bayesian method to handle misclassified relapse status and its application to breast cancer data. Chapter 2 focuses on evaluation of a breast cancer nomogram for predicting risk of IBTR in patients with DCIS after local excision and gives the statement of the problem in clinical research. Chapter 3 focuses on validation of a novel staging system for disease-specific survival in patients with breast cancer treated with surgery as the first intervention.
Abstract:
The tobacco-specific nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is a known carcinogen for lung cancer. Since the cytokinesis-blocked micronucleus (CBMN) assay has been found to be extremely sensitive to NNK-induced genetic damage, it is a potentially important factor for predicting lung cancer risk. However, the association between lung cancer and NNK-induced genetic damage measured by the CBMN assay has not been rigorously examined.
This research develops a methodology to model the chromosomal changes under NNK-induced genetic damage in a logistic regression framework in order to predict the occurrence of lung cancer. Since these chromosomal changes were usually not observed for very long due to laboratory cost and time, a resampling technique was applied to generate the Markov chain of the normal and the damaged cell for each individual. A joint likelihood between the resampled Markov chains and the logistic regression model, which includes the transition probabilities of this chain as covariates, was established. Maximum likelihood estimation was applied to carry out the statistical test for comparison. The ability of this approach to increase the discriminating power to predict lung cancer was compared with a baseline "non-genetic" model.
Our method offered an option to understand the association between the dynamic cell information and lung cancer. Our study indicated that the extent of DNA damage/non-damage measured by the CBMN assay provides critical information that impacts public health studies of lung cancer risk. This novel statistical method could simultaneously estimate the process of DNA damage/non-damage and its relationship with lung cancer for each individual.
Abstract:
Complete NotI, SfiI, XbaI and BlnI cleavage maps of Escherichia coli K-12 strain MG1655 were constructed. Techniques used included: CHEF pulsed-field gel electrophoresis; transposon mutagenesis; fragment hybridization to the ordered λ library of Kohara et al.; fragment and cosmid hybridization to Southern blots; correlation of fragments and cleavage sites with EcoMap, a sequence-modified version of the genomic restriction map of Kohara et al.; and correlation of cleavage sites with DNA sequence databases. In all, 105 restriction sites were mapped and correlated with the EcoMap coordinate system.
NotI, SfiI, XbaI and BlnI restriction patterns of five commonly used E. coli K-12 strains were compared to those of MG1655. The variability between strains, some of which are separated by numerous steps of mutagenic treatment, is readily detectable by pulsed-field gel electrophoresis. A model is presented to account for the differences between the strains on the basis of simple insertions, deletions, and in one case an inversion. Insertions and deletions ranged in size from 1 kb to 86 kb. Several of the larger features have previously been characterized, and some of the smaller rearrangements can potentially account for previously reported genetic features of these strains.
Some aspects of the frequency and distribution of NotI, SfiI, XbaI and BlnI cleavage sites were analyzed using a method based on Markov chain theory. Overlaps of Dam and Dcm methylase sites with XbaI and SfiI cleavage sites were examined. The one XbaI-Dam overlap in the database is in accord with the expected frequency of this overlap. Certain types of SfiI-Dcm overlaps are overrepresented. Of the four subtypes of SfiI-Dcm overlap, only one has a partial inhibitory effect on the activity of SfiI. Recognition sites for all four enzymes are rarer than expected based on oligonucleotide frequency data, with this effect being much stronger for XbaI and BlnI than for NotI and SfiI. The latter two enzyme sites are rare mainly due to apparent negative selection against GGCC (both) and CGGCCG (NotI). The former two enzyme sites are rare mainly due to effects of the VSP repair system on certain di-, tri-, and tetranucleotides, most notably CTAG. Models are proposed to explain several of the anomalies of oligonucleotide distribution in E. coli, and the biological significance of the systems that produce these anomalies is discussed.
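The abstract does not spell out its Markov-chain method. One common way to compute an expected oligonucleotide count, shown here only as a hedged sketch, is the maximal-order Markov estimate: the expected count of a word is the count of its prefix times the count of its suffix, divided by the count of the shared core. The toy sequence and function names below are assumptions for illustration and are not E. coli data.

```python
def count_overlapping(seq, word):
    """Count occurrences of `word` in `seq`, allowing overlaps."""
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)


def expected_count(seq, word):
    """Maximal-order Markov estimate of the expected count of `word`.

    Uses the counts of the two (k-1)-mers and the shared (k-2)-mer core,
    so `word` must have length 3 or more.
    """
    if len(word) < 3:
        raise ValueError("word must have at least 3 letters")
    prefix, suffix, core = word[:-1], word[1:], word[1:-1]
    core_count = count_overlapping(seq, core)
    if core_count == 0:
        return 0.0
    return (count_overlapping(seq, prefix) *
            count_overlapping(seq, suffix) / core_count)


def observed_to_expected(seq, word):
    """Ratio below 1 suggests the word is rarer than its sub-words predict."""
    expected = expected_count(seq, word)
    return count_overlapping(seq, word) / expected if expected else float("nan")


toy_sequence = "CTAGCTTAGG"  # made-up sequence, for illustration only
ratio = observed_to_expected(toy_sequence, "CTAG")
```

On a real genome, a ratio well below one for a site such as CTAG would correspond to the kind of underrepresentation the abstract reports.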
Abstract:
The application of Markov processes to health-care problems is very useful. The objective of this study is to provide a structured methodology for forecasting cost by combining a stochastic model of utilization (a Markov chain) with a deterministic cost function. The perspective of the cost in this study is the reimbursement for the services rendered. The data to be used are the OneCare database of claim records of its enrollees over the two-year period from January 1, 1996 through December 31, 1997. The model combines a Markov chain, which describes the utilization pattern and its variability and accounts for the use of resources by risk groups (age, gender, and diagnosis), with a cost function determined from a fixed schedule based on real costs or charges for those in the OneCare claims database. The cost function is a secondary application to the model. Goodness of fit will be checked for the model against the traditional method of cost forecasting.
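A minimal sketch of the combination described above: a Markov chain carries the distribution over utilization states forward one period at a time, and a fixed cost schedule converts that distribution into an expected cost. The three states, the transition probabilities, and the cost schedule are hypothetical and are not taken from the OneCare data.

```python
def step(dist, trans):
    """Advance a state distribution by one period of the Markov chain."""
    n = len(dist)
    return [sum(dist[i] * trans[i][j] for i in range(n)) for j in range(n)]


def forecast_cost(initial, trans, cost, periods):
    """Expected cost per enrollee for each future period."""
    dist = initial[:]
    costs = []
    for _ in range(periods):
        dist = step(dist, trans)
        costs.append(sum(p * c for p, c in zip(dist, cost)))
    return costs


# Hypothetical utilization states: no use, outpatient visit, inpatient stay.
TRANS = [[0.80, 0.15, 0.05],
         [0.50, 0.40, 0.10],
         [0.30, 0.40, 0.30]]
# Hypothetical reimbursement per period in each state (deterministic schedule).
COST = [0.0, 100.0, 1000.0]

# An enrollee who starts with no utilization; a risk group (age, gender,
# diagnosis) would get its own transition matrix in a fuller model.
monthly_costs = forecast_cost([1.0, 0.0, 0.0], TRANS, COST, periods=12)
```

Summing `monthly_costs` gives the expected total over the forecast horizon, which could then be compared with a traditional forecast.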
Abstract:
With the recognition of the importance of evidence-based medicine, there is an emerging need for methods to systematically synthesize available data. Specifically, methods to provide accurate estimates of test characteristics for diagnostic tests are needed to help physicians make better clinical decisions. To provide more flexible approaches for meta-analysis of diagnostic tests, we developed three Bayesian generalized linear models. Two of these models, a bivariate normal and a binomial model, analyzed pairs of sensitivity and specificity values while incorporating the correlation between these two outcome variables. Noninformative independent uniform priors were used for the variance of sensitivity, specificity and correlation. We also applied an inverse Wishart prior to check the sensitivity of the results. The third model was a multinomial model where the test results were modeled as multinomial random variables. All three models can include specific imaging techniques as covariates in order to compare performance. Vague normal priors were assigned to the coefficients of the covariates. The computations were carried out using the 'Bayesian inference using Gibbs sampling' implementation of Markov chain Monte Carlo techniques. We investigated the properties of the three proposed models through extensive simulation studies. We also applied these models to a previously published meta-analysis dataset on cervical cancer as well as to an unpublished melanoma dataset. In general, our findings show that the point estimates of sensitivity and specificity were consistent among Bayesian and frequentist bivariate normal and binomial models. However, in the simulation studies, the estimates of the correlation coefficient from Bayesian bivariate models are not as good as those obtained from frequentist estimation, regardless of which prior distribution was used for the covariance matrix. The Bayesian multinomial model consistently underestimated the sensitivity and specificity regardless of the sample size and correlation coefficient. In conclusion, the Bayesian bivariate binomial model provides the most flexible framework for future applications because of the following strengths: (1) it facilitates direct comparison between different tests; (2) it captures the variability in both sensitivity and specificity simultaneously as well as the intercorrelation between the two; and (3) it can be directly applied to sparse data without ad hoc correction.
Abstract:
In geographical epidemiology, maps of disease rates and disease risk provide a spatial perspective for researching disease etiology. For rare diseases or when the population base is small, the rate and risk estimates may be unstable. Empirical Bayesian (EB) methods have been used to spatially smooth the estimates by permitting an area estimate to "borrow strength" from its neighbors. Such EB methods include the use of a Gamma model, of a James-Stein estimator, and of a conditional autoregressive (CAR) process. A fully Bayesian analysis of the CAR process is proposed. One advantage of this fully Bayesian analysis is that it can be implemented simply by using repeated sampling from the posterior densities. Use of a Markov chain Monte Carlo technique such as the Gibbs sampler was not necessary. Direct resampling from the posterior densities provides exact small-sample inferences instead of the approximate asymptotic analyses of maximum likelihood methods (Clayton & Kaldor, 1987). Further, the proposed CAR model provides for covariates to be included in the model. A simulation demonstrates the effect of sample size on the fully Bayesian analysis of the CAR process. The methods are applied to lip cancer data from Scotland, and the results are compared.