7 results for random effect
in DigitalCommons@The Texas Medical Center
Abstract:
The joint modeling of longitudinal and survival data is a relatively new approach with many applications, such as HIV studies, cancer vaccine trials, and quality-of-life studies. Methodologies have recently been developed for each component of the joint model, as well as for the statistical processes that link them together. Among these, second-order polynomial random effect models and linear mixed-effects models are the most commonly used for the longitudinal trajectory function. In this study, we first relax the parametric constraints of polynomial random effect models by using Dirichlet process priors, and we consider three longitudinal markers rather than a single marker in one joint model. Second, we use a linear mixed-effects model for the longitudinal process in a joint model analyzing the three markers. These methods were applied to the Primary Biliary Cirrhosis sequential data, collected from a clinical trial of primary biliary cirrhosis (PBC) of the liver conducted between 1974 and 1984 at the Mayo Clinic. The effects of three longitudinal markers, (1) Total Serum Bilirubin, (2) Serum Albumin, and (3) Serum Glutamic-Oxaloacetic Transaminase (SGOT), on patients' survival were investigated. The proportion of treatment effect was also studied using the proposed joint modeling approaches.

Based on the results, we conclude that the proposed modeling approaches yield a better fit to the data and give less biased parameter estimates for these trajectory functions than previous methods. Model fit is also improved when three longitudinal markers are considered instead of only one. The analysis of the proportion of treatment effect from these joint models leads to the same conclusion as the final model of Fleming and Harrington (1991): Bilirubin and Albumin together have a stronger impact in predicting patients' survival and serve as surrogate endpoints for treatment.
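The longitudinal trajectory component of such a joint model can be illustrated with a minimal sketch. The data below are simulated stand-ins (not the PBC data), and the model is a frequentist random-intercept/random-slope fit via statsmodels, not the dissertation's Bayesian joint model with Dirichlet process priors:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_patients, n_visits = 100, 5

# Simulated stand-in for one longitudinal marker (e.g. log bilirubin):
# each patient deviates from the population line via a random
# intercept b0 and a random slope b1.
pid = np.repeat(np.arange(n_patients), n_visits)
t = np.tile(np.arange(n_visits, dtype=float), n_patients)
b0 = rng.normal(0, 0.5, n_patients)   # random intercepts
b1 = rng.normal(0, 0.1, n_patients)   # random slopes
y = 1.0 + 0.2 * t + b0[pid] + b1[pid] * t + rng.normal(0, 0.3, t.size)

df = pd.DataFrame({"marker": y, "time": t, "patient": pid})

# Linear mixed-effects model: random intercept and slope per patient
model = smf.mixedlm("marker ~ time", df, groups=df["patient"],
                    re_formula="~time")
fit = model.fit()
print(fit.params["Intercept"], fit.params["time"])
```

The fitted fixed effects should recover the simulated population intercept (1.0) and slope (0.2); in a full joint model these subject-specific trajectories would then enter the survival submodel.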
Abstract:
This study investigates a theoretical model in which a longitudinal process that is a stationary Markov chain and a Weibull survival process share a bivariate random effect. A quality-of-life adjusted survival is then calculated as a weighted sum of survival time. Theoretical values of the population mean adjusted survival of the described model are computed numerically, and the parameters of the bivariate random effect significantly affect these theoretical values. Maximum-likelihood and Bayesian methods are applied to simulated data to estimate the model parameters. Based on the parameter estimates, the predicted population mean adjusted survival can then be calculated numerically and compared with the theoretical values. The Bayesian and maximum-likelihood methods provide parameter estimates and population mean predictions with comparable accuracy; however, the Bayesian method suffers from poor convergence due to autocorrelation and inter-variable correlation.
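A Monte Carlo sketch can show how a population mean adjusted survival of this general shape is computed numerically: a quality-of-life weight and a Weibull survival time are linked through a shared bivariate random effect, and the mean of their product is estimated by simulation. Every distribution and parameter value below is an illustrative assumption, not the dissertation's model:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Bivariate random effect (u, v): u shifts the quality-of-life weight,
# v acts as a log-frailty on the Weibull hazard (assumed structure).
mean, cov = [0.0, 0.0], [[0.25, 0.05], [0.05, 0.25]]
u, v = rng.multivariate_normal(mean, cov, size=n).T

# Weibull survival time with shape k and subject-specific scale
k, lam = 1.5, 5.0
T = lam * np.exp(-v / k) * rng.weibull(k, size=n)

# Quality-of-life weight in (0, 1); the shared effect u links it to
# survival.  Adjusted survival = weight * survival time.
w = 1.0 / (1.0 + np.exp(-(0.5 + u)))
qas = w * T

# Monte Carlo estimate of the population mean adjusted survival
print(round(qas.mean(), 3))
```

With enough draws the Monte Carlo mean approximates the theoretical population mean, which is the quantity the dissertation computes numerically and compares against model-based predictions.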
Abstract:
Prostate cancer (CaP) is the most commonly diagnosed non-cutaneous malignancy and the second leading cause of cancer mortality among United States males. Major racial disparities in incidence, survival, and treatment persist: mortality is three times higher among African Americans (AAs) than among Caucasians. Androgen carcinogenesis has been persistently implicated, although results are inconsistent, and hormone manipulation has been the mainstay of treatment for metastatic disease, supporting the androgen-carcinogenesis hypothesis. The survival disadvantage of AAs has been attributed to differences in socioeconomic status (SES), tumor stage, and treatment. We hypothesized that hormonal therapy (HT) prolongs survival in CaP and that the racial disparities in survival are influenced by variation in HT and primary therapies as well as SES. To address these hypotheses, we first used a random-effect meta-analytic design to examine evidence from randomized trials on the efficacy of androgen deprivation therapy in localized and metastatic disease, and then assessed, using Cox proportional hazards models, the effectiveness of HT in prolonging survival in a large community-based cohort of older males diagnosed with local/regional CaP. We further examined the role of HT and primary therapies in the racial disparities in CaP survival. The results indicated that adjuvant HT is efficacious in improving overall survival compared with standard care alone, whereas HT showed no significant real-world benefit in increasing the overall survival of older males treated in the community for local/regional disease. Racial differences in survival persist and were explained to some extent by differences in the primary therapies (radical prostatectomy, radiation, and watchful waiting) and largely by SES.

Therefore, given the increased use of hormonal therapy and its cost-effectiveness today, more RCTs are needed to assess whether survival prolongation translates into improved quality of life, and to determine whether the decreased use of radical prostatectomy by AAs is driven by clinician bias or by AAs' preference for conservative therapy. Encouraging AAs to seek curative therapies could narrow, to some degree, the persistent mortality disparities between AAs and Caucasians.
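A random-effect meta-analysis of trial summaries of this kind is commonly done with the DerSimonian-Laird estimator. The sketch below applies it to hypothetical log hazard ratios and within-trial variances invented for illustration (not the trials pooled in the dissertation):

```python
import numpy as np

# Hypothetical log hazard ratios and variances from five trials
y = np.array([-0.50, -0.10, -0.60, 0.05, -0.30])
v = np.array([0.04, 0.06, 0.09, 0.05, 0.07])

# DerSimonian-Laird random-effects meta-analysis
w = 1.0 / v                             # fixed-effect weights
y_fixed = np.sum(w * y) / np.sum(w)
Q = np.sum(w * (y - y_fixed) ** 2)      # Cochran's heterogeneity statistic
df = len(y) - 1
C = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (Q - df) / C)           # between-trial variance estimate

w_star = 1.0 / (v + tau2)               # random-effects weights
y_re = np.sum(w_star * y) / np.sum(w_star)
se_re = np.sqrt(1.0 / np.sum(w_star))
print(round(y_re, 3), round(se_re, 3))
```

A pooled log hazard ratio below zero would indicate a survival benefit; the between-trial variance tau2 widens the standard error relative to a fixed-effect pooling when the trials are heterogeneous.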
The determinants of improvements in health outcomes and of cost reduction in hospital inpatient care
Abstract:
This study addresses two research questions. First, can we identify factors that are determinants both of improved health outcomes and of reduced costs for hospitalized patients with one of six common diagnoses? Second, can we identify other factors that are determinants of improved health outcomes for such patients but are not associated with costs? The Healthcare Cost and Utilization Project (HCUP) Nationwide Inpatient Sample (NIS) database from 2003 to 2006 was employed. The study sample consisted of hospitals that had at least 30 patients per year for the given diagnosis: 954 hospitals for acute myocardial infarction (AMI), 1552 for congestive heart failure (CHF), 1120 for stroke (STR), 1283 for gastrointestinal hemorrhage (GIH), 979 for hip fracture (HIP), and 1716 for pneumonia (PNE). Simultaneous equations models were used to investigate the determinants of improved health outcomes and of cost reduction in hospital inpatient care for these six diagnoses. In addition, the study used instrumental variables and a two-stage least squares random effect model for unbalanced panel data estimation. The study concluded that a few factors were determinants of both high quality and low cost: high specialization was a determinant of high quality and low costs for CHF patients, and small hospital size was a determinant of high quality and low costs for AMI patients. Furthermore, CHF patients treated in Midwest, South, and West region hospitals had better health outcomes and lower hospital costs than patients treated in Northeast region hospitals; gastrointestinal hemorrhage and pneumonia patients treated in South region hospitals likewise had better health outcomes and lower hospital costs than those treated in Northeast region hospitals.

The study also found that six non-cost factors were related to health outcomes for some diagnoses: hospital volume, percentage of emergency room admissions for a given diagnosis, hospital competition, specialty, bed size, and hospital region.
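The two-stage least squares idea used in the panel estimation above can be sketched on simulated data with a single endogenous regressor and one instrument (the variables below are invented, not the HCUP measures, and the random-effect panel structure is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Instrument z; regressor x is endogenous because it shares the error e
z = rng.normal(size=n)
e = rng.normal(size=n)
x = 0.8 * z + 0.5 * e + rng.normal(size=n)
y = 2.0 + 1.5 * x + e                    # true coefficient on x: 1.5

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

# Stage 1: project the regressors on the instruments.
# Stage 2: regress y on the projected regressors.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
beta_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Naive OLS for comparison: biased upward by the endogeneity
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(round(beta_2sls[1], 2), round(beta_ols[1], 2))
```

The 2SLS estimate recovers the true coefficient while OLS absorbs the correlation between x and the error, which is the motivation for instrumenting cost in an outcomes equation.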
Abstract:
Mixed longitudinal designs are important for many areas of medical research. They have several advantages over cross-sectional or purely longitudinal studies, including shorter completion time and the ability to separate time and age effects, which makes them an attractive choice. Statistical methodology for longitudinal studies in general has developed rapidly over the last few decades. A common approach for statistical modeling in mixed longitudinal designs has been the linear mixed-effects model incorporating an age or time effect; the general linear mixed-effects model is considered an appropriate choice for analyzing repeated-measurements data in longitudinal studies. However, common applications of the linear mixed-effects model to mixed longitudinal studies often incorporate age as the only random effect and fail to take the cohort effect into account when conducting statistical inference on age-related trajectories of outcome measurements. We believe special attention should be paid to cohort effects when analyzing data from mixed longitudinal designs with multiple overlapping cohorts, making this an important statistical issue to address.

This research aims to address statistical issues related to mixed longitudinal studies. The study examined existing statistical analysis methods for mixed longitudinal designs and developed an alternative analytic method that incorporates effects from multiple overlapping cohorts as well as from subjects of different ages. Simulation was used to evaluate the performance of the proposed analytic method by comparing it with the commonly used model. Finally, the proposed method was applied to data collected by an existing study, Project HeartBeat!, which had previously been evaluated using traditional analytic techniques. Project HeartBeat! is a longitudinal study of cardiovascular disease (CVD) risk factors in childhood and adolescence using a mixed longitudinal design. The proposed model was used to evaluate four blood lipids, adjusting for age, gender, race/ethnicity, and endocrine hormones. The results of this dissertation suggest that the proposed analytic model could be a more flexible and reliable choice than the traditional model, providing more accurate estimates in mixed longitudinal studies. Conceptually, the proposed model has useful features, including consideration of effects from multiple overlapping cohorts, and is an attractive approach for analyzing data from mixed longitudinal design studies.
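The cohort-effect problem described above can be demonstrated with a small simulation: in a mixed longitudinal design with overlapping cohorts, systematic cohort differences leak into an age-only trajectory estimate unless the cohort effect is modeled. The design, effect sizes, and the use of plain least squares (standing in for the full mixed-effects machinery) are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)

# Three overlapping cohorts entering at ages 8, 10 and 12, each
# followed for 4 annual visits (a mixed longitudinal design).
rows = []
for c, entry_age in enumerate([8, 10, 12]):
    cohort_effect = [0.0, 1.0, 2.0][c]     # systematic cohort differences
    for subj in range(50):
        b = rng.normal(0, 0.5)             # subject-level random intercept
        for visit in range(4):
            age = entry_age + visit
            obs = 1.0 + 0.3 * age + cohort_effect + b + rng.normal(0, 0.3)
            rows.append((c, age, obs))

data = np.array(rows)
cohort, age, y = data[:, 0], data[:, 1], data[:, 2]

# Age-only fit: cohort differences are absorbed into the age slope
X1 = np.column_stack([np.ones_like(age), age])
slope_naive = np.linalg.lstsq(X1, y, rcond=None)[0][1]

# Adding cohort indicators recovers the true age slope (0.3)
X2 = np.column_stack([X1, cohort == 1, cohort == 2])
slope_adj = np.linalg.lstsq(X2, y, rcond=None)[0][1]
print(round(slope_naive, 2), round(slope_adj, 2))
```

The naive slope is inflated because older cohorts have both higher ages and higher cohort effects; once the cohort terms are included, the age trajectory is estimated from within-cohort variation only.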
Abstract:
The use of group-randomized trials is particularly widespread in the evaluation of health care, educational, and screening strategies. Group-randomized trials represent a subset of a larger class of designs often labeled nested, hierarchical, or multilevel, and are characterized by the randomization of intact social units or groups rather than individuals. The application of random effects models to group-randomized trials requires the specification of fixed and random components of the model, and the underlying assumption is usually that the random components are normally distributed. This research is intended to determine whether the Type I error rate and power are affected when the assumption of normality for the random component representing the group effect is violated.

In this study, simulated data are used to examine the Type I error rate, power, bias, and mean squared error of the estimates of the fixed effect and the observed intraclass correlation coefficient (ICC) when the random component representing the group effect has a distribution with non-normal characteristics, such as heavy tails or severe skewness. The simulated data are generated with characteristics (e.g. number of schools per condition, number of students per school, and several within-school ICCs) observed in most small, school-based, group-randomized trials. The analysis is carried out using SAS PROC MIXED, Version 6.12, with random effects specified in a RANDOM statement and restricted maximum likelihood (REML) estimation. The results from the non-normally distributed data are compared to results from data with similar design characteristics but normally distributed random effects.

The results suggest that violating the normality assumption for the group component with a skewed or heavy-tailed distribution does not appear to influence the estimation of the fixed effect, the Type I error rate, or power. Negative biases were detected when estimating the sample ICC, and these increased dramatically in magnitude as the true ICC increased. The biases were less pronounced when the true ICC was within the range observed in most group-randomized trials (i.e. 0.00 to 0.05). The normally distributed group effect also yielded biased ICC estimates when the true ICC was greater than 0.05; however, this may be a result of higher correlation within the data.
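A condensed Python analogue of the simulation idea above (the dissertation itself used SAS PROC MIXED): generate group effects from a normal and from a skewed distribution with matched variance, then compare ANOVA-based ICC estimates. Group counts, cluster sizes, and variances below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_icc(group_effects, n_per_group=25, sigma_w=1.0):
    """One-way ANOVA estimate of the ICC from simulated clustered data."""
    g = len(group_effects)
    y = group_effects[:, None] + rng.normal(0, sigma_w, (g, n_per_group))
    grand = y.mean()
    msb = n_per_group * np.sum((y.mean(axis=1) - grand) ** 2) / (g - 1)
    msw = np.sum((y - y.mean(axis=1, keepdims=True)) ** 2) / (g * (n_per_group - 1))
    return (msb - msw) / (msb + (n_per_group - 1) * msw)

g, tau = 200, 0.3
true_icc = tau**2 / (tau**2 + 1.0)        # about 0.083

# Normal group effect vs. a heavily skewed one with the same variance
normal_effects = rng.normal(0, tau, g)
skewed_effects = tau * (rng.exponential(1.0, g) - 1.0)   # skewness 2

icc_normal = sample_icc(normal_effects)
icc_skewed = sample_icc(skewed_effects)
print(round(true_icc, 3), round(icc_normal, 3), round(icc_skewed, 3))
```

Repeating this over many replicates and ICC levels, and replacing the ANOVA estimator with a REML mixed-model fit, gives the kind of bias comparison the dissertation reports.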
Abstract:
Random Forests™ is reported to be one of the most accurate classification algorithms for complex data analysis. It shows excellent performance even when most predictors are noisy and the number of variables is much larger than the number of observations. In this thesis, Random Forests was applied to a large-scale lung cancer case-control study, a novel way of automatically selecting prognostic factors was proposed, and a synthetic positive control was used to validate the method. Throughout this study we showed that Random Forests can handle a large number of weak input variables without overfitting, can account for non-additive interactions between input variables, and can be used for variable selection without being adversely affected by collinearity.

Random Forests can handle large-scale data sets without rigorous data preprocessing and has a robust variable importance ranking measure. A novel variable selection method is proposed in the context of Random Forests that uses the data noise level as the cut-off value for determining the subset of important predictors. This new approach enhances the ability of the Random Forests algorithm to automatically identify important predictors in complex data; the cut-off value can also be adjusted based on the results of the synthetic positive control experiments.

When the data set had a high variables-to-observations ratio, Random Forests complemented established logistic regression. This study suggests that Random Forests is recommended for such high-dimensional data: one can use Random Forests to select the important variables and then use logistic regression, or Random Forests itself, to estimate the effect sizes of the predictors and to classify new observations. We also found that mean decrease in accuracy is a more reliable variable ranking measure than mean decrease in Gini.
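The contrast between the two importance measures in the final sentence can be reproduced with scikit-learn's Random Forest (a re-implementation, not the original Random Forests™ software): `permutation_importance` plays the role of mean decrease in accuracy, while `feature_importances_` is the mean decrease in impurity (Gini). The synthetic data below are generated purely for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Many weak/noisy predictors, few informative ones, p on the order of n
X, y = make_classification(n_samples=300, n_features=100,
                           n_informative=5, n_redundant=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_tr, y_tr)

# Mean decrease in Gini impurity vs. permutation (accuracy) importance
gini_rank = np.argsort(rf.feature_importances_)[::-1]
perm = permutation_importance(rf, X_te, y_te, n_repeats=10,
                              random_state=0)
perm_rank = np.argsort(perm.importances_mean)[::-1]
print(gini_rank[:5], perm_rank[:5])
```

Comparing the two rankings, especially on held-out data, shows how permutation importance can reorder predictors relative to the Gini measure; the thesis's noise-level cut-off idea would then threshold such a ranking.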