973 results for Biology, Biostatistics|Health Sciences, Pharmacy


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Breast cancer is the most common non-skin cancer and the second leading cause of cancer-related death in women in the United States. Studies on ipsilateral breast tumor relapse (IBTR) status and disease-specific survival will help guide clinical treatment and predict patient prognosis. After breast conservation therapy, patients with breast cancer may experience breast tumor relapse. This relapse is classified into two distinct types: true local recurrence (TR) and new ipsilateral primary tumor (NP). However, the methods used to classify the relapse types are imperfect and prone to misclassification. In addition, some observed survival data (e.g., time to relapse and time from relapse to death) are strongly correlated with relapse types. The first part of this dissertation presents a Bayesian approach to (1) modeling the potentially misclassified relapse status and the correlated survival information, (2) estimating the sensitivity and specificity of the diagnostic methods, and (3) quantifying the covariate effects on event probabilities. A shared frailty was used to account for the within-subject correlation between survival times. Inference was conducted in a Bayesian framework via Markov chain Monte Carlo simulation implemented in WinBUGS. Simulation was used to validate the Bayesian method and assess its frequentist properties. The new model has two important innovations: (1) it uses the additional survival times correlated with the relapse status to improve parameter estimation, and (2) it provides tools to address the correlation between the two diagnostic methods conditional on the true relapse types. Prediction of patients at highest risk for IBTR after local excision of ductal carcinoma in situ (DCIS) remains a clinical concern.
The goals of the second part of this dissertation were to evaluate a published nomogram from Memorial Sloan-Kettering Cancer Center, to determine the risk of IBTR in patients with DCIS treated with local excision, and to determine whether there is a subset of patients at low risk of IBTR. Patients who had undergone local excision from 1990 through 2007 at MD Anderson Cancer Center (MDACC) with a final diagnosis of DCIS (n=794) were included in this part. Clinicopathologic factors and the performance of the Memorial Sloan-Kettering Cancer Center nomogram for prediction of IBTR were assessed for the 734 patients with complete data. The nomogram's predictions of 5- and 10-year IBTR probabilities demonstrated imperfect calibration and discrimination, with an area under the receiver operating characteristic curve of 0.63 and a concordance index of 0.63. In conclusion, predictive models for IBTR in DCIS patients treated with local excision are imperfect, and our current ability to accurately predict recurrence based on clinical parameters is limited. The American Joint Committee on Cancer (AJCC) staging of breast cancer is widely used to determine prognosis, yet survival within each AJCC stage shows wide variation and remains unpredictable. For the third part of this dissertation, biologic markers were hypothesized to be responsible for some of this variation, and the addition of biologic markers to current AJCC staging was examined for its potential to improve prognostication. The initial cohort included patients treated with surgery as the first intervention at MDACC from 1997 to 2006. Cox proportional hazards models were used to create prognostic scoring systems. AJCC pathologic staging parameters and biologic tumor markers were investigated to devise the scoring systems. Surveillance, Epidemiology, and End Results (SEER) data were used as the external cohort to validate the scoring systems.
Binary indicators for pathologic stage (PS), estrogen receptor status (E), and tumor grade (G) were summed to create PS+EG scoring systems devised to predict 5-year patient outcomes. These scoring systems separated the study population into more refined subgroups than the current AJCC staging system. The ability of the PS+EG score to stratify outcomes was confirmed in both internal and external validation cohorts. The current study proposes and validates a new staging system that incorporates tumor grade and ER status into current AJCC staging. We recommend that biologic markers be incorporated into revised versions of the AJCC staging system for patients receiving surgery as the first intervention. Chapter 1 focuses on developing a Bayesian method to handle misclassified relapse status and on its application to breast cancer data. Chapter 2 focuses on the evaluation of a breast cancer nomogram for predicting the risk of IBTR in patients with DCIS after local excision and gives the statement of the problem in clinical research. Chapter 3 focuses on the validation of a novel staging system for disease-specific survival in patients with breast cancer treated with surgery as the first intervention.
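
The summed-indicator construction behind the PS+EG score can be illustrated with a minimal sketch. The abstract does not state how each indicator is coded, so the three boolean inputs below (and the idea that `True` marks the adverse level) are assumptions made only for illustration, not the dissertation's actual scoring rules.

```python
def ps_eg_score(adverse_stage: bool, adverse_er: bool, adverse_grade: bool) -> int:
    """Sum three binary indicators into one prognostic score (0 to 3).

    Hypothetical coding: each argument is True when the patient falls in
    the less favorable category for that factor.
    """
    return int(adverse_stage) + int(adverse_er) + int(adverse_grade)


def group_by_score(patients):
    """Group patient ids by score; `patients` maps id -> (stage, er, grade)."""
    groups = {}
    for patient_id, indicators in patients.items():
        groups.setdefault(ps_eg_score(*indicators), []).append(patient_id)
    return groups
```

Grouping patients by the resulting score yields the kind of refined subgroups the abstract describes; outcome comparisons across those groups would then be made with survival methods, which are not shown here.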

Abstract:

Studies have shown that rare genetic variants have stronger effects in predisposing to common diseases, and several statistical methods have been developed for association studies involving rare variants. To better understand how these methods perform, we compared two recently developed rare-variant methods (VT and C-alpha) on 10,000 simulated re-sequencing data sets with disease status and the corresponding 10,000 simulated null data sets. The SLC1A1 gene has been suggested to be associated with diastolic blood pressure (DBP) in previous studies. In the current study, we also applied the VT and C-alpha methods to empirical re-sequencing data for the SLC1A1 gene from 300 whites and 200 blacks. On the simulated data we used, the VT method achieved higher power and performed better than the C-alpha method. Type I error was well controlled for both methods. In addition, neither the VT nor the C-alpha method provided statistical evidence for an association between the SLC1A1 gene and DBP. Overall, our findings provide an important comparison of the two statistical methods for future reference, together with preliminary findings on the association between the SLC1A1 gene and blood pressure.
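
Power and type I error in a simulation study like this are typically estimated as rejection rates over the simulated data sets. The sketch below assumes each method has already produced one p-value per data set; the function name and the 0.05 level are illustrative choices, not taken from the abstract.

```python
def rejection_rate(p_values, alpha=0.05):
    """Fraction of simulated data sets in which the test rejects at level alpha.

    Applied to data sets simulated with a true association, this estimates
    power; applied to the null data sets, it estimates the type I error.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    return sum(1 for p in p_values if p < alpha) / len(p_values)
```

A method's type I error is considered well controlled when the rejection rate on the null data sets stays close to the nominal `alpha`.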

Abstract:

Pneumonia is a well-documented and common respiratory infection in patients with acute traumatic spinal cord injuries, and it may recur during the course of acute care. Using data from the North American Clinical Trials Network (NACTN) for Spinal Cord Injury, the incidence, timing, and recurrence of pneumonia were analyzed. The two main objectives were (1) to investigate the time to, and potential risk factors for, the first occurrence of pneumonia using the Cox proportional hazards model, and (2) to investigate pneumonia recurrence and its risk factors using a counting process model, which is a generalization of the Cox proportional hazards model. The results of the survival analysis suggested that surgery, intubation, American Spinal Injury Association (ASIA) grade, direct admission to a NACTN site, and age (older than 65 or not) were significant risk factors for both the first event and multiple events of pneumonia. The significance of this research is that it has the potential to identify, at the time of admission, patients who are at high risk for the incidence and recurrence of pneumonia. Knowledge of the occurrence and timing of pneumonia is important for the development of prevention strategies and may also provide some insight into the selection of emerging therapies that compromise the immune system.
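
Counting process models for recurrent events are usually fitted to data laid out as (start, stop] intervals, one row per at-risk period. The sketch below shows that layout for a single patient; the function name and the use of day 0 as the start of follow-up are assumptions for illustration, and the abstract does not describe the study's actual data preparation.

```python
def counting_process_rows(event_times, followup_end):
    """Split one patient's follow-up into (start, stop, event) rows.

    Each pneumonia episode closes an interval with event=1; any remaining
    follow-up after the last episode becomes a final censored row (event=0).
    """
    rows = []
    start = 0.0
    for t in sorted(event_times):
        if not (start < t <= followup_end):
            raise ValueError("event times must be distinct and within follow-up")
        rows.append((start, t, 1))
        start = t
    if start < followup_end:
        rows.append((start, followup_end, 0))
    return rows
```

For example, a patient with episodes on days 10 and 25 and follow-up ending on day 40 contributes three rows: two ending in events and one censored.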

Abstract:

The tobacco-specific nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is a recognized lung carcinogen. Because the cytokinesis-blocked micronucleus (CBMN) assay has been found to be extremely sensitive to NNK-induced genetic damage, it is a potentially important predictor of lung cancer risk. However, the association between lung cancer and NNK-induced genetic damage measured by the CBMN assay has not been rigorously examined. This research develops a methodology to model the chromosomal changes under NNK-induced genetic damage in a logistic regression framework in order to predict the occurrence of lung cancer. Because these chromosomal changes were usually not observed for long, owing to laboratory cost and time, a resampling technique was applied to generate a Markov chain of normal and damaged cell states for each individual. A joint likelihood was established between the resampled Markov chains and a logistic regression model that includes the transition probabilities of the chain as covariates. Maximum likelihood estimation was applied to carry out the statistical tests for comparison. The ability of this approach to increase discriminating power in predicting lung cancer was compared with that of a baseline "non-genetic" model. Our method offers a way to understand the association between dynamic cell information and lung cancer. Our study indicated that the extent of DNA damage/non-damage measured with the CBMN assay provides critical information that impacts public health studies of lung cancer risk. This novel statistical method can simultaneously estimate the process of DNA damage/non-damage and its relationship with lung cancer for each individual.
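
The transition probabilities that enter the logistic regression as covariates can be estimated from an observed state sequence by counting transitions. The sketch below shows only that counting step under assumed state labels ("N" for normal, "D" for damaged); the resampling scheme and the joint likelihood described above are not reproduced.

```python
from collections import Counter


def transition_probabilities(states):
    """Estimate Markov transition probabilities from one state sequence.

    Returns a dict mapping (from_state, to_state) to the observed share of
    transitions out of from_state that went to to_state.
    """
    if len(states) < 2:
        raise ValueError("need at least two observations")
    pair_counts = Counter(zip(states, states[1:]))
    out_counts = Counter()
    for (origin, _target), n in pair_counts.items():
        out_counts[origin] += n
    return {pair: n / out_counts[pair[0]] for pair, n in pair_counts.items()}
```

Each individual's estimated probabilities, such as the normal-to-damaged rate, could then be supplied as covariates to a logistic model of lung cancer status.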

Abstract:

Choline and betaine are important methyl donors that contribute to protein and phospholipid synthesis and DNA methylation. They can either be obtained through diet or synthesized de novo. Evidence from human and animal research indicates that choline metabolic pathways may be activated during a variety of diseases, including cancer. Studies have investigated the role of dietary intake of choline and betaine in cancer, but results vary among studies by cancer type, and no such study had been conducted for lung cancer. We conducted a case-control study to explore the association between dietary choline and betaine intake and lung cancer. A total of 2807 cases and 2919 controls were included in the study. After adjusting for total calorie intake, age, sex, race, and smoking status, multivariable logistic regression analysis revealed a significant negative association between choline/betaine intake and lung cancer. Specifically, higher choline intake was associated with reduced lung cancer odds, and the association did not differ significantly by smoking status. A similar negative trend was observed in the association between betaine intake and lung cancer after adjusting for total calorie intake, age, sex, smoking status, race, and pack-years of smoking. However, this association was strongly affected by smoking. No significant association was observed between increased betaine intake and lung cancer among never smokers, but higher betaine intake was strongly associated with reduced lung cancer odds among smokers, and lower odds ratios were observed among current smokers than among former smokers. Our results suggest that high intake of choline may be protective against lung cancer independent of smoking status, while high betaine intake may mitigate the adverse effect of smoking on lung cancer and help prevent lung cancer among smokers.

Abstract:

Evaluating the impact of a disease on life expectancy is an important part of public health. Potential gains in life expectancy (PGLEs), which properly take competing risks into account, are an effective indicator for measuring the impact of multiple causes of death. This study aimed to measure the PGLEs from reducing or eliminating the major causes of death in the USA from 2001 to 2008. To calculate the PGLEs due to the elimination of specific causes of death, age-specific mortality rates for heart disease, malignant neoplasms, Alzheimer's disease, kidney diseases, and HIV/AIDS, together with the data needed for life table construction, were obtained from the National Center for Health Statistics, and multiple-decrement life tables were constructed. The PGLEs from elimination of heart disease, malignant neoplasms, or HIV/AIDS decreased steadily from 2001 to 2008, whereas the PGLEs from elimination of Alzheimer's disease or kidney diseases showed increasing trends. The PGLEs (in years) at birth for all races, males, females, whites, white males, white females, blacks, black males, and black females from complete elimination of heart disease over 2001–2008 were 0.336–0.299, 0.327–0.301, 0.344–0.295, 0.360–0.315, 0.349–0.317, 0.371–0.316, 0.278–0.251, 0.272–0.255, and 0.282–0.246, respectively. The PGLEs (in years) at birth for the same groups from complete elimination of malignant neoplasms, Alzheimer's disease, kidney diseases, or HIV/AIDS over 2001–2008 were also computed. Most diseases affect specific populations: HIV/AIDS tends to have a greater impact on people of working age, heart disease and malignant neoplasms have a greater impact on people over 65 years of age, and Alzheimer's disease and kidney diseases have a greater impact on people over 75 years of age.
To measure the impact of these diseases on life expectancy in people of working age, partial multiple-decrement life tables were constructed, and the PGLEs were computed for partial or complete elimination of various causes of death during the working years. The results of the study thus outline how each disease could affect life expectancy in age-, race-, or sex-specific populations in the USA. The findings should not only help evaluate current public health improvements but also provide useful information for future research and disease control programs.

Abstract:

Trastuzumab is a humanized monoclonal antibody developed specifically for patients with HER2/neu-overexpressing breast cancer. Although highly effective and well tolerated, it has been reported to be associated with congestive heart failure (CHF) in clinical trial settings (up to 27%). The rate of Trastuzumab-related CHF in the general population, especially among older breast cancer patients receiving long-term Trastuzumab treatment, remains unknown. This thesis examined the rates of and risk factors for Trastuzumab-related CHF in a large population of older breast cancer patients. A retrospective cohort study was performed using the existing linked, de-identified Surveillance, Epidemiology and End Results (SEER)-Medicare database. The study cohort comprised breast cancer patients ≥ 66 years old with stage I-IV disease diagnosed in 1998-2007 who were fully covered by Medicare without HMO enrollment within 1 year before and after the month of first diagnosis and who received their first chemotherapy no earlier than 30 days before diagnosis. The primary outcome was a diagnosis of CHF after starting chemotherapy, with no CHF claims on or before the cancer diagnosis date. ICD-9 and HCPCS codes were used to identify claims for Trastuzumab use, chemotherapy, comorbidities, and CHF. Statistical analyses included comparison of characteristics, Kaplan-Meier estimates of CHF rates over long-term follow-up, and a multivariable Cox regression model with Trastuzumab as a time-dependent variable. Of the 17,684 patients in the selected cohort, 2,037 (12%) received Trastuzumab. Among them, 35% (714 of 2,037) were diagnosed with CHF, compared with 31% (4,784 of 15,647) of other chemotherapy recipients (p<.0001). After 10 years of follow-up, 65% of Trastuzumab users had developed CHF, compared with 47% of their counterparts.
After adjusting for patient demographic, tumor, and clinical characteristics, older breast cancer patients who used Trastuzumab showed a significantly higher risk of developing CHF than other chemotherapy recipients (HR 1.69, 95% CI 1.54 - 1.85). This risk increased with age (p-value < .0001). Among Trastuzumab users, the following covariates also significantly increased the risk of CHF: older age, stage IV disease, non-Hispanic black race, unmarried status, comorbidities, anthracycline use, taxane use, and lower educational level. We conclude that older breast cancer patients who used Trastuzumab had a 69% higher risk of developing CHF than non-Trastuzumab users, much higher than the 27% increase reported in younger clinical trial patients. Older age, non-Hispanic black race, unmarried status, comorbidity, and combined use with an anthracycline or taxane also significantly increased the risk of CHF development in older patients treated with Trastuzumab.

Abstract:

Complex diseases, such as cancer, are caused by various genetic and environmental factors and their interactions. Joint analysis of these factors and their interactions would increase the power to detect risk factors but is statistically challenging. Bayesian generalized linear models with Student-t prior distributions on the coefficients are a novel method for simultaneously analyzing genetic factors, environmental factors, and their interactions. I performed simulation studies using three different disease models and demonstrated that the variable selection performance of Bayesian generalized linear models is comparable to that of Bayesian stochastic search variable selection, itself an improvement over standard variable selection methods. I further evaluated the variable selection performance of Bayesian generalized linear models using different numbers of candidate covariates and different sample sizes, and provided a guideline for the sample size required to achieve high power of variable selection with Bayesian generalized linear models at different scales of the number of candidate covariates. Polymorphisms in folate metabolism genes and nutritional factors have previously been associated with lung cancer risk. In this study, I simultaneously analyzed 115 tag SNPs in folate metabolism genes, 14 nutritional factors, and all possible genetic-nutritional interactions in 1239 lung cancer cases and 1692 controls using Bayesian generalized linear models stratified by never, former, and current smoking status. SNPs in MTRR were significantly associated with lung cancer risk across never, former, and current smokers. In never smokers, three SNPs in TYMS and three gene-nutrient interactions (between SHMT1 and vitamin B12, between MTRR and total fat intake, and between MTR and alcohol use) were also identified as associated with lung cancer risk. These lung cancer risk factors are worthy of further investigation.

Abstract:

Determining the size and power of a test is a vital part of clinical trial design. This research focuses on the simulation of clinical trial data with time-to-event as the primary outcome. It investigates the impact of different recruitment patterns and time-dependent hazard structures on the size and power of the log-rank test. A non-homogeneous Poisson process is used to simulate entry times according to the different accrual patterns. A Weibull distribution is employed to simulate survival times according to the different hazard structures. The size of the log-rank test is estimated by simulating survival times with identical hazard rates in the treatment and control arms of the study, resulting in a hazard ratio of one. The power of the log-rank test at specific hazard ratios (≠1) is estimated by simulating survival times with different but proportional hazard rates for the two arms of the study. Different shapes (constant, decreasing, or increasing) of the Weibull hazard function are also considered to assess the effect of hazard structure on the size and power of the log-rank test.
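
One way to generate Weibull survival times with proportional hazards between arms is inverse-transform sampling. The sketch below assumes the parametrization S(t) = exp(-hazard_ratio * rate * t**shape), which is one of several conventions and is not stated in the abstract; all names are illustrative. A shape below 1 gives a decreasing hazard, 1 a constant hazard, and above 1 an increasing hazard. Entry times and the log-rank test itself are not shown.

```python
import math
import random


def weibull_time(u, shape, rate, hazard_ratio=1.0):
    """Invert S(t) = exp(-hazard_ratio * rate * t**shape) at survival level u."""
    if not 0.0 < u <= 1.0:
        raise ValueError("u must lie in (0, 1]")
    return (-math.log(u) / (rate * hazard_ratio)) ** (1.0 / shape)


def simulate_arm(n, shape, rate, hazard_ratio=1.0, seed=0):
    """Draw n survival times for one arm using a seeded generator."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], which keeps log() finite.
    return [weibull_time(1.0 - rng.random(), shape, rate, hazard_ratio)
            for _ in range(n)]
```

Setting `hazard_ratio=1.0` in both arms corresponds to the null scenario used to estimate test size, while a value different from 1 in the treatment arm corresponds to the scenarios used to estimate power.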

Abstract:

Objective. In 2009, the International Expert Committee recommended the use of the HbA1c test for the diagnosis of diabetes. Despite this recommendation, its precise test performance among Mexican Americans is uncertain. A strong “gold standard” would rely on repeated blood glucose measurement on different days, which is the recommended method for diagnosing diabetes in clinical practice. Our objective was to assess the test performance of HbA1c in detecting diabetes and pre-diabetes against repeated fasting blood glucose measurement in the Mexican American population living on the United States-Mexico border. Moreover, we sought a specific and precise HbA1c threshold for diabetes mellitus (DM) and pre-diabetes in this high-risk population, which might assist in better diagnosis and management of diabetes. Research design and methods. We used the CCHC dataset for our study. In 2004, the Cameron County Hispanic Cohort (CCHC), now numbering 2,574, was established, drawn from randomly selected households on the basis of 2000 Census tract data. The CCHC study randomly selected a subset of people (aged 18-64 years) in CCHC cohort households to determine the influence of SES on diabetes and obesity. Among the participants in Cohort-2000, 67.15% are female; all are Hispanic. Individuals were defined as having diabetes mellitus (fasting plasma glucose [FPG] ≥ 126 mg/dL) or pre-diabetes (100 ≤ FPG < 126 mg/dL). HbA1c test performance was evaluated using receiver operating characteristic (ROC) curves. Moreover, change-point models were used to determine HbA1c thresholds compatible with the FPG thresholds for diabetes and pre-diabetes. Results. When FPG was used as the reference for detecting diabetes, the sensitivity and specificity of HbA1c ≥ 6.5% were 75% and 87%, respectively (area under the curve 0.895).
Additionally, when FPG was used as the reference for detecting pre-diabetes, the sensitivity and specificity of HbA1c ≥ 6.0% (ADA recommended threshold) were 18% and 90%, respectively. The sensitivity and specificity of HbA1c ≥ 5.7% (International Expert Committee recommended threshold) for detecting pre-diabetes were 31% and 78%, respectively. ROC analyses suggest that HbA1c is a sound predictor of diabetes mellitus (area under the curve 0.895) but a poorer predictor of pre-diabetes (area under the curve 0.632). Conclusions. Our data support the current recommendations for use of HbA1c in the diagnosis of diabetes in the Mexican American population, as it showed reasonable sensitivity, specificity, and accuracy against repeated FPG measures. However, use of HbA1c may be premature for detecting pre-diabetes in this specific population because of its poor sensitivity against FPG. It may be that HbA1c more effectively identifies the individuals who are at risk of developing diabetes. Following these pre-diabetic individuals over a longer term to detect incident diabetes may yield more conclusive results.
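
Sensitivity and specificity at a given HbA1c cut-off follow directly from cross-classifying the test result against the FPG-based reference. The sketch below is a generic illustration; the function and argument names are hypothetical and no study data are reproduced.

```python
def sensitivity_specificity(hba1c_values, has_condition, threshold):
    """Return (sensitivity, specificity) of the rule HbA1c >= threshold.

    `has_condition` holds the reference classification (for example, based
    on repeated FPG) for each person, in the same order as `hba1c_values`.
    """
    tp = fn = tn = fp = 0
    for value, truth in zip(hba1c_values, has_condition):
        test_positive = value >= threshold
        if truth and test_positive:
            tp += 1
        elif truth:
            fn += 1
        elif test_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one person with and one without the condition")
    return tp / (tp + fn), tn / (tn + fp)
```

Repeating the calculation over a grid of thresholds traces out the ROC curve from which the area under the curve is computed.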

Abstract:

Scholars have found that socioeconomic status is one of the key factors influencing early-stage lung cancer incidence rates in a variety of regions. This thesis examined the association between median household income and lung cancer incidence rates in Texas counties. Lung cancer incidence rates from 2004 to 2008 and median household incomes in 2006 for a total of 254 individual Texas counties were collected from the National Cancer Institute Surveillance System. A simple linear model and spatial linear models with two structures, simultaneous autoregressive (SAR) and conditional autoregressive (CAR), were used to link median household income and lung cancer incidence rates in Texas. The residuals of the spatial linear models were analyzed with Moran's I and Geary's C statistics, and the results were used to detect clusters of similar lung cancer incidence rates and disease patterns in Texas.
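
Moran's I summarizes spatial autocorrelation as a weighted cross-product of deviations from the mean. The sketch below uses the standard formula with a user-supplied weights matrix; the binary adjacency weights in the usage note are an assumption for illustration, since the abstract does not specify the neighborhood structure used for Texas counties.

```python
def morans_i(values, weights):
    """Moran's I for `values` under the spatial weights matrix `weights`.

    `weights[i][j]` is the weight between regions i and j (0 when they are
    not neighbors, and 0 on the diagonal).
    """
    n = len(values)
    if len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n-by-n matrix")
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    sum_sq = sum(d * d for d in dev)
    if total_weight == 0 or sum_sq == 0:
        raise ValueError("weights and values must both vary")
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / total_weight) * (cross / sum_sq)
```

Applied to model residuals, a value near zero suggests little remaining spatial autocorrelation, while clearly positive values indicate that neighboring regions still have similar residuals.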

Abstract:

Background. End-stage liver disease (ESLD) is an irreversible condition that leads to the imminent complete failure of the liver. Orthotopic liver transplantation (OLT) is well accepted as the best curative option for patients with ESLD. Despite the progress in liver transplantation, the major limitation today is the discrepancy between donor supply and organ demand. In an effort to alleviate this situation, livers from donors mismatched with the recipient by gender or race are being used. However, the simultaneous impact of donor and recipient gender and race mismatching on patient survival after OLT remains unclear and is a challenge for surgeons. Objective. To examine the impact of donor and recipient gender and race mismatching on patient survival after OLT using the United Network for Organ Sharing (UNOS) database. Methods. A total of 40,644 recipients who underwent OLT between 2002 and 2011 were included. Kaplan-Meier survival curves and log-rank tests were used to compare survival rates among different donor-recipient gender and race combinations. Univariate Cox regression analysis was used to assess the association of donor-recipient gender and race mismatching with patient survival after OLT. Multivariable Cox regression analysis was used to model the simultaneous impact of donor-recipient gender and race mismatching on patient survival after OLT, adjusting for a list of other risk factors. Multivariable Cox regression analysis stratified by recipient hepatitis C virus (HCV) status was also conducted to identify the variables that were differentially associated with patient survival in HCV+ and HCV− recipients. Results.
In the univariate analysis, compared to male donors to male recipients, female donors to male recipients had a higher risk of patient mortality (HR, 1.122; 95% CI, 1.065–1.183), while in the multivariable analysis, male donors to female recipients had an increased mortality rate (adjusted HR, 1.114; 95% CI, 1.048–1.184). Compared to white donors to white recipients, Hispanic donors to black recipients had a higher risk of patient mortality (HR, 1.527; 95% CI, 1.293–1.804) in the univariate analysis, and a similar result (adjusted HR, 1.553; 95% CI, 1.314–1.836) was noted in the multivariable analysis. After stratification by recipient HCV status in the multivariable analysis, HCV+ mismatched recipients appeared to be at greater risk of mortality than HCV− mismatched recipients. Female donors to female HCV− recipients (adjusted HR, 0.843; 95% CI, 0.769–0.923) and Hispanic HCV+ recipients receiving livers from black donors (adjusted HR, 0.758; 95% CI, 0.598–0.960) showed a protective effect on patient survival after OLT. Conclusion. Donor-recipient gender and race mismatching adversely affects patient survival after OLT, both independently and after adjustment for other risk factors. Female recipient HCV status is an important effect modifier in the association between donor-recipient gender combination and patient survival.

Abstract:

The infant mortality rate (IMR) is considered to be one of the most important indices of a country's well-being. Countries around the world and health organizations such as the World Health Organization are dedicating their resources, knowledge, and energy to reducing infant mortality rates. The well-known Millennium Development Goal 4 (MDG 4), whose aim is to achieve a two-thirds reduction of the under-five mortality rate between 1990 and 2015, is an example of this commitment. In this study our goal is to model the trends of IMR from the 1950s to the 2010s for selected countries. We would like to know how the IMR is changing over time and how it differs across countries. IMR data collected over time form a time series, and the repeated observations in an IMR time series are not statistically independent. In modeling the trend of IMR, it is therefore necessary to account for these correlations. We proposed to use the generalized least squares method in a general linear model setting to deal with the variance-covariance structure in our model. To estimate the variance-covariance matrix, we referred to time-series models, especially the autoregressive and moving average models. Furthermore, we compared the results from the general linear model with a correlation structure to those from the ordinary least squares method, which ignores the correlation structure, to check how much the estimates change.
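
For a linear trend with first-order autoregressive, AR(1), errors and a known autocorrelation, generalized least squares can be carried out by transforming the data (the Prais-Winsten transformation) and then applying ordinary least squares. The sketch below assumes a known `rho` and a single straight-line trend; the study's actual correlation structures and estimation of `rho` are not reproduced, and all names are illustrative.

```python
import math


def gls_ar1_trend(t, y, rho):
    """GLS intercept and slope for y = a + b*t + e, with AR(1) errors.

    Uses the Prais-Winsten transformation with a known autocorrelation rho,
    then solves the two-parameter least squares normal equations directly.
    """
    n = len(t)
    if n != len(y) or n < 2:
        raise ValueError("t and y must have equal length of at least 2")
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    s = math.sqrt(1.0 - rho * rho)
    # Transformed intercept column, time column, and response.
    c = [s] + [1.0 - rho] * (n - 1)
    x = [s * t[0]] + [t[i] - rho * t[i - 1] for i in range(1, n)]
    z = [s * y[0]] + [y[i] - rho * y[i - 1] for i in range(1, n)]
    scc = sum(ci * ci for ci in c)
    scx = sum(ci * xi for ci, xi in zip(c, x))
    sxx = sum(xi * xi for xi in x)
    scz = sum(ci * zi for ci, zi in zip(c, z))
    sxz = sum(xi * zi for xi, zi in zip(x, z))
    det = scc * sxx - scx * scx
    intercept = (sxx * scz - scx * sxz) / det
    slope = (scc * sxz - scx * scz) / det
    return intercept, slope
```

With `rho=0` the transformation leaves the data unchanged, so the function reduces to ordinary least squares, which makes the comparison described above straightforward.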

Abstract:

Background: Little is known about the effects on patient adherence when the same study drug is administered at the same dose in two populations with two different diseases in two different clinical trials. The Minocycline in Rheumatoid Arthritis (MIRA) trial and the NIH Exploratory Trials in Parkinson's Disease (NET-PD) Futility Study I (FS1) provide a unique opportunity to examine this and to compare methods of measuring adherence. This study may increase understanding of the influence of disease and adverse events on patient adherence and will provide insights to investigators selecting adherence assessment methods in future clinical trials of minocycline and other drugs. Methods: Minocycline adherence by pill count and the effect of adverse events were compared in the MIRA and NET-PD FS1 trials using multivariable linear regression. Within the MIRA trial, agreement between assay and pill count was assessed. The association of adverse events with assay adherence was examined using multivariable logistic regression. Results: Adherence derived from pill count in the MIRA and NET-PD FS1 trials did not differ significantly. Adverse events potentially related to minocycline did not appear useful for predicting minocycline adherence. In the MIRA trial, adherence measured by pill count appeared higher than adherence measured by assay. Agreement between pill count and assay was poor (kappa statistic = 0.25). Limitations: Trial and disease are completely confounded, and hence the independent effect of disease on adherence to minocycline treatment cannot be studied. Conclusion: Simple pill count may be preferred over assay for measuring adherence in minocycline clinical trials. Assays may be less sensitive in a clinical setting where appointments are not scheduled in relation to medication administration time, given that assays depend on many pharmacokinetic and instrument-related factors. However, pill count can be manipulated by the patient.
Another study suggested that the self-report method is more sensitive than the pill count method in differentiating adherence from non-adherence. An effect of medication-related adverse events on adherence could not be detected.
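
The kappa statistic reported above measures agreement between two classifications beyond what chance alone would produce. The sketch below computes Cohen's kappa for two lists of labels; treating each patient as adherent or non-adherent under each method is an assumed setup for illustration, and no trial data are reproduced.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b) / n
    categories = set(ratings_a) | set(ratings_b)
    # Chance agreement from the two methods' marginal proportions.
    expected = sum((ratings_a.count(c) / n) * (ratings_b.count(c) / n)
                   for c in categories)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why a value of 0.25 is read as poor agreement between pill count and assay.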

Abstract:

Maximizing data quality may be especially difficult in trauma-related clinical research. Strategies are needed to improve data quality and assess the impact of data quality on clinical predictive models. This study had two objectives. The first was to compare missing data between two multi-center trauma transfusion studies: a retrospective study (RS) using medical chart data with minimal data quality review and the PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study with standardized quality assurance. The second objective was to assess the impact of missing data on clinical prediction algorithms by evaluating blood transfusion prediction models using PROMMTT data. RS (2005-06) and PROMMTT (2009-10) investigated trauma patients receiving ≥ 1 unit of red blood cells (RBC) from ten Level I trauma centers. Missing data were compared for 33 variables collected in both studies using mixed effects logistic regression (including random intercepts for study site). Massive transfusion (MT) patients received ≥ 10 RBC units within 24h of admission. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation based on the multivariate normal distribution. A sensitivity analysis for missing data was conducted to estimate the upper and lower bounds of correct classification using assumptions about missing data under best and worst case scenarios. Most variables (17/33=52%) had <1% missing data in RS and PROMMTT. Of the remaining variables, 50% demonstrated less missingness in PROMMTT, 25% had less missingness in RS, and 25% were similar between studies. Missing percentages for MT prediction variables in PROMMTT ranged from 2.2% (heart rate) to 45% (respiratory rate). For variables missing >1%, study site was associated with missingness (all p≤0.021). Survival time predicted missingness for 50% of RS and 60% of PROMMTT variables. 
Complete case proportions for the MT models ranged from 41% to 88%. Complete case analysis and multiple imputation demonstrated similar correct classification results. Sensitivity analysis upper-lower bound ranges for the three MT models were 59-63%, 36-46%, and 46-58%. Prospective collection of ten-fold more variables with data quality assurance reduced overall missing data. Study site and patient survival were associated with missingness, suggesting that data were not missing completely at random and that complete case analysis may lead to biased results. Evaluating clinical prediction model accuracy may be misleading in the presence of missing data, especially with many predictor variables. The proposed sensitivity analysis, which estimates correct classification under upper (best case scenario) and lower (worst case scenario) bounds, may be more informative than multiple imputation, which provided results similar to complete case analysis.
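
One plausible reading of the best-case and worst-case bounds is sketched below: patients whose predictors are missing are counted as all correctly classified (upper bound) or all misclassified (lower bound). This interpretation and the function name are assumptions made for illustration; the abstract does not spell out the study's exact assumptions.

```python
def classification_bounds(n_correct, n_incorrect, n_missing):
    """Lower and upper bounds on the correct classification proportion.

    n_correct and n_incorrect count patients with complete predictor data;
    n_missing counts patients the model could not classify.
    """
    total = n_correct + n_incorrect + n_missing
    if total == 0:
        raise ValueError("need at least one patient")
    lower = n_correct / total                # worst case: all missing wrong
    upper = (n_correct + n_missing) / total  # best case: all missing right
    return lower, upper
```

The width of the interval grows with the share of patients who have missing predictors, which makes the cost of missing data visible in a way that a single imputed estimate does not.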