8 results for Data Quality

em DigitalCommons@The Texas Medical Center


Relevance:

100.00%

Publisher:

Abstract:

Maximizing data quality may be especially difficult in trauma-related clinical research. Strategies are needed to improve data quality and assess the impact of data quality on clinical predictive models. This study had two objectives. The first was to compare missing data between two multi-center trauma transfusion studies: a retrospective study (RS) using medical chart data with minimal data quality review and the PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study with standardized quality assurance. The second objective was to assess the impact of missing data on clinical prediction algorithms by evaluating blood transfusion prediction models using PROMMTT data. RS (2005-06) and PROMMTT (2009-10) investigated trauma patients receiving ≥ 1 unit of red blood cells (RBC) from ten Level I trauma centers. Missing data were compared for 33 variables collected in both studies using mixed effects logistic regression (including random intercepts for study site). Massive transfusion (MT) patients received ≥ 10 RBC units within 24h of admission. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation based on the multivariate normal distribution. A sensitivity analysis for missing data was conducted to estimate the upper and lower bounds of correct classification using assumptions about missing data under best and worst case scenarios. Most variables (17/33=52%) had <1% missing data in RS and PROMMTT. Of the remaining variables, 50% demonstrated less missingness in PROMMTT, 25% had less missingness in RS, and 25% were similar between studies. Missing percentages for MT prediction variables in PROMMTT ranged from 2.2% (heart rate) to 45% (respiratory rate). For variables missing >1%, study site was associated with missingness (all p≤0.021). Survival time predicted missingness for 50% of RS and 60% of PROMMTT variables. 
Complete case proportions for the MT models ranged from 41% to 88%. Complete case analysis and multiple imputation demonstrated similar correct classification results. Sensitivity analysis upper-lower bound ranges for the three MT models were 59-63%, 36-46%, and 46-58%. Prospective collection of ten-fold more variables with data quality assurance reduced overall missing data. Study site and patient survival were associated with missingness, suggesting that data were not missing completely at random and that complete case analysis may lead to biased results. Evaluating clinical prediction model accuracy may be misleading in the presence of missing data, especially with many predictor variables. The proposed sensitivity analysis estimating correct classification under upper (best case scenario) and lower (worst case scenario) bounds may be more informative than multiple imputation, which provided results similar to complete case analysis.
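The bounding idea in the abstract above can be sketched in a few lines: cases whose predictors are missing cannot be classified, so correct classification is bracketed by assuming those cases are all classified correctly (best case) or all misclassified (worst case). The data and function name below are hypothetical illustrations, not the study's actual models or values.

```python
def classification_bounds(predictions, truths):
    """predictions: 0/1 model outputs, with None where predictors are missing."""
    complete = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    n_correct = sum(p == t for p, t in complete)
    n_missing = len(predictions) - len(complete)
    n_total = len(predictions)
    lower = n_correct / n_total                # worst case: missing cases all wrong
    upper = (n_correct + n_missing) / n_total  # best case: missing cases all right
    return lower, upper

preds = [1, 0, None, 1, None, 0, 1, 0]
truth = [1, 0, 1,    0, 0,    0, 1, 1]
lower, upper = classification_bounds(preds, truth)  # (0.5, 0.75)
```

A wide lower-upper gap, as in the 36-46% model above, signals that missing data alone could swing the apparent accuracy substantially.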

Relevance:

80.00%

Publisher:

Abstract:

The Data Quality Campaign (DQC) has focused since 2005 on advocating for states to build robust state longitudinal data systems (SLDS). While states have made great progress on their data infrastructure, and should continue to emphasize this work, data systems alone will not improve outcomes. It is time for both the DQC and states to focus on building capacity to use the information that these systems are producing at every level, from classrooms to state houses. To impact system performance and student achievement, the ingrained culture must be replaced with one that focuses on data use for continuous improvement. The effective use of data to inform decisions, provide transparency, improve the measurement of outcomes, and fuel continuous improvement will not come to fruition unless there is a system-wide focus on building capacity around the collection, analysis, dissemination, and use of these data, including through research.

Relevance:

70.00%

Publisher:

Abstract:

Increasing amounts of clinical research data are collected by manual data entry into electronic source systems and directly from research subjects. For such manually entered source data, common data cleaning methods such as double data entry and post-entry identification and resolution of discrepancies are not feasible. However, the data accuracy achieved without these mechanisms may be lower than desired for a particular research use. We evaluated a heuristic usability method for its utility as a tool to independently and prospectively identify data collection form questions associated with data errors. The method evaluated had a promising sensitivity of 64% and a specificity of 67%. The method was used as described in the usability literature, with no further adaptation or specialization for predicting data errors. We conclude that usability evaluation methodology should be further investigated for use in data quality assurance.
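The 64%/67% figures above are standard screening-test quantities. A minimal sketch of how they are computed for a method that flags form questions as error-prone follows; the counts are hypothetical, chosen only to reproduce similar figures.

```python
def sens_spec(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)  # error-prone questions correctly flagged
    specificity = tn / (tn + fp)  # error-free questions correctly passed
    return sensitivity, specificity

# Hypothetical counts: 25 questions truly had errors, 30 did not.
sensitivity, specificity = sens_spec(tp=16, fn=9, tn=20, fp=10)
# sensitivity = 0.64, specificity ~ 0.67
```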

Relevance:

70.00%

Publisher:

Abstract:

Clinical Research Data Quality Literature Review and Pooled Analysis

We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods used. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 to 5,019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70–5,019 errors per 10,000 fields). Data entered and processed at healthcare facilities had error rates comparable to data processed at central data processing centers. Error rates for data processed with single entry in the presence of on-screen checks were comparable to those for double-entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist.

Defining Data Quality for Clinical Research: A Concept Analysis

Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality that builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization for data quality in clinical research. While the results are helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions.

Medical Record Abstractors' Perceptions of Factors Impacting the Accuracy of Abstracted Data

Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses. Factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractors' perceptions of these factors. The Delphi processes identified 9 factors not found in the literature and differed from the literature on 5 factors in the top 25%. The Delphi results refuted seven factors reported in the literature as impacting the quality of abstracted data. The results provide insight into, and indicate content validity for, a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and those of registry and quality improvement abstractors.

Distributed Cognition Artifacts on Clinical Research Data Collection Forms

Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms. We show that the cognitive load required for abstraction was high for 61% of the sampled data elements, and exceedingly so for 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping, or calculation. The representational analysis used here can be applied to identify data elements with high cognitive demands.
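Error rates in this literature are expressed per 10,000 fields, and a pooled analysis combines error and field counts across studies before dividing. A minimal sketch of that calculation follows; the per-study counts are hypothetical, not drawn from the 93 reviewed papers.

```python
def pooled_error_rate(studies):
    """studies: list of (errors_found, fields_inspected) pairs."""
    total_errors = sum(e for e, f in studies)
    total_fields = sum(f for e, f in studies)
    return total_errors * 10_000 / total_fields  # errors per 10,000 fields

# Two hypothetical studies: 12 errors in 30,000 fields, 88 in 20,000.
rate = pooled_error_rate([(12, 30_000), (88, 20_000)])  # 20.0 per 10,000
```

Pooling counts before dividing weights each study by the number of fields it inspected, rather than averaging the per-study rates.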

Relevance:

60.00%

Publisher:

Abstract:

The current study investigated data quality and estimated cancer incidence and mortality rates using data provided by the Pavlodar, Semipalatinsk, and Ust-Kamenogorsk Regional Cancer Registries of Kazakhstan for the period 1996–1998. Data quality was assessed using standard quality indicators, including internal database checks, the proportion of cases verified from death certificates only, the mortality:incidence ratio, data patterns, the proportion of cases with unknown primary site, and the proportion of cases with unknown age. Crude and age-adjusted incidence and mortality rates and 95% confidence intervals were calculated, by gender, for all cancers combined and for 28 specific cancer sites for each year of the study period. The five most frequent cancers were identified and described for every population. The results of the study provide the first simultaneous assessment of data quality and standardized incidence and mortality rates for Kazakh cancer registries.
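The age-adjusted rates mentioned above are produced by direct standardization: each age group's crude rate is weighted by a standard population's age distribution, so registries with different age structures become comparable. The sketch below shows the arithmetic only; the counts, person-years, and weights are illustrative, not the registries' data or an official standard population.

```python
def age_adjusted_rate(cases, person_years, std_weights):
    """Directly standardized rate per 100,000 person-years."""
    assert abs(sum(std_weights) - 1.0) < 1e-9  # weights must sum to 1
    adjusted = sum((c / py) * w
                   for c, py, w in zip(cases, person_years, std_weights))
    return adjusted * 100_000

rate = age_adjusted_rate(
    cases=[5, 40, 200],                      # young, middle, old age groups
    person_years=[100_000, 80_000, 40_000],
    std_weights=[0.5, 0.3, 0.2],             # standard population shares
)  # ~117.5 per 100,000
```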

Relevance:

60.00%

Publisher:

Abstract:

Racial disparities in prostate cancer are of public health concern. This dissertation used Texas Cancer Registry data to examine racial disparities in prostate cancer incidence for Texas over the period 1995–1998 and subsequent mortality through the year 2001. Incidence, mortality, treatment, and risk factors for survival were examined. It was found that non-Hispanic blacks have higher incidence and mortality from prostate cancer than non-Hispanic whites, and that Hispanics and non-Hispanic Asians are roughly similar to non-Hispanic whites in cancer survival. The incidence rates in non-Hispanic whites were spread more evenly across the age spectrum compared to other racial and ethnic groups. Non-Hispanic blacks were more often diagnosed at a higher stage of disease. All racial and ethnic groups in the Registry had lower death rates from non-prostate cancer causes than non-Hispanic whites. Age, stage and grade all conferred about the same relative risks of all-cause and prostate cancer survival within each racial and ethnic group examined. Radiation treatment for non-Hispanic blacks and Hispanics did not confer a relative risk of survival statistically significantly different from surgery, whereas it conferred greater survival in non-Hispanic whites. However, non-Hispanic blacks were statistically significantly less likely to have received radiation treatment, while controlling for age, stage, and grade. Among only those who died of prostate cancer, non-Hispanic blacks were less likely to have received radiation than were non-Hispanic whites, whereas among those who had not died, non-Hispanic blacks were more likely to have received this treatment. Hispanics were less likely to have received radiation whether they died from prostate cancer or not. All racial and ethnic groups were less likely than Non-Hispanic whites to have received surgery. Non-Hispanic blacks and Hispanics were more likely than non-Hispanic whites to have received hormonal treatment. 
The findings are interpreted with caution given the limitations of data quality and missing information. Results are discussed in the context of previous work, and public health implications are considered. This study confirms some earlier findings, identifies treatment as one possible source of disparity in prostate cancer mortality, and contributes to understanding the epidemiology of prostate cancer in Hispanics.

Relevance:

60.00%

Publisher:

Abstract:

Most studies of p53 function have focused on genes transactivated by p53. It is less widely appreciated that p53 can repress target genes to effect a particular cellular response. There is evidence that repression is important for p53-induced apoptosis and cell cycle arrest. It is less clear whether repression is important for other p53 functions. A comprehensive knowledge of the genes repressed by p53 and the cellular processes they affect is currently lacking. We used an expression profiling strategy to identify p53-responsive genes following adenoviral p53 gene transfer (Ad-p53) in PC3 prostate cancer cells. A total of 111 genes represented on the Affymetrix U133A microarray were repressed more than two-fold (p ≤ 0.05) by p53. An objective assessment of array data quality was carried out using RT-PCR of 20 randomly selected genes. We estimate a confirmation rate of >95.5% for the complete data set. Functional over-representation analysis was used to identify cellular processes potentially affected by p53-mediated repression. Cell cycle regulatory genes exhibited significant enrichment (p ≤ 5E-28) within the repressed targets. Several of these genes are repressed in a p53-dependent manner following DNA damage, but preceding cell cycle arrest. These findings identify novel p53-repressed targets and indicate that p53-induced cell cycle arrest is a function not only of the transactivation of cell cycle inhibitors (e.g., p21), but also of the repression of targets that act at each phase of the cell cycle. The mechanism of repression of this set of p53 targets was investigated. Most of the repressed genes identified here do not harbor consensus p53 DNA binding sites but do contain binding sites for E2F transcription factors. We demonstrate a role for E2F/RB repressor complexes in our system. Importantly, p53 is found at the promoter of CDC25A. CDC25A protein is rapidly degraded in response to DNA damage. Our group has demonstrated for the first time that CDC25A is also repressed at the transcript level by p53. This work has important implications for understanding the DNA damage cell cycle checkpoint response and the link between E2F/RB complexes and p53 in the repression of target genes.
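The two-fold repression criterion described above can be sketched as a simple filter: a gene is called repressed when expression drops more than two-fold after p53 transfer and the change is statistically significant (p ≤ 0.05). The expression values and p-values below are hypothetical, and the p-value is assumed to come from a separate per-gene test.

```python
def is_repressed(control_expr, p53_expr, p_value,
                 fold_cutoff=2.0, alpha=0.05):
    fold_change = control_expr / p53_expr  # >1 means lower expression with p53
    return fold_change > fold_cutoff and p_value <= alpha

# ~2.7-fold drop, significant -> repressed
assert is_repressed(control_expr=400.0, p53_expr=150.0, p_value=0.01)
# only a 1.5-fold drop -> not called, regardless of p-value
assert not is_repressed(control_expr=300.0, p53_expr=200.0, p_value=0.01)
```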

Relevance:

60.00%

Publisher:

Abstract:

Introduction: The Texas Occupational Safety & Health Surveillance System (TOSHSS) was created to collect, analyze, and interpret occupational injury and illness data in order to decrease the impact of occupational injuries within the state of Texas. This process evaluation was performed midway through the 4-year grant to assess the efficiency and effectiveness of the surveillance system's planning and implementation activities.[1]

Methods: Two evaluation guidelines published by the Centers for Disease Control and Prevention (CDC) were used as the theoretical models for this process evaluation. The Framework for Program Evaluation in Public Health was used to examine the planning and design of TOSHSS using logic models. The Framework for Evaluating Public Health Surveillance Systems was used to examine the implementation of approximately 60 surveillance activities, including uses of the data obtained from the surveillance system.

Results/Discussion: TOSHSS planning activities omitted the creation of a scientific advisory committee and specific activities designed to maintain contacts with stakeholders; proposed activities should be reassessed and aligned with ongoing performance measurement criteria, including the role of collaborators in helping the surveillance system achieve each proposed activity. TOSHSS implementation activities are substantially meeting expectations and received an overall score of 61% for all activities being performed. TOSHSS is considered a surveillance system that is simple, flexible, acceptable, fairly stable, timely, and moderately useful, with good data quality and a predictive value positive (PVP) of 86%.

Conclusions: Through its third year of implementation, TOSHSS has made a considerable contribution to the collection of occupational injury and illness information within the state of Texas. Implementation of the nine recommendations provided under this process evaluation is expected to increase the overall usefulness of the surveillance system and assist TDSHS in reducing occupational fatalities, injuries, and diseases within the state of Texas.

[1] Disclaimer: The Texas Occupational Safety and Health Surveillance System is supported by Grant/Cooperative Agreement Number U60 OH008473-01A1. The content of this evaluation is solely the responsibility of the authors and does not necessarily represent the official views of the Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health.
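The PVP cited above is one of the CDC framework's surveillance attributes: the proportion of reported cases that are true cases. A minimal sketch of the calculation follows; the counts are hypothetical, chosen only to reproduce an 86% figure.

```python
def predictive_value_positive(true_cases, total_reported):
    """Fraction of reported cases confirmed as true cases."""
    return true_cases / total_reported

# Hypothetical: 172 of 200 reported injuries confirmed on review.
pvp = predictive_value_positive(true_cases=172, total_reported=200)  # 0.86
```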