Digital Library

970 results for data quality issues

Managing data quality in a statistical agency

Relevance: 100.00%

Abstract:

Includes bibliography

Data quality in national statistics institutes

Relevance: 100.00%

Antimicrobial use in Swiss dairy farms: quantification and evaluation of data quality

Relevance: 100.00%

Abstract:

Data on antimicrobial use play a key role in the development of policies for the containment of antimicrobial resistance. On-farm data could provide a detailed overview of the antimicrobial use, but technical and methodological aspects of data collection and interpretation, as well as data quality need to be further assessed. The aims of this study were (1) to quantify antimicrobial use in the study population using different units of measurement and contrast the results obtained, (2) to evaluate data quality of farm records on antimicrobial use, and (3) to compare data quality of different recording systems. During 1 year, data on antimicrobial use were collected from 97 dairy farms. Antimicrobial consumption was quantified using: (1) the incidence density of antimicrobial treatments; (2) the weight of active substance; (3) the used daily dose and (4) the used course dose for antimicrobials for intestinal, intrauterine and systemic use; and (5) the used unit dose, for antimicrobials for intramammary use. Data quality was evaluated by describing completeness and accuracy of the recorded information, and by comparing farmers' and veterinarians' records. Relative consumption of antimicrobials depended on the unit of measurement: used doses reflected the treatment intensity better than weight of active substance. The use of antimicrobials classified as high priority was low, although under- and overdosing were frequently observed. Electronic recording systems allowed better traceability of the animals treated. Recording drug name or dosage often resulted in incomplete or inaccurate information. Veterinarians tended to record more drugs than farmers. The integration of veterinarian and farm data would improve data quality.

Data Quality Assessment of Ungated Flow Cytometry Data in High

Relevance: 100.00%

Abstract:

Background: The recent development of semi-automated techniques for staining and analyzing flow cytometry samples has presented new challenges. Quality control and quality assessment are critical when developing new high throughput technologies and their associated information services. Our experience suggests that significant bottlenecks remain in the development of high throughput flow cytometry methods for data analysis and display. In particular, data quality control and quality assessment are crucial steps in processing and analyzing high throughput flow cytometry data. Methods: We propose a variety of graphical exploratory data analytic tools for exploring ungated flow cytometry data. We have implemented a number of specialized functions and methods in the Bioconductor package rflowcyt. We demonstrate the use of these approaches by investigating two independent sets of high throughput flow cytometry data. Results: We found that graphical representations can reveal substantial non-biological differences in samples. Empirical Cumulative Distribution Function and summary scatterplots were especially useful in the rapid identification of problems not identified by manual review. Conclusions: Graphical exploratory data analytic tools are a quick and useful means of assessing data quality. We propose that the described visualizations should be used as quality assessment tools and, where possible, be used for quality control.
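
As a rough illustration of the ECDF-based check described in this abstract, the following Python sketch computes an empirical cumulative distribution function for each sample and flags samples whose curve lies far from the pointwise median curve. It is not the rflowcyt implementation: the function names, the simulated well data and the 0.2 distance threshold are all assumptions made for this example.

```python
import bisect
import random
import statistics

def ecdf(values, grid):
    """Empirical CDF of `values`, evaluated at each point of `grid`."""
    ordered = sorted(values)
    return [bisect.bisect_right(ordered, x) / len(ordered) for x in grid]

def flag_outlier_samples(samples, threshold=0.2):
    """Names of samples whose ECDF deviates from the median ECDF by more than
    `threshold` (maximum vertical distance). The threshold is hypothetical."""
    grid = sorted(x for values in samples.values() for x in values)
    curves = {name: ecdf(values, grid) for name, values in samples.items()}
    # The pointwise median across samples serves as the reference curve.
    reference = [statistics.median(points) for points in zip(*curves.values())]
    return [
        name
        for name, curve in curves.items()
        if max(abs(c - r) for c, r in zip(curve, reference)) > threshold
    ]

# Simulated fluorescence intensities; well_A4 carries an artificial shift.
rng = random.Random(0)
samples = {
    "well_A1": [rng.gauss(100, 10) for _ in range(500)],
    "well_A2": [rng.gauss(100, 10) for _ in range(500)],
    "well_A3": [rng.gauss(100, 10) for _ in range(500)],
    "well_A4": [rng.gauss(130, 10) for _ in range(500)],
}
flagged = flag_outlier_samples(samples)
print(flagged)
```

Using the median curve as the reference, instead of pooling all samples, keeps a single shifted sample from distorting the baseline against which the others are judged.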

Electronic medical record systems, data quality and loss to follow-up: survey of antiretroviral therapy programmes in resource-limited settings

Relevance: 100.00%

Abstract:

OBJECTIVE: To describe the electronic medical databases used in antiretroviral therapy (ART) programmes in lower-income countries and assess the measures such programmes employ to maintain and improve data quality and reduce the loss of patients to follow-up. METHODS: In 15 countries of Africa, South America and Asia, a survey was conducted from December 2006 to February 2007 on the use of electronic medical record systems in ART programmes. Patients enrolled in the sites at the time of the survey but not seen during the previous 12 months were considered lost to follow-up. The quality of the data was assessed by computing the percentage of missing key variables (age, sex, clinical stage of HIV infection, CD4+ lymphocyte count and year of ART initiation). Associations between site characteristics (such as number of staff members dedicated to data management), measures to reduce loss to follow-up (such as the presence of staff dedicated to tracing patients) and data quality and loss to follow-up were analysed using multivariate logit models. FINDINGS: Twenty-one sites that together provided ART to 50 060 patients were included (median number of patients per site: 1000; interquartile range, IQR: 72-19 320). Eighteen sites (86%) used an electronic database for medical record-keeping; 15 (83%) such sites relied on software intended for personal or small business use. The median percentage of missing data for key variables per site was 10.9% (IQR: 2.0-18.9%) and declined with training in data management (odds ratio, OR: 0.58; 95% confidence interval, CI: 0.37-0.90) and weekly hours spent by a clerk on the database per 100 patients on ART (OR: 0.95; 95% CI: 0.90-0.99). About 10 weekly hours per 100 patients on ART were required to reduce missing data for key variables to below 10%. The median percentage of patients lost to follow-up 1 year after starting ART was 8.5% (IQR: 4.2-19.7%). 
Strategies to reduce loss to follow-up included outreach teams, community-based organizations and checking death registry data. Implementation of all three strategies substantially reduced losses to follow-up (OR: 0.17; 95% CI: 0.15-0.20). CONCLUSION: The quality of the data collected and the retention of patients in ART treatment programmes are unsatisfactory for many sites involved in the scale-up of ART in resource-limited settings, mainly because of insufficient staff trained to manage data and trace patients lost to follow-up.
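
The data-quality measure used in this survey, the percentage of missing key variables, can be sketched in a few lines of Python. This is one plausible reading of the measure, not the study's own code: the field names, the use of `None` for a missing value and the two example records are assumptions.

```python
# Hypothetical field names for the five key variables named in the abstract.
KEY_VARIABLES = ("age", "sex", "clinical_stage", "cd4_count", "art_start_year")

def percent_missing(records, key_variables=KEY_VARIABLES):
    """Percentage of key-variable values that are missing across all records.

    A value counts as missing when the field is absent or set to None.
    """
    if not records:
        raise ValueError("at least one record is required")
    total = len(records) * len(key_variables)
    missing = sum(
        1 for record in records for key in key_variables if record.get(key) is None
    )
    return 100.0 * missing / total

# Invented example records for a single site.
site_records = [
    {"age": 34, "sex": "F", "clinical_stage": 3, "cd4_count": 180, "art_start_year": 2005},
    {"age": 41, "sex": "M", "clinical_stage": None, "cd4_count": None, "art_start_year": 2006},
]
print(percent_missing(site_records))  # 2 of 10 values missing -> 20.0
```

Computing the same figure separately for each site would give the per-site percentages that the abstract summarizes with a median and interquartile range.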

Analysis of in-cylinder pressure transducer data quality utilizing a SIDI turbocharged engine

Relevance: 100.00%

Abstract:

In-cylinder pressure transducers have been used for decades to record combustion pressure inside a running engine. However, due to the extreme operating environment, transducer design and installation must be considered in order to minimize measurement error. One such error is caused by thermal shock, where the pressure transducer experiences a high heat flux that can distort the pressure transducer diaphragm and also change the crystal sensitivity. This research focused on investigating the effects of thermal shock on in-cylinder pressure transducer data quality using a 2.0L, four-cylinder, spark-ignited, direct-injected, turbo-charged GM engine. Cylinder four was modified with five ports to accommodate pressure transducers of different manufacturers. They included an AVL GH14D, an AVL GH15D, a Kistler 6125C, and a Kistler 6054AR. The GH14D, GH15D, and 6054AR were M5 size transducers. The 6125C was a larger, 6.2mm transducer. Note that both of the AVL pressure transducers utilized a PH03 flame arrestor. Sweeps of ignition timing (spark sweep), engine speed, and engine load were performed to study the effects of thermal shock on each pressure transducer. The project consisted of two distinct phases which included experimental engine testing as well as simulation using a commercially available software package. A comparison was performed to characterize the quality of the data between the actual cylinder pressure and the simulated results. This comparison was valuable because the simulation results did not include thermal shock effects. All three sets of tests showed the peak cylinder pressure was basically unaffected by thermal shock. Comparison of the experimental data with the simulated results showed very good correlation. 
The spark sweep was performed at 1300 RPM and 3.3 bar NMEP and showed that the differences between the simulated results (no thermal shock) and the experimental data for the indicated mean effective pressure (IMEP) and the pumping mean effective pressure (PMEP) were significantly less than the published accuracies. All transducers had a percent difference of less than 0.038% for IMEP and less than 0.32% for PMEP. Kistler and AVL publish that the accuracy of their pressure transducers is within plus or minus 1% for the IMEP (AVL 2011; Kistler 2011). In addition, the difference in average exhaust absolute pressure between the simulated results and experimental data was the greatest for the two Kistler pressure transducers. The location and lack of flame arrestor are believed to be the cause of the increased error. For the engine speed sweep, the torque output was held constant at 203 Nm (150 ft-lbf) from 1500 to 4000 RPM. The difference in IMEP was less than 0.01% and the difference in PMEP was less than 1%, except for the AVL GH14D, for which it was 5%, and the AVL GH15DK, for which it was 2.25%. A noticeable error in PMEP appeared as the load increased during the engine speed sweeps, as expected. The load sweep was conducted at 2000 RPM over a range of NMEP from 1.1 to 14 bar. The differences in IMEP values were less than 0.08%, while the differences in PMEP values were below 1%, except for the AVL GH14D, for which it was 1.8%, and the AVL GH15DK, for which it was 1.25%. In-cylinder pressure transducer data quality was effectively analyzed using a combination of experimental data and simulation results. Several criteria can be used to investigate the impact of thermal shock on data quality as well as determine the best location and thermal protection for various transducers.

Water Quality Issues in the Williston Basin, Montana and North Dakota

Relevance: 100.00%

Abstract:

The Williston basin has been producing oil and gas since the 1950s, but production has increased recently due to use of hydraulic fracturing and horizontal drilling technologies to extract oil and gas from the Bakken and Three Forks Formations. As concern about effects of energy production on surface-water and groundwater quality increases, the characterization of current water-quality conditions is highly important to the scientific community, resource managers, industry, and general public.

Data quality of animal health records on Swiss dairy farms

Relevance: 100.00%

Abstract:

High-quality data are essential for veterinary surveillance systems, and their quality can be affected by the source and the method of collection. Data recorded on farms could provide detailed information about the health of a population of animals, but the accuracy of the data recorded by farmers is uncertain. The aims of this study were to evaluate the quality of the data on animal health recorded on 97 Swiss dairy farms, to compare the quality of the data obtained by different recording systems, and to obtain baseline data on the health of the animals on the 97 farms. Data on animal health were collected from the farms for a year. Their quality was evaluated by assessing the completeness and accuracy of the recorded information, and by comparing farmers' and veterinarians' records. The quality of the data provided by the farmers was satisfactory, although electronic recording systems made it easier to trace the animals treated. The farmers tended to record more health-related events than the veterinarians, although this varied with the event considered, and some events were recorded only by the veterinarians. The farmers' attitude towards data collection was positive. Factors such as motivation, feedback, training, and simplicity and standardisation of data collection were important because they influenced the quality of the data.

Data Accuracy in Medical Record Abstraction

Relevance: 100.00%

Abstract:

Clinical Research Data Quality Literature Review and Pooled Analysis: We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 errors per 10,000 files to 5019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70–5019 errors per 10,000 fields). Data entered and processed at healthcare facilities had comparable error rates to data processed at central data processing centers. Error rates for data processed with single entry in the presence of on-screen checks were comparable to double entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist.

Defining Data Quality for Clinical Research: A Concept Analysis: Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This current lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality, which builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization for data quality in clinical research. While the results are helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions.

Medical Record Abstractor’s Perceptions of Factors Impacting the Accuracy of Abstracted Data: Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses. Factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractors' perceptions about the factors. The Delphi process identified 9 factors that were not found in the literature, and differed with the literature by 5 factors in the top 25%. The Delphi results refuted seven factors reported in the literature as impacting the quality of abstracted data. The results provide insight into and indicate content validity of a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and registry and quality improvement abstractors.

Distributed Cognition Artifacts on Clinical Research Data Collection Forms: Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms. We show that the cognitive load required for abstraction in 61% of the sampled data elements was high, exceedingly so in 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping or calculation. The representational analysis used here can be used to identify data elements with high cognitive demands.
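
The pooled analysis in this entry reports accuracy as errors per 10,000 fields. A minimal Python sketch of that normalization follows; the function name and the audit counts are invented for illustration and are not taken from the reviewed papers.

```python
def errors_per_10k(n_errors, n_fields):
    """Error rate normalized to errors per 10,000 fields."""
    if n_fields <= 0:
        raise ValueError("n_fields must be positive")
    if n_errors < 0:
        raise ValueError("n_errors must be non-negative")
    return 10_000 * n_errors / n_fields

# Hypothetical audit: 12 discrepancies found among 4,000 abstracted fields.
rate = errors_per_10k(12, 4_000)
print(rate)        # 30.0 errors per 10,000 fields
print(rate / 100)  # 0.3, the same rate expressed as a percentage of fields
```

Expressed this way, the upper end of the range quoted in the abstract, 5019 errors per 10,000 fields, corresponds to roughly half of all fields being in error.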

Data quality in trauma transfusion studies and the impact of missing data on predicting massive transfusion

Relevance: 100.00%

Abstract:

Maximizing data quality may be especially difficult in trauma-related clinical research. Strategies are needed to improve data quality and assess the impact of data quality on clinical predictive models. This study had two objectives. The first was to compare missing data between two multi-center trauma transfusion studies: a retrospective study (RS) using medical chart data with minimal data quality review and the PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study with standardized quality assurance. The second objective was to assess the impact of missing data on clinical prediction algorithms by evaluating blood transfusion prediction models using PROMMTT data. RS (2005-06) and PROMMTT (2009-10) investigated trauma patients receiving ≥ 1 unit of red blood cells (RBC) from ten Level I trauma centers. Missing data were compared for 33 variables collected in both studies using mixed effects logistic regression (including random intercepts for study site). Massive transfusion (MT) patients received ≥ 10 RBC units within 24h of admission. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation based on the multivariate normal distribution. A sensitivity analysis for missing data was conducted to estimate the upper and lower bounds of correct classification using assumptions about missing data under best and worst case scenarios. Most variables (17/33=52%) had <1% missing data in RS and PROMMTT. Of the remaining variables, 50% demonstrated less missingness in PROMMTT, 25% had less missingness in RS, and 25% were similar between studies. Missing percentages for MT prediction variables in PROMMTT ranged from 2.2% (heart rate) to 45% (respiratory rate). For variables missing >1%, study site was associated with missingness (all p≤0.021). Survival time predicted missingness for 50% of RS and 60% of PROMMTT variables. 
Complete case proportions for the MT models ranged from 41% to 88%. Complete case analysis and multiple imputation demonstrated similar correct classification results. Sensitivity analysis upper-lower bound ranges for the three MT models were 59-63%, 36-46%, and 46-58%. Prospective collection of ten-fold more variables with data quality assurance reduced overall missing data. Study site and patient survival were associated with missingness, suggesting that data were not missing completely at random, and complete case analysis may lead to biased results. Evaluating clinical prediction model accuracy may be misleading in the presence of missing data, especially with many predictor variables. The proposed sensitivity analysis estimating correct classification under upper (best case scenario)/lower (worst case scenario) bounds may be more informative than multiple imputation, which provided results similar to complete case analysis.
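
One simple way to picture the best-case/worst-case idea in this abstract is sketched below in Python. It assumes that patients with a missing predictor cannot be scored by the model, and it counts them as all correctly classified (best case) or all misclassified (worst case). This is an illustrative simplification, not necessarily the procedure used in the study, and the counts are invented.

```python
def classification_bounds(n_correct, n_complete, n_missing):
    """Return (worst_case, complete_case, best_case) proportions correctly classified.

    n_correct  : complete cases that the model classified correctly
    n_complete : patients with every predictor observed
    n_missing  : patients who could not be scored because a predictor was missing
    """
    if not 0 <= n_correct <= n_complete or n_complete == 0 or n_missing < 0:
        raise ValueError("inconsistent counts")
    total = n_complete + n_missing
    worst_case = n_correct / total                # unscored patients all counted as wrong
    best_case = (n_correct + n_missing) / total   # unscored patients all counted as right
    complete_case = n_correct / n_complete        # unscored patients ignored
    return worst_case, complete_case, best_case

# Hypothetical counts: 80 complete cases (60 classified correctly), 20 unscored.
worst, complete, best = classification_bounds(n_correct=60, n_complete=80, n_missing=20)
print(f"worst={worst:.2f} complete-case={complete:.2f} best={best:.2f}")
```

The width of the interval between the worst and best case grows with the share of unscored patients, which is why a complete-case figure on its own can give a misleading impression of model accuracy.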