909 results for data accuracy


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Clinical Research Data Quality Literature Review and Pooled Analysis
We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 errors per 10,000 files to 5019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70–5019 errors per 10,000 fields). Data entered and processed at healthcare facilities had error rates comparable to those of data processed at central data processing centers. Error rates for data processed with single entry in the presence of on-screen checks were comparable to those of double-entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist.

Defining Data Quality for Clinical Research: A Concept Analysis
Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality, which builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization for data quality in clinical research. While the results are helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions.

Medical Record Abstractor’s Perceptions of Factors Impacting the Accuracy of Abstracted Data
Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses. Factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractors’ perceptions of these factors. The Delphi process identified 9 factors that were not found in the literature, and differed from the literature by 5 factors in the top 25%. The Delphi results refuted seven factors reported in the literature as impacting the quality of abstracted data. The results provide insight into and indicate content validity of a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and those of registry and quality improvement abstractors.

Distributed Cognition Artifacts on Clinical Research Data Collection Forms
Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms. We show that the cognitive load required for abstraction in 61% of the sampled data elements was high, exceedingly so in 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping or calculation. The representational analysis used here can be used to identify data elements with high cognitive demands.
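
The "errors per 10,000 fields" unit used in the pooled analysis above can be illustrated with a minimal sketch. The function names and the audit figures are hypothetical, and pooling by summing errors and fields across studies is an assumption made here for illustration; the abstract does not state how the authors pooled their data.

```python
from typing import Iterable, Tuple


def errors_per_10k(errors: int, fields_inspected: int) -> float:
    """Normalize an error count to errors per 10,000 fields inspected."""
    if fields_inspected <= 0:
        raise ValueError("fields_inspected must be positive")
    return errors * 10_000 / fields_inspected


def pooled_errors_per_10k(studies: Iterable[Tuple[int, int]]) -> float:
    """Pool (errors, fields) pairs by summing both before normalizing.

    Assumption: simple summation, which weights each study by its size.
    """
    studies = list(studies)
    total_errors = sum(e for e, _ in studies)
    total_fields = sum(f for _, f in studies)
    return errors_per_10k(total_errors, total_fields)


# Hypothetical audits: (errors found, fields inspected).
audits = [(35, 5_000), (2, 10_000)]
single_rate = errors_per_10k(*audits[0])
pooled_rate = pooled_errors_per_10k(audits)
```

Summing before normalizing keeps a small study with an extreme rate from dominating the pooled figure, which is one reason this weighting is a common default.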

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The increasing number of extreme rainfall events, combined with high population density and the imperviousness of the land surface, makes urban areas particularly vulnerable to pluvial flooding. In order to design and manage cities that can deal with this issue, the reconstruction of weather phenomena is essential. Among the most promising data sources are the observational networks of private sensors managed by citizens (crowdsourcing). The number of these personal weather stations is consistently increasing, and their spatial distribution roughly follows population density. Precisely for this reason, they are well suited to this detailed study on the modelling of pluvial flooding in urban environments. The uncertainty associated with these measurements of precipitation is still a matter of research. In order to characterise the accuracy and precision of the crowdsourced data, we carried out exploratory data analyses. A comparison between Netatmo hourly precipitation amounts and observations of the same quantity from weather stations managed by national weather services is presented. The crowdsourced stations show very good skill in rain detection but tend to underestimate the reference value. In detail, the accuracy and precision of crowdsourced data change as precipitation increases, with the spread improving towards the extreme values. Then, the ability of this kind of observation to improve the prediction of pluvial flooding is tested. To this end, the simplified raster-based inundation model incorporated in the Saferplaces web platform is used for simulating pluvial flooding. Different precipitation fields have been produced and tested as input to the model. Two different case studies are analysed over the most densely populated Norwegian city: Oslo. The crowdsourced weather station observations, bias-corrected (i.e. increased by 25%), showed very good skill in detecting flooded areas.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

BACKGROUND: Co-morbidity information derived from administrative data needs to be validated to allow its regular use. We assessed the evolution in the accuracy of coding for Charlson and Elixhauser co-morbidities at three time points over a 5-year period, following the introduction of International Classification of Diseases, 10th Revision (ICD-10) coding of hospital discharges. METHODS: Cross-sectional time trend evaluation study of coding accuracy using hospital chart data of 3,499 randomly selected patients who were discharged in 1999, 2001 and 2003 from two teaching hospitals and one non-teaching hospital in Switzerland. We measured sensitivity, positive predictive value and kappa values for agreement between administrative data coded with ICD-10 and chart data as the 'reference standard' for recording 36 co-morbidities. RESULTS: For the 17 Charlson co-morbidities, the sensitivity, given as median (min-max), was 36.5% (17.4-64.1) in 1999, 42.5% (22.2-64.6) in 2001 and 42.8% (8.4-75.6) in 2003. For the 29 Elixhauser co-morbidities, the sensitivity was 34.2% (1.9-64.1) in 1999, 38.6% (10.5-66.5) in 2001 and 41.6% (5.1-76.5) in 2003. Between 1999 and 2003, sensitivity estimates increased for 30 co-morbidities and decreased for 6 co-morbidities. The increase in sensitivity was statistically significant for six conditions and the decrease significant for one. Kappa values increased for 29 co-morbidities and decreased for seven. CONCLUSIONS: Accuracy of administrative data in recording clinical conditions improved slightly between 1999 and 2003. These findings are of relevance to all jurisdictions introducing new coding systems, because they demonstrate a phenomenon of improved administrative data accuracy that may relate to a coding 'learning curve' with the new coding system.
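
The three agreement measures named in the METHODS above (sensitivity, positive predictive value and kappa) can all be derived from a 2x2 table comparing administrative codes against the chart reference standard. The following is a minimal sketch using the standard textbook definitions, with Cohen's kappa assumed as the kappa statistic; the class name and the counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # condition coded in administrative data and present in chart
    fp: int  # condition coded in administrative data but absent from chart
    fn: int  # condition present in chart but not coded
    tn: int  # condition absent from both sources


def sensitivity(t: TwoByTwo) -> float:
    """Share of chart-documented conditions that were also coded."""
    return t.tp / (t.tp + t.fn)


def positive_predictive_value(t: TwoByTwo) -> float:
    """Share of coded conditions that the chart confirms."""
    return t.tp / (t.tp + t.fp)


def cohen_kappa(t: TwoByTwo) -> float:
    """Chance-corrected agreement between the two sources."""
    n = t.tp + t.fp + t.fn + t.tn
    observed = (t.tp + t.tn) / n
    # Expected agreement if coding and chart were independent.
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for one co-morbidity in 1,000 discharges.
table = TwoByTwo(tp=40, fp=10, fn=60, tn=890)
sens = sensitivity(table)
ppv = positive_predictive_value(table)
kappa = cohen_kappa(table)
```

With these made-up counts the coding misses most chart-documented cases (low sensitivity) while most coded cases are confirmed (high positive predictive value), the same qualitative pattern the RESULTS describe.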

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This study has two main objectives: first, to demonstrate the importance of data accuracy to business success, and second, to create a tool for observing and improving the accuracy of an ERP system's production master data. A sub-objective is to explain the client company's need for the new tool and what it means for the company. The theoretical part of this thesis focuses on the importance of data accuracy in decision making and its implications for business success. The basics of manufacturing planning are also introduced in order to explain the key vocabulary. The empirical part introduces the client company and its need for this study. A new master data report is introduced, and finally, the analysis of the report and the actions based on the results of that analysis are explained. The main results of this thesis are the identification of the interdependence between data accuracy and business success, and a report for continuous master data improvement in the client company's ERP system.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Organizations across the globe are creating and distributing products that include open source software. To ensure compliance with the open source licenses, each company needs to evaluate exactly what open source licenses and copyrights are included - resulting in duplicated effort and redundancy. This talk will provide an overview of a new Software Package Data Exchange (SPDX) specification. This specification will provide a common format to share information about the open source licenses and copyrights that are included in any software package, with the goal of saving time and improving data accuracy. This talk will review the progress of the initiative; discuss the benefits to organizations using open source and share information on how you can contribute.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The accurate measurement of a vehicle’s velocity is an essential feature in adaptive vehicle activated sign systems. Since the velocities of the vehicles are acquired from a continuous wave Doppler radar, the data collection becomes challenging. Data accuracy is sensitive to the calibration of the radar on the road. However, clear methodologies for in-field calibration have not been carefully established. The signs are often installed by subjective judgment, which results in measurement errors. This paper develops a calibration method based on mining the data collected and matching individual vehicles travelling between two radars. The data were prepared in two ways: cleaning and reconstructing. The results showed that the proposed correction factor derived from the cleaned data corresponded well with the experimental factor determined on site. In addition, this proposed factor showed superior performance to the one derived from the reconstructed data.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Increasing amounts of clinical research data are collected by manual data entry into electronic source systems and directly from research subjects. For these manually entered source data, common methods of data cleaning, such as post-entry identification and resolution of discrepancies and double data entry, are not feasible. However, data error rates achieved without these mechanisms may be higher than desired for a particular research use. We evaluated a heuristic usability method for utility as a tool to independently and prospectively identify data collection form questions associated with data errors. The method evaluated had a promising sensitivity of 64% and a specificity of 67%. The method was used as described in the literature for usability, with no further adaptations or specialization for predicting data errors. We conclude that usability evaluation methodology should be further investigated for use in data quality assurance.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Over the last decade, several hundred seals have been equipped with conductivity-temperature-depth sensors in the Southern Ocean for both biological and physical oceanographic studies. A calibrated collection of seal-derived hydrographic data is now available, consisting of more than 165,000 profiles. The value of these hydrographic data within the existing Southern Ocean observing system is demonstrated herein by conducting two state estimation experiments, differing only in the use or not of seal data to constrain the system. Including seal-derived data substantially modifies the estimated surface mixed-layer properties and circulation patterns within and south of the Antarctic Circumpolar Current. Agreement with independent satellite observations of sea ice concentration is improved, especially along the East Antarctic shelf. Instrumented animals efficiently reduce a critical observational gap, and their contribution to monitoring polar climate variability will continue to grow as data accuracy and spatial coverage increase.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The engineering of solar power applications, such as photovoltaic (PV) energy or thermal solar energy, requires knowledge of the solar resource available to the solar energy system. This solar resource is generally obtained from datasets, and is measured either by ground stations, through the use of pyranometers, or by satellites. Solar irradiation data are generally not free, and their cost can be high, in particular if high temporal resolution, such as hourly data, is required. In this work, we present an alternative method to provide free hourly global solar tilted irradiation data for the whole European territory through a web platform. The method that we have developed generates solar irradiation data from a combination of clear-sky simulations and weather conditions data. The results are publicly available for free through Soweda, a Web interface. To our knowledge, this is the first time that hourly solar irradiation data are made available online, in real time, and for free, to the public. These data are not suitable for applications that require high data accuracy, but can be very useful for other applications that only require a rough estimate of solar irradiation.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Large amounts of information can be overwhelming and costly to process, especially when transmitting data over a network. A typical modern Geographical Information System (GIS) brings all types of data together based on the geographic component of the data and provides simple point-and-click query capabilities as well as complex analysis tools. Querying a Geographical Information System, however, can be prohibitively expensive due to the large amounts of data which may need to be processed. Since the use of GIS technology has grown dramatically in the past few years, there is now more than ever a need to provide users with the fastest and least expensive query capabilities, especially since an estimated 80% of data stored in corporate databases has a geographical component. However, not every application requires the same high-quality data for its processing. In this paper we address the issues of reducing the cost and response time of GIS queries by preaggregating data, at the expense of data accuracy and precision. We present computational issues in the generation of multi-level resolutions of spatial data and show that the problem of finding the best approximation for a given region and a real-valued function on this region, under a predictable error, is in general NP-complete.
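
The trade-off described above, coarser pre-aggregated data in exchange for a bounded loss of accuracy, can be illustrated with a minimal sketch. It uses plain block averaging on a small raster, which is a simple heuristic chosen here for illustration and not the approximation scheme studied in the paper; the function names and the grid values are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def aggregate(grid: Grid, k: int) -> Grid:
    """Average non-overlapping k x k blocks to get a coarser resolution level."""
    rows, cols = len(grid), len(grid[0])
    if rows % k or cols % k:
        raise ValueError("grid dimensions must be multiples of k")
    coarse = []
    for r in range(0, rows, k):
        coarse_row = []
        for c in range(0, cols, k):
            block = [grid[r + i][c + j] for i in range(k) for j in range(k)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def max_abs_error(grid: Grid, coarse: Grid, k: int) -> float:
    """Worst-case error when a cell is answered from its block average."""
    return max(
        abs(grid[r][c] - coarse[r // k][c // k])
        for r in range(len(grid))
        for c in range(len(grid[0]))
    )


# Hypothetical 4 x 4 raster of a real-valued attribute.
fine = [
    [1, 1, 5, 7],
    [1, 1, 7, 5],
    [2, 4, 0, 0],
    [4, 2, 0, 0],
]
level_1 = aggregate(fine, 2)               # 2 x 2 summary, 4x fewer cells
error_bound = max_abs_error(fine, level_1, 2)
```

A query engine could store several such levels and answer from the coarsest one whose recorded error bound is acceptable to the application, falling back to the fine grid otherwise.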

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Incomplete reporting has been identified as a major source of avoidable waste in biomedical research.
Essential information is often not provided in study reports, impeding the identification, critical
appraisal, and replication of studies. To improve the quality of reporting of diagnostic accuracy
studies, the Standards for Reporting Diagnostic Accuracy (STARD) statement was developed. Here
we present STARD 2015, an updated list of 30 essential items that should be included in every
report of a diagnostic accuracy study. This update incorporates recent evidence about sources of
bias and variability in diagnostic accuracy and is intended to facilitate the use of STARD. As such,
STARD 2015 may help to improve completeness and transparency in reporting of diagnostic accuracy
studies.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

We present a new set of oscillator strengths for 142 Fe II lines in the wavelength range 4000-8000 Å. Our gf-values are both accurate and precise, because each multiplet was globally normalized using laboratory data (accuracy), while the relative gf-values of individual lines within a given multiplet were obtained from theoretical calculations (precision). Our line list was tested with the Sun and high-resolution (R ≈ 10^5), high-S/N (≈ 700-900) Keck+HIRES spectra of the metal-poor stars HD 148816 and HD 140283, for which line-to-line scatter (σ) in the iron abundances from Fe II lines as low as 0.03, 0.04, and 0.05 dex are found, respectively. For these three stars the standard error in the mean iron abundance from Fe II lines is negligible (σ_mean ≤ 0.01 dex). The mean solar iron abundance obtained using our gf-values and different model atmospheres is A(Fe) = 7.45 (σ = 0.02).

Relevance:

60.00% 60.00%

Publisher:

Abstract:

7th Mediterranean Conference on Information Systems, MCIS 2012, Guimaraes, Portugal, September 8-10, 2012, Proceedings Series: Lecture Notes in Business Information Processing, Vol. 129

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Thesis submitted to the Instituto Superior de Estatística e Gestão de Informação da Universidade Nova de Lisboa in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in Information Management – Geographic Information Systems

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This is a study of a state-of-the-art implementation of a new computer integrated testing (CIT) facility within a company that designs and manufactures transport refrigeration systems. The aim was to use state-of-the-art hardware, software and planning procedures in the design and implementation of three CIT systems. Typical CIT system components include data acquisition (DAQ) equipment, application and analysis software, communication devices, computer-based instrumentation and computer technology. It is shown that the introduction of computer technology into the area of testing can have a major effect on issues such as efficiency, flexibility, data accuracy, test quality and data integrity. The findings reaffirm that computer integration continues to benefit organisations and that, with more recent advances in computer technology, communication methods and software capabilities, less expensive, more sophisticated test solutions are now possible. This allows more organisations to benefit from the many advantages associated with CIT. Examples of computer integration test set-ups and the benefits associated with computer integration are discussed.