984 results for Data Quality


80.00% 80.00%



National estimates of the prevalence of child abuse-related injuries are obtained from a variety of sectors, including welfare, justice, and health, resulting in inconsistent estimates across sectors. The International Classification of Diseases (ICD) is used as the international standard for categorising health data and aggregating data for statistical purposes, though there has been limited validation of the quality, completeness or concordance of these data with other sectors. This research study examined the quality of documentation and coding of child abuse recorded in hospital records in Queensland and the concordance of these data with child welfare records. A retrospective medical record review was used to examine the clinical documentation of over 1000 hospitalised injured children from 20 hospitals in Queensland. A data linkage methodology was used to link these records with records in the child welfare database. Cases were sampled from three sub-groups according to the presence of target ICD codes: definite abuse, possible abuse, and unintentional injury. Fewer than 2% of cases coded as unintentional were recoded after review as possible abuse, and only 5% of cases coded as possible abuse were reclassified as unintentional, though there was greater variation in the classification of cases as definite abuse compared to possible abuse. Concordance of health data with child welfare data varied across patient subgroups. This study will inform the development of strategies to improve the quality, consistency and concordance of information between health and welfare agencies to ensure adequate system responses to children at risk of abuse.


80.00% 80.00%



This thesis describes the development of a robust and novel prototype to address the data quality problems that relate to the dimension of outlier data. It thoroughly investigates the problems associated with detecting outlier data and with assessing and determining the severity of the outlier problem, and it proposes alternative techniques based on granule mining to significantly improve the effectiveness of mining and assessing outlier data.


80.00% 80.00%



Field robots often rely on laser range finders (LRFs) to detect obstacles and navigate autonomously. Despite recent progress in sensing technology and perception algorithms, adverse environmental conditions, such as the presence of smoke, remain a challenging issue for these robots. In this paper, we investigate the possibility of improving laser-based perception applications by anticipating situations in which laser data are affected by smoke, using supervised learning and state-of-the-art visual image quality analysis. We propose to train a k-nearest-neighbour (kNN) classifier to recognise situations where a laser scan is likely to be affected by smoke, based on visual data quality features. This method is evaluated experimentally using a mobile robot equipped with LRFs and a visual camera. The strengths and limitations of the technique are identified and discussed, and we show that the method is beneficial if conservative decisions are the most appropriate.
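The kNN idea in the abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the two features (hypothetical image contrast and sharpness scores in [0, 1]), the toy training set, the Euclidean distance, and the `threshold` parameter used to make the decision more or less conservative.

```python
import math

# Hypothetical training samples: ((contrast, sharpness), label),
# where label 1 means the matching laser scan was affected by smoke.
TRAIN = [
    ((0.90, 0.80), 0),
    ((0.85, 0.75), 0),
    ((0.80, 0.90), 0),
    ((0.30, 0.20), 1),
    ((0.25, 0.35), 1),
    ((0.40, 0.30), 1),
]


def knn_smoke_fraction(train, query, k=3):
    """Return the fraction of the k nearest training samples labelled as smoke."""
    by_distance = sorted(train, key=lambda sample: math.dist(sample[0], query))
    nearest = by_distance[:k]
    return sum(label for _, label in nearest) / len(nearest)


def is_scan_suspect(train, query, k=3, threshold=0.5):
    """Flag a scan as likely smoke-affected.

    Lowering `threshold` flags more scans, which is the conservative choice
    when discarding a good scan is cheaper than trusting a bad one.
    """
    return knn_smoke_fraction(train, query, k) >= threshold


if __name__ == "__main__":
    print(is_scan_suspect(TRAIN, (0.35, 0.25)))  # low-quality image features
    print(is_scan_suspect(TRAIN, (0.88, 0.80)))  # high-quality image features
```

In this sketch the conservative behaviour mentioned in the abstract corresponds to choosing a low `threshold`, so that a single smoke-labelled neighbour is enough to distrust the scan.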


80.00% 80.00%



Health Information Exchange (HIE) is an interesting phenomenon. It is a patient-centric health and/or medical information management scenario enhanced by the integration of Information and Communication Technologies (ICT). While health information systems are repositioning complex system directives in the wake of the ‘big data’ paradigm, extracting quality information is challenging. This talk will share ICT-enabled healthcare scenarios with big data analytics. In addition, research and development regarding big data analytics will be discussed, including current trends in using these technologies for health care services and the critical research challenges of extracting quality information to improve quality of life.


80.00% 80.00%



This program of research linked police and health data collections to investigate the potential benefits for road safety in terms of enhancing the quality of data. This research has important implications for road safety because, although police-collected data have historically underpinned efforts in the area, it is known that many road crashes are not reported to police and that these data lack specific injury severity information. This research shows that data linkage provides a more accurate quantification of the severity and prevalence of road crash injuries, which is essential for prioritising funding, targeting interventions, and estimating the burden and cost of road trauma.


80.00% 80.00%



The Gaia space mission is a major project for the European astronomical community. As challenging as it is, the processing and analysis of the huge data-flow incoming from Gaia is the subject of thorough study and preparatory work by the DPAC (Data Processing and Analysis Consortium), in charge of all aspects of the Gaia data reduction. This PhD Thesis was carried out in the framework of the DPAC, within the team based in Bologna. The task of the Bologna team is to define the calibration model and to build a grid of spectro-photometric standard stars (SPSS) suitable for the absolute flux calibration of the Gaia G-band photometry and the BP/RP spectrophotometry. Such a flux calibration can be performed by repeatedly observing each SPSS during the life-time of the Gaia mission and by comparing the observed Gaia spectra to the spectra obtained by our ground-based observations. Due to both the different observing sites involved and the huge number of frames expected (≃100000), it is essential to maintain the maximum homogeneity in data quality, acquisition and treatment. Particular care has to be taken to test the capabilities of each telescope/instrument combination (through the “instrument familiarization plan”) and to devise methods to keep under control, and eventually to correct for, the typical instrumental effects that can affect the high precision required for the Gaia SPSS grid (a few % with respect to Vega). I contributed to the ground-based survey of Gaia SPSS in many respects: the observations, the instrument familiarization plan, the data reduction and analysis activities (both photometry and spectroscopy), and the maintenance of the data archives. However, the field I was personally responsible for was photometry, and in particular relative photometry for the production of short-term light curves.
In this context I defined and tested a semi-automated pipeline which allows for the pre-reduction of imaging SPSS data and the production of aperture photometry catalogues ready to be used for further analysis. A series of semi-automated quality control criteria are included in the pipeline at various levels, from pre-reduction, to aperture photometry, to light-curve production and analysis.


80.00% 80.00%



The Data Quality Campaign (DQC) has been focused since 2005 on advocating for states to build robust state longitudinal data systems (SLDS). While states have made great progress in their data infrastructure, and should continue to emphasize this work, data systems alone will not improve outcomes. It is time for both DQC and states to focus on building capacity to use the information that these systems are producing at every level – from classrooms to state houses. To impact system performance and student achievement, the ingrained culture must be replaced with one that focuses on data use for continuous improvement. The effective use of data to inform decisions, provide transparency, improve the measurement of outcomes, and fuel continuous improvement will not come to fruition unless there is a system-wide focus on building capacity around the collection, analysis, dissemination, and use of these data, including through research.


80.00% 80.00%



As the number of data sources publishing their data on the Web of Data grows, we are experiencing an immense growth of the Linked Open Data cloud. The lack of control over the published sources, which could be untrustworthy or unreliable, along with their dynamic nature, which often invalidates links and causes conflicts or other discrepancies, could lead to poor quality data. In order to judge data quality, a number of quality indicators have been proposed, coupled with quality metrics that quantify the “quality level” of a dataset. In addition, some approaches address how to improve the quality of datasets through a repair process that corrects invalidities caused by constraint violations by either removing or adding triples. In this paper we argue that provenance is a critical factor that should be taken into account during repairs to ensure that the most reliable data is kept. Based on this idea, we propose quality metrics that take into account provenance and evaluate their applicability as repair guidelines in a particular data fusion setting.
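A provenance-guided repair of the kind argued for above might be sketched as follows. This is only an illustration under stated assumptions, not the metrics proposed in the paper: the constraint is a simple functional-property rule (at most one value per subject and predicate), each triple carries a hypothetical source identifier, and `source_reliability` is an invented score table standing in for a provenance-based quality metric.

```python
from collections import defaultdict


def repair_functional(triples, source_reliability, functional_predicates):
    """Resolve functional-property violations by removing triples.

    `triples` is a list of (subject, predicate, object, source) tuples.
    For each (subject, predicate) pair with a functional predicate, only the
    triple from the most reliable source is kept. Sources missing from
    `source_reliability` are treated as reliability 0.0 (an assumption).
    """
    groups = defaultdict(list)
    kept = []
    for triple in triples:
        subject, predicate, _obj, _source = triple
        if predicate in functional_predicates:
            groups[(subject, predicate)].append(triple)
        else:
            kept.append(triple)  # no constraint applies, keep as-is
    for candidates in groups.values():
        best = max(candidates, key=lambda t: source_reliability.get(t[3], 0.0))
        kept.append(best)
    return kept


if __name__ == "__main__":
    # Hypothetical data: two sources disagree about a birth year.
    triples = [
        ("ex:alice", "ex:birthYear", "1980", "sourceA"),
        ("ex:alice", "ex:birthYear", "1981", "sourceB"),
        ("ex:alice", "ex:knows", "ex:bob", "sourceB"),
    ]
    reliability = {"sourceA": 0.9, "sourceB": 0.4}
    for triple in repair_functional(triples, reliability, {"ex:birthYear"}):
        print(triple)
```

The design choice illustrated is that the repair decides which conflicting triple to remove by looking at where each triple came from, rather than removing one arbitrarily.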


80.00% 80.00%



Background and purpose: Survey data quality is a combination of the representativeness of the sample, the accuracy and precision of measurements, and data processing and management, with several subcomponents in each. The purpose of this paper is to show how, in the final risk factor surveys of the WHO MONICA Project, information on data quality was obtained, quantified, and used in the analysis. Methods and results: In the WHO MONICA (Multinational MONItoring of trends and determinants in CArdiovascular disease) Project, the information about the data quality components was documented in retrospective quality assessment reports. On the basis of the documented information and the survey data, the quality of each data component was assessed and summarized using quality scores. The quality scores were used in sensitivity testing of the results both by excluding populations with low quality scores and by weighting the data by their quality scores. Conclusions: Detailed documentation of all survey procedures with standardized protocols, training, and quality control are steps towards optimizing data quality. Quantifying data quality is a further step. Methods used in the WHO MONICA Project could be adopted to improve quality in other health surveys.
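The two sensitivity checks described above (excluding low-scoring populations and weighting by quality score) can be sketched in a few lines. The numbers below are invented for illustration and are not MONICA data; the score scale, the `min_score` cut-off and the use of a simple pooled mean are likewise assumptions rather than the project's actual procedure.

```python
def pooled_mean(populations, min_score=None, weight_by_quality=False):
    """Pool population estimates, optionally filtering or weighting by quality.

    `populations` is a list of dicts with keys "estimate" and "quality".
    Populations with quality below `min_score` are excluded when it is given.
    With `weight_by_quality`, each estimate is weighted by its quality score.
    """
    selected = [
        p for p in populations
        if min_score is None or p["quality"] >= min_score
    ]
    weights = [p["quality"] if weight_by_quality else 1.0 for p in selected]
    total = sum(weights)
    if total == 0:
        raise ValueError("no population contributes a non-zero weight")
    return sum(w * p["estimate"] for w, p in zip(weights, selected)) / total


# Hypothetical populations with made-up estimates and quality scores.
POPULATIONS = [
    {"name": "A", "estimate": 10.0, "quality": 2.0},
    {"name": "B", "estimate": 20.0, "quality": 1.0},
    {"name": "C", "estimate": 30.0, "quality": 0.0},
]

if __name__ == "__main__":
    print(pooled_mean(POPULATIONS))                          # all populations
    print(pooled_mean(POPULATIONS, min_score=1.0))           # exclusion
    print(pooled_mean(POPULATIONS, weight_by_quality=True))  # weighting
```

If the three results differ substantially, as they do in this toy example, the pooled estimate is sensitive to data quality and should be reported with that caveat.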


80.00% 80.00%



The evaluation of geospatial data quality and trustworthiness presents a major challenge to geospatial data users when making a dataset selection decision. The research presented here therefore focused on defining and developing a GEO label – a decision support mechanism to assist data users in efficient and effective geospatial dataset selection on the basis of quality, trustworthiness and fitness for use. This thesis thus presents six phases of research and development conducted to: (1) identify the informational aspects upon which users rely when assessing geospatial dataset quality and trustworthiness; (2) elicit initial user views on the GEO label role in supporting dataset comparison and selection; (3) evaluate prototype label visualisations; (4) develop a Web service to support GEO label generation; (5) develop a prototype GEO label-based dataset discovery and intercomparison decision support tool; and (6) evaluate the prototype tool in a controlled human-subject study. The results of the studies revealed, and subsequently confirmed, eight geospatial data informational aspects that were considered important by users when evaluating geospatial dataset quality and trustworthiness, namely: producer information, producer comments, lineage information, compliance with standards, quantitative quality information, user feedback, expert reviews, and citations information. Following an iterative user-centred design (UCD) approach, it was established that the GEO label should visually summarise availability and allow interrogation of these key informational aspects. A Web service was developed to support generation of dynamic GEO label representations and integrated into a number of real-world GIS applications. The service was also utilised in the development of the GEO LINC tool – a GEO label-based dataset discovery and intercomparison decision support tool.
The results of the final evaluation study indicated that (a) the GEO label effectively communicates the availability of dataset quality and trustworthiness information and (b) GEO LINC successfully facilitates ‘at a glance’ dataset intercomparison and fitness for purpose-based dataset selection.


70.00% 70.00%



Objective: To examine the sources of coding discrepancy for injury morbidity data and explore the implications of these sources for injury surveillance. Method: An on-site medical record review and recoding study was conducted for 4373 injury-related hospital admissions across Australia. Codes from the original dataset were compared to the recoded data to explore the reliability of coded data and sources of discrepancy. Results: The most common reason for differences in coding overall was assigning the case to a different external cause category, with 8.5% assigned to a different category. Differences in the specificity of codes assigned within a category accounted for 7.8% of coder difference. Differences in intent assignment accounted for 3.7% of the differences in code assignment. Conclusions: In the situation where 8 percent of cases are misclassified by major category, the setting of injury targets on the basis of extent of burden is a somewhat blunt instrument. Monitoring the effect of prevention programs aimed at reducing risk factors is not possible in datasets with this level of misclassification error in injury cause subcategories. Future research is needed to build the evidence base around the quality and utility of the ICD classification system and its application for injury surveillance in the hospital environment.


70.00% 70.00%



Participatory sensing enables the collection, processing, dissemination and analysis of environmental sensory data by ordinary citizens through mobile devices. Researchers have recognized the potential of participatory sensing and have attempted to apply it in many areas. However, participants may submit low-quality, misleading, inaccurate, or even malicious data. Therefore, finding a way to improve data quality has become a significant issue. This study proposes using reputation management to classify the gathered data and provide useful information for campaign organizers and data analysts to facilitate their decisions.


70.00% 70.00%



The health system is one sector dealing with a deluge of complex data. Many healthcare organisations struggle to utilise these volumes of health data effectively and efficiently. In addition, many healthcare organisations still have stand-alone systems that are not integrated for management of information and decision-making. This shows that there is a need for an effective system to capture, collate and distribute this health data. Implementing the data warehouse concept in healthcare is therefore potentially one of the solutions for integrating health data. Data warehousing has been used to support business intelligence and decision-making in many other sectors, such as the engineering, defence and retail sectors. The research problem to be addressed is, "how can data warehousing assist the decision-making process in healthcare". To address this problem the researcher has narrowed the investigation to a cardiac surgery unit. This research used the cardiac surgery unit at The Prince Charles Hospital (TPCH) as the case study. The cardiac surgery unit at TPCH uses a stand-alone database of patient clinical data, which supports clinical audit, service management and research functions. However, much of the time, the interaction between the cardiac surgery unit information system and other units is minimal. There is a limited and basic two-way interaction with other clinical and administrative databases at TPCH which support decision-making processes. The aims of this research are to investigate what decision-making issues are faced by healthcare professionals with the current information systems and how decision-making might be improved within this healthcare setting by implementing an aligned data warehouse model or models.
As a part of the research the researcher will propose and develop a suitable data warehouse prototype based on the cardiac surgery unit needs and integrating the Intensive Care Unit database, Clinical Costing unit database (Transition II) and Quality and Safety unit database [electronic discharge summary (e-DS)]. The goal is to improve the current decision-making processes. The main objectives of this research are to improve access to integrated clinical and financial data, providing potentially better information for decision-making for both improved from the questionnaire and by referring to the literature, the results indicate a centralised data warehouse model for the cardiac surgery unit at this stage. A centralised data warehouse model addresses current needs and can also be upgraded to an enterprise-wide warehouse model or federated data warehouse model as discussed in the many consulted publications. The data warehouse prototype was developed using SAS enterprise data integration studio 4.2 and the data was analysed using SAS enterprise edition 4.3. In the final stage, the data warehouse prototype was evaluated by collecting feedback from the end users. This was achieved by using output created from the data warehouse prototype as examples of the data desired and possible in a data warehouse environment. According to the feedback collected from the end users, implementation of a data warehouse was seen to be a useful tool to inform management options, provide a more complete representation of factors related to a decision scenario and potentially reduce information product development time. However, many constraints exist in this research.
Examples include technical issues such as data incompatibilities and integration of the cardiac surgery database and e-DS database servers, Queensland Health information restrictions (Queensland Health information-related policies, patient data confidentiality and ethics requirements), limited availability of support from IT technical staff, and time restrictions. These factors have influenced the process for the warehouse model development, necessitating an incremental approach. This highlights the presence of many practical barriers to data warehousing and integration at the clinical service level. Limitations included the use of a small convenience sample of survey respondents, and a single site case report study design. As mentioned previously, the proposed data warehouse is a prototype and was developed using only four database repositories. Despite this constraint, the research demonstrates that by implementing a data warehouse at the service level, decision-making is supported and data quality issues related to access and availability can be reduced, providing many benefits. Output reports produced from the data warehouse prototype demonstrated usefulness for the improvement of decision-making in the management of clinical services, and quality and safety monitoring for better clinical care. However, in the future, the centralised model selected can be upgraded to an enterprise-wide architecture by integrating with additional hospital units’ databases.


70.00% 70.00%



A substantial body of literature exists identifying factors contributing to under-performing Enterprise Resource Planning systems (ERPs), including poor communication, lack of executive support and user dissatisfaction (Calisir et al., 2009). Of particular interest is Momoh et al.’s (2010) recent review identifying poor data quality (DQ) as one of nine critical factors associated with ERP failure. DQ is central to ERP operating processes, ERP-facilitated decision-making and inter-organizational cooperation (Batini et al., 2009). Crucial in ERP contexts is that the integrated, automated, process-driven nature of ERP data flows can amplify DQ issues, compounding minor errors as they flow through the system (Haug et al., 2009; Xu et al., 2002). However, despite the growing appreciation of the importance of DQ in determining ERP success, there is a lack of research addressing the relationship between stakeholders’ requirements and perceptions of ERP DQ, perceived data utility and the impact of users’ treatment of data on ERP outcomes.