953 resultados para DATA QUALITY
Resumo:
The health system is one sector dealing with a deluge of complex data. Many healthcare organisations struggle to utilise these volumes of health data effectively and efficiently. Also, there are many healthcare organisations, which still have stand-alone systems, not integrated for management of information and decision-making. This shows, there is a need for an effective system to capture, collate and distribute this health data. Therefore, implementing the data warehouse concept in healthcare is potentially one of the solutions to integrate health data. Data warehousing has been used to support business intelligence and decision-making in many other sectors such as the engineering, defence and retail sectors. The research problem that is going to be addressed is, "how can data warehousing assist the decision-making process in healthcare". To address this problem the researcher has narrowed an investigation focusing on a cardiac surgery unit. This research used the cardiac surgery unit at the Prince Charles Hospital (TPCH) as the case study. The cardiac surgery unit at TPCH uses a stand-alone database of patient clinical data, which supports clinical audit, service management and research functions. However, much of the time, the interaction between the cardiac surgery unit information system with other units is minimal. There is a limited and basic two-way interaction with other clinical and administrative databases at TPCH which support decision-making processes. The aims of this research are to investigate what decision-making issues are faced by the healthcare professionals with the current information systems and how decision-making might be improved within this healthcare setting by implementing an aligned data warehouse model or models. As a part of the research the researcher will propose and develop a suitable data warehouse prototype based on the cardiac surgery unit needs and integrating the Intensive Care Unit database, Clinical Costing unit database (Transition II) and Quality and Safety unit database [electronic discharge summary (e-DS)]. The goal is to improve the current decision-making processes. The main objectives of this research are to improve access to integrated clinical and financial data, providing potentially better information for decision-making for both improved from the questionnaire and by referring to the literature, the results indicate a centralised data warehouse model for the cardiac surgery unit at this stage. A centralised data warehouse model addresses current needs and can also be upgraded to an enterprise wide warehouse model or federated data warehouse model as discussed in the many consulted publications. The data warehouse prototype was able to be developed using SAS enterprise data integration studio 4.2 and the data was analysed using SAS enterprise edition 4.3. In the final stage, the data warehouse prototype was evaluated by collecting feedback from the end users. This was achieved by using output created from the data warehouse prototype as examples of the data desired and possible in a data warehouse environment. According to the feedback collected from the end users, implementation of a data warehouse was seen to be a useful tool to inform management options, provide a more complete representation of factors related to a decision scenario and potentially reduce information product development time. However, there are many constraints exist in this research. For example the technical issues such as data incompatibilities, integration of the cardiac surgery database and e-DS database servers and also, Queensland Health information restrictions (Queensland Health information related policies, patient data confidentiality and ethics requirements), limited availability of support from IT technical staff and time restrictions. These factors have influenced the process for the warehouse model development, necessitating an incremental approach. This highlights the presence of many practical barriers to data warehousing and integration at the clinical service level. Limitations included the use of a small convenience sample of survey respondents, and a single site case report study design. As mentioned previously, the proposed data warehouse is a prototype and was developed using only four database repositories. Despite this constraint, the research demonstrates that by implementing a data warehouse at the service level, decision-making is supported and data quality issues related to access and availability can be reduced, providing many benefits. Output reports produced from the data warehouse prototype demonstrated usefulness for the improvement of decision-making in the management of clinical services, and quality and safety monitoring for better clinical care. However, in the future, the centralised model selected can be upgraded to an enterprise wide architecture by integrating with additional hospital units’ databases.
Resumo:
A substantial body of literature exists identifying factors contributing to under-performing Enterprise Resource Planning systems (ERPs), including poor communication, lack of executive support and user dissatisfaction (Calisir et al., 2009). Of particular interest is Momoh et al.’s (2010) recent review identifying poor data quality (DQ) as one of nine critical factors associated with ERP failure. DQ is central to ERP operating processes, ERP facilitated decision-making and inter-organizational cooperation (Batini et al., 2009). Crucial in ERP contexts is that the integrated, automated, process driven nature of ERP data flows can amplify DQ issues, compounding minor errors as they flow through the system (Haug et al., 2009; Xu et al., 2002). However, the growing appreciation of the importance of DQ in determining ERP success lacks research addressing the relationship between stakeholders’ requirements and perceptions of ERP DQ, perceived data utility and the impact of users’ treatment of data on ERP outcomes.
Resumo:
Quality oriented management systems and methods have become the dominant business and governance paradigm. From this perspective, satisfying customers’ expectations by supplying reliable, good quality products and services is the key factor for an organization and even government. During recent decades, Statistical Quality Control (SQC) methods have been developed as the technical core of quality management and continuous improvement philosophy and now are being applied widely to improve the quality of products and services in industrial and business sectors. Recently SQC tools, in particular quality control charts, have been used in healthcare surveillance. In some cases, these tools have been modified and developed to better suit the health sector characteristics and needs. It seems that some of the work in the healthcare area has evolved independently of the development of industrial statistical process control methods. Therefore analysing and comparing paradigms and the characteristics of quality control charts and techniques across the different sectors presents some opportunities for transferring knowledge and future development in each sectors. Meanwhile considering capabilities of Bayesian approach particularly Bayesian hierarchical models and computational techniques in which all uncertainty are expressed as a structure of probability, facilitates decision making and cost-effectiveness analyses. Therefore, this research investigates the use of quality improvement cycle in a health vii setting using clinical data from a hospital. The need of clinical data for monitoring purposes is investigated in two aspects. A framework and appropriate tools from the industrial context are proposed and applied to evaluate and improve data quality in available datasets and data flow; then a data capturing algorithm using Bayesian decision making methods is developed to determine economical sample size for statistical analyses within the quality improvement cycle. Following ensuring clinical data quality, some characteristics of control charts in the health context including the necessity of monitoring attribute data and correlated quality characteristics are considered. To this end, multivariate control charts from an industrial context are adapted to monitor radiation delivered to patients undergoing diagnostic coronary angiogram and various risk-adjusted control charts are constructed and investigated in monitoring binary outcomes of clinical interventions as well as postintervention survival time. Meanwhile, adoption of a Bayesian approach is proposed as a new framework in estimation of change point following control chart’s signal. This estimate aims to facilitate root causes efforts in quality improvement cycle since it cuts the search for the potential causes of detected changes to a tighter time-frame prior to the signal. This approach enables us to obtain highly informative estimates for change point parameters since probability distribution based results are obtained. Using Bayesian hierarchical models and Markov chain Monte Carlo computational methods, Bayesian estimators of the time and the magnitude of various change scenarios including step change, linear trend and multiple change in a Poisson process are developed and investigated. The benefits of change point investigation is revisited and promoted in monitoring hospital outcomes where the developed Bayesian estimator reports the true time of the shifts, compared to priori known causes, detected by control charts in monitoring rate of excess usage of blood products and major adverse events during and after cardiac surgery in a local hospital. The development of the Bayesian change point estimators are then followed in a healthcare surveillances for processes in which pre-intervention characteristics of patients are viii affecting the outcomes. In this setting, at first, the Bayesian estimator is extended to capture the patient mix, covariates, through risk models underlying risk-adjusted control charts. Variations of the estimator are developed to estimate the true time of step changes and linear trends in odds ratio of intensive care unit outcomes in a local hospital. Secondly, the Bayesian estimator is extended to identify the time of a shift in mean survival time after a clinical intervention which is being monitored by riskadjusted survival time control charts. In this context, the survival time after a clinical intervention is also affected by patient mix and the survival function is constructed using survival prediction model. The simulation study undertaken in each research component and obtained results highly recommend the developed Bayesian estimators as a strong alternative in change point estimation within quality improvement cycle in healthcare surveillances as well as industrial and business contexts. The superiority of the proposed Bayesian framework and estimators are enhanced when probability quantification, flexibility and generalizability of the developed model are also considered. The empirical results and simulations indicate that the Bayesian estimators are a strong alternative in change point estimation within quality improvement cycle in healthcare surveillances. The superiority of the proposed Bayesian framework and estimators are enhanced when probability quantification, flexibility and generalizability of the developed model are also considered. The advantages of the Bayesian approach seen in general context of quality control may also be extended in the industrial and business domains where quality monitoring was initially developed.
Resumo:
Open Educational Resources (OER) are teaching, learning and research materials that have been released under an open licence that permits online access and re-use by others. The 2012 Paris OER Declaration encourages the open licensing of educational materials produced with public funds. Digital data and data sets produced as a result of scientific and non-scientific research are an increasingly important category of educational materials. This paper discusses the legal challenges presented when publicly funded research data is made available as OER, arising from intellectual property rights, confidentiality and information privacy laws, and the lack of a legal duty to ensure data quality. If these legal challenges are not understood, addressed and effectively managed, they may impede and restrict access to and re-use of research data. This paper identifies some of the legal challenges that need to be addressed and describes 10 proposed best practices which are recommended for adoption to so that publicly funded research data can be made available for access and re-use as OER.
Resumo:
Citizen Science projects are initiatives in which members of the general public participate in scientific research projects and perform or manage research-related tasks such as data collection and/or data annotation. Citizen Science is technologically possible and scientifically significant. However, as the gathered information is from the crowd, the data quality is always hard to manage. There are many ways to manage data quality, and reputation management is one of the common approaches. In recent year, many research teams have deployed many audio or image sensors in natural environment in order to monitor the status of animals or plants. The collected data will be analysed by ecologists. However, as the amount of collected data is exceedingly huge and the number of ecologists is very limited, it is impossible for scientists to manually analyse all these data. The functions of existing automated tools to process the data are still very limited and the results are still not very accurate. Therefore, researchers have turned to recruiting general citizens who are interested in helping scientific research to do the pre-processing tasks such as species tagging. Although research teams can save time and money by recruiting general citizens to volunteer their time and skills to help data analysis, the reliability of contributed data varies a lot. Therefore, this research aims to investigate techniques to enhance the reliability of data contributed by general citizens in scientific research projects especially for acoustic sensing projects. In particular, we aim to investigate how to use reputation management to enhance data reliability. Reputation systems have been used to solve the uncertainty and improve data quality in many marketing and E-Commerce domains. The commercial organizations which have chosen to embrace the reputation management and implement the technology have gained many benefits. Data quality issues are significant to the domain of Citizen Science due to the quantity and diversity of people and devices involved. However, research on reputation management in this area is relatively new. We therefore start our investigation by examining existing reputation systems in different domains. Then we design novel reputation management approaches for Citizen Science projects to categorise participants and data. We have investigated some critical elements which may influence data reliability in Citizen Science projects. These elements include personal information such as location and education and performance information such as the ability to recognise certain bird calls. The designed reputation framework is evaluated by a series of experiments involving many participants for collecting and interpreting data, in particular, environmental acoustic data. Our research in exploring the advantages of reputation management in Citizen Science (or crowdsourcing in general) will help increase awareness among organizations that are unacquainted with its potential benefits.
Resumo:
In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. Towards this, Webput utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods for the purpose of formulating web search queries that are capable of effectively retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is also proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, for improving the accuracy and efficiency of WebPut. Experiments based on several real-world data collections demonstrate that WebPut outperforms existing approaches.
Resumo:
The Council of Australian Governments (COAG) in 2003 gave in-principle approval to a best-practice report recommending a holistic approach to managing natural disasters in Australia incorporating a move from a traditional response-centric approach to a greater focus on mitigation, recovery and resilience with community well-being at the core. Since that time, there have been a range of complementary developments that have supported the COAG recommended approach. Developments have been administrative, legislative and technological, both, in reaction to the COAG initiative and resulting from regular natural disasters. This paper reviews the characteristics of the spatial data that is becoming increasingly available at Federal, state and regional jurisdictions with respect to their being fit for the purpose for disaster planning and mitigation and strengthening community resilience. In particular, Queensland foundation spatial data, which is increasingly accessible by the public under the provisions of the Right to Information Act 2009, Information Privacy Act 2009, and recent open data reform initiatives are evaluated. The Fitzroy River catchment and floodplain is used as a case study for the review undertaken. The catchment covers an area of 142,545 km2, the largest river catchment flowing to the eastern coast of Australia. The Fitzroy River basin experienced extensive flooding during the 2010–2011 Queensland floods. The basin is an area of important economic, environmental and heritage values and contains significant infrastructure critical for the mining and agricultural sectors, the two most important economic sectors for Queensland State. Consequently, the spatial datasets for this area play a critical role in disaster management and for protecting critical infrastructure essential for economic and community well-being. The foundation spatial datasets are assessed for disaster planning and mitigation purposes using data quality indicators such as resolution, accuracy, integrity, validity and audit trail.
Resumo:
Numerous statements and declarations have been made over recent decades in support of open access to research data. The growing recognition of the importance of open access to research data has been accompanied by calls on public research funding agencies and universities to facilitate better access to publicly funded research data so that it can be re-used and redistributed as public goods. International and inter-governmental bodies such as the ICSU/CODATA, the OECD and the European Union are strong supporters of open access to and re-use of publicly funded research data. This thesis focuses on the research data created by university researchers in Malaysian public universities whose research activities are funded by the Federal Government of Malaysia. Malaysia, like many countries, has not yet formulated a policy on open access to and re-use of publicly funded research data. Therefore, the aim of this thesis is to develop a policy to support the objective of enabling open access to and re-use of publicly funded research data in Malaysian public universities. Policy development is very important if the objective of enabling open access to and re-use of publicly funded research data is to be successfully achieved. In developing the policy, this thesis identifies a myriad of legal impediments arising from intellectual property rights, confidentiality, privacy and national security laws, novelty requirements in patent law and lack of a legal duty to ensure data quality. Legal impediments such as these have the effect of restricting, obstructing, hindering or slowing down the objective of enabling open access to and re-use of publicly funded research data. A key focus in the formulation of the policy was the need to resolve the various legal impediments that have been identified. This thesis analyses the existing policies and guidelines of Malaysian public universities to ascertain to what extent the legal impediments have been resolved. An international perspective is adopted by making a comparative analysis of the policies of public research funding agencies and universities in the United Kingdom, the United States and Australia to understand how they have dealt with the identified legal impediments. These countries have led the way in introducing policies which support open access to and re-use of publicly funded research data. As well as proposing a policy supporting open access to and re-use of publicly funded research data in Malaysian public universities, this thesis provides procedures for the implementation of the policy and guidelines for addressing the legal impediments to open access and re-use.
Resumo:
Objective To synthesise recent research on the use of machine learning approaches to mining textual injury surveillance data. Design Systematic review. Data sources The electronic databases which were searched included PubMed, Cinahl, Medline, Google Scholar, and Proquest. The bibliography of all relevant articles was examined and associated articles were identified using a snowballing technique. Selection criteria For inclusion, articles were required to meet the following criteria: (a) used a health-related database, (b) focused on injury-related cases, AND used machine learning approaches to analyse textual data. Methods The papers identified through the search were screened resulting in 16 papers selected for review. Articles were reviewed to describe the databases and methodology used, the strength and limitations of different techniques, and quality assurance approaches used. Due to heterogeneity between studies meta-analysis was not performed. Results Occupational injuries were the focus of half of the machine learning studies and the most common methods described were Bayesian probability or Bayesian network based methods to either predict injury categories or extract common injury scenarios. Models were evaluated through either comparison with gold standard data or content expert evaluation or statistical measures of quality. Machine learning was found to provide high precision and accuracy when predicting a small number of categories, was valuable for visualisation of injury patterns and prediction of future outcomes. However, difficulties related to generalizability, source data quality, complexity of models and integration of content and technical knowledge were discussed. Conclusions The use of narrative text for injury surveillance has grown in popularity, complexity and quality over recent years. With advances in data mining techniques, increased capacity for analysis of large databases, and involvement of computer scientists in the injury prevention field, along with more comprehensive use and description of quality assurance methods in text mining approaches, it is likely that we will see a continued growth and advancement in knowledge of text mining in the injury field.
Resumo:
Background Historically, the paper hand-held record (PHR) has been used for sharing information between hospital clinicians, general practitioners and pregnant women in a maternity shared-care environment. Recently in alignment with a National e-health agenda, an electronic health record (EHR) was introduced at an Australian tertiary maternity service to replace the PHR for collection and transfer of data. The aim of this study was to examine and compare the completeness of clinical data collected in a PHR and an EHR. Methods We undertook a comparative cohort design study to determine differences in completeness between data collected from maternity records in two phases. Phase 1 data were collected from the PHR and Phase 2 data from the EHR. Records were compared for completeness of best practice variables collected The primary outcome was the presence of best practice variables and the secondary outcomes were the differences in individual variables between the records. Results Ninety-four percent of paper medical charts were available in Phase 1 and 100% of records from an obstetric database in Phase 2. No PHR or EHR had a complete dataset of best practice variables. The variables with significant improvement in completeness of data documented in the EHR, compared with the PHR, were urine culture, glucose tolerance test, nuchal screening, morphology scans, folic acid advice, tobacco smoking, illicit drug assessment and domestic violence assessment (p = 0.001). Additionally the documentation of immunisations (pertussis, hepatitis B, varicella, fluvax) were markedly improved in the EHR (p = 0.001). The variables of blood pressure, proteinuria, blood group, antibody, rubella and syphilis status, showed no significant differences in completeness of recording. Conclusion This is the first paper to report on the comparison of clinical data collected on a PHR and EHR in a maternity shared-care setting. The use of an EHR demonstrated significant improvements to the collection of best practice variables. Additionally, the data in an EHR were more available to relevant clinical staff with the appropriate log-in and more easily retrieved than from the PHR. This study contributes to an under-researched area of determining data quality collected in patient records.
Resumo:
Decision-making is such an integral aspect in health care routine that the ability to make the right decisions at crucial moments can lead to patient health improvements. Evidence-based practice, the paradigm used to make those informed decisions, relies on the use of current best evidence from systematic research such as randomized controlled trials. Limitations of the outcomes from randomized controlled trials (RCT), such as “quantity” and “quality” of evidence generated, has lowered healthcare professionals’ confidence in using EBP. An alternate paradigm of Practice-Based Evidence has evolved with the key being evidence drawn from practice settings. Through the use of health information technology, electronic health records (EHR) capture relevant clinical practice “evidence”. A data-driven approach is proposed to capitalize on the benefits of EHR. The issues of data privacy, security and integrity are diminished by an information accountability concept. Data warehouse architecture completes the data-driven approach by integrating health data from multi-source systems, unique within the healthcare environment.
Resumo:
Big Datasets are endemic, but they are often notoriously difficult to analyse because of their size, heterogeneity, history and quality. The purpose of this paper is to open a discourse on the use of modern experimental design methods to analyse Big Data in order to answer particular questions of interest. By appealing to a range of examples, it is suggested that this perspective on Big Data modelling and analysis has wide generality and advantageous inferential and computational properties. In particular, the principled experimental design approach is shown to provide a flexible framework for analysis that, for certain classes of objectives and utility functions, delivers near equivalent answers compared with analyses of the full dataset under a controlled error rate. It can also provide a formalised method for iterative parameter estimation, model checking, identification of data gaps and evaluation of data quality. Finally, it has the potential to add value to other Big Data sampling algorithms, in particular divide-and-conquer strategies, by determining efficient sub-samples.
Resumo:
The explosive growth in the development of Traditional Chinese Medicine (TCM) has resulted in the continued increase in clinical and research data. The lack of standardised terminology, flaws in data quality planning and management of TCM informatics are preventing clinical decision-making, drug discovery and education. This paper argues that the introduction of data warehousing technologies to enhance the effectiveness and durability in TCM is paramount. To showcase the role of data warehousing in the improvement of TCM, this paper presents a practical model for data warehousing with detailed explanation, which is based on the structured electronic records, for TCM clinical researches and medical knowledge discovery.
Resumo:
Over recent years, the focus in road safety has shifted towards a greater understanding of road crash serious injuries in addition to fatalities. Police reported crash data are often the primary source of crash information; however, the definition of serious injury within these data is not consistent across jurisdictions and may not be accurately operationalised. This study examined the linkage of police-reported road crash data with hospital data to explore the potential for linked data to enhance the quantification of serious injury. Data from the Queensland Road Crash Database (QRCD), the Queensland Hospital Admitted Patients Data Collection (QHAPDC), Emergency Department Information System (EDIS), and the Queensland Injury Surveillance Unit (QISU) for the year 2009 were linked. Nine different estimates of serious road crash injury were produced. Results showed that there was a large amount of variation in the estimates of the number and profile of serious road crash injuries depending on the definition or measure used. The results also showed that as the definition of serious injury becomes more precise the vulnerable road users become more prominent. These results have major implications in terms of how serious injuries are identified for reporting purposes. Depending on the definitions used, the calculation of cost and understanding of the impact of serious injuries would vary greatly. This study has shown how data linkage can be used to investigate issues of data quality. It has also demonstrated the potential improvements to the understanding of the road safety problem, particularly serious injury, by conducting data linkage.
Resumo:
This study identified the areas of poor specificity in national injury hospitalization data and the areas of improvement and deterioration in specificity over time. A descriptive analysis of ten years of national hospital discharge data for Australia from July 2002-June 2012 was performed. Proportions and percentage change of defined/undefined codes over time was examined. At the intent block level, accidents and assault were the most poorly defined with over 11% undefined in each block. The mechanism blocks for accidents showed a significant deterioration in specificity over time with up to 20% more undefined codes in some mechanisms. Place and activity were poorly defined at the broad block level (43% and 72% undefined respectively). Private hospitals and hospitals in very remote locations recorded the highest proportion of undefined codes. Those aged over 60 years and females had the higher proportion of undefined code usage. This study has identified significant, and worsening, deficiencies in the specificity of coded injury data in several areas. Focal attention is needed to improve the quality of injury data, especially on those identified in this study, to provide the evidence base needed to address the significant burden of injury in the Australian community.