953 results for DATA QUALITY
Abstract:
An array of Bio-Argo floats equipped with radiometric sensors has recently been deployed in various open ocean areas representative of the diversity of trophic and bio-optical conditions prevailing in so-called Case 1 waters. Around solar noon, almost every day, each float acquires 0-250 m vertical profiles of Photosynthetically Available Radiation and downward irradiance at three wavelengths (380, 412 and 490 nm). To date, more than 6500 profiles have been acquired for each radiometric channel. Because these radiometric data are collected without operator control and regardless of meteorological conditions, specific automatic data processing protocols have to be developed. Here, we present a data quality-control procedure aimed at verifying profile shapes and providing near real-time data distribution. The procedure is specifically developed to: 1) identify the main measurement issues (i.e. dark signal, atmospheric clouds, spikes and wave-focusing occurrences); 2) validate the final data with a hierarchy of tests to ensure they are fit for scientific use. The procedure, adapted to each of the four radiometric channels, is designed to flag each profile in a way compliant with the data management procedure used by the Argo program. The new protocols identify the main perturbations in the light field with good performance over the whole dataset, which highlights their potential applicability at the global scale. Finally, comparison with modeled surface irradiances allows the accuracy of quality-controlled measured irradiance values to be assessed and any evolution over the float lifetime due to biofouling and instrumental drift to be identified.
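The flagging idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a single exponential decay of irradiance with depth, strictly increasing depths, and hypothetical thresholds (`dark_level`, `max_log_resid`). The flag values 1 and 3 follow the Argo convention for "good" and "probably bad" data, but mapping these two tests to those flags is an assumption made here.

```python
import math
from statistics import median

GOOD, PROBABLY_BAD = 1, 3  # Argo-style flag values; their use for these tests is an assumption


def flag_profile(depth_m, irradiance, dark_level=1e-4, max_log_resid=0.5):
    """Flag dark-signal and spike samples in one irradiance profile (illustrative only)."""
    flags = [GOOD] * len(depth_m)

    # Dark-signal test: values at or below the (hypothetical) sensor dark level.
    lit = []
    for i, e in enumerate(irradiance):
        if e <= dark_level:
            flags[i] = PROBABLY_BAD
        else:
            lit.append(i)
    if len(lit) < 3:
        return flags

    # Robust log-linear fit: irradiance is assumed to decay exponentially with depth,
    # so log(irradiance) is close to a straight line. Medians keep isolated spikes
    # from pulling the fitted line.
    z = [depth_m[i] for i in lit]
    log_e = [math.log(irradiance[i]) for i in lit]
    slope = median(
        (log_e[j + 1] - log_e[j]) / (z[j + 1] - z[j]) for j in range(len(lit) - 1)
    )
    intercept = median(le - slope * zz for le, zz in zip(log_e, z))

    # Spike test: samples far from the fitted line in log space.
    for i, zz, le in zip(lit, z, log_e):
        if abs(le - (intercept + slope * zz)) > max_log_resid:
            flags[i] = PROBABLY_BAD
    return flags
```

A real profile departs from a single exponential (attenuation changes with depth, and clouds or wave focusing affect whole segments), so an operational test would need a more flexible reference shape than this straight line.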
Abstract:
Data has moved from being scarce, expensive and valuable, which justified detailed and careful verification and analysis, to a situation where the streams of detailed data are almost too large to handle, and the speed of this move has caused a series of shifts. Legal systems already have severe problems keeping up with, or even in touch with, the rate at which unexpected outcomes flow from information technology. Until recently, the capacity to harness massive quantities of existing data has driven Big Data applications. Now real-time data flows are rising swiftly, becoming more invasive, and offering monitoring potential that is eagerly sought by commerce and government alike. The ambiguities as to who owns this often remarkably intrusive personal data need to be resolved, and rapidly, but resolution is likely to encounter rising resistance from industrial and commercial bodies who see this data flow as ‘theirs’. Many changes in ICT have led to stresses in the resolution of conflicts between IP exploiters and their customers, but this one is of a different scale because of the wide potential for individual customisation of pricing, identification, and the rising commercial value of integrated streams of diverse personal data. A new reconciliation between the parties involved is needed: new business models, and a shift from the current confusion over who owns what data to alignments that are in better accord with community expectations. After all, the community members are the customers, and the emergence of information monopolies needs to be balanced by appropriate consumer/subject rights. This will be a difficult discussion, but one that is needed to realise the great benefits to all that are clearly available if these issues can be positively resolved. Customers need to make these data flows contestable in some form. These Big Data flows are only going to grow and become ever more instructive.
A better balance is necessary. For the first time, these changes are directly affecting the governance of democracies, as the very effective micro-targeting tools deployed in recent elections have shown. Yet the data gathered is not available to its subjects. This is not a survivable social model. The Private Data Commons needs our help. Businesses and governments exploit big data without regard for issues of legality, data quality, disparate data meanings, and process quality. This often results in poor decisions, with individuals bearing the greatest risk. The threats harbored by big data extend far beyond the individual, however, and call for new legal structures, business processes, and concepts such as a Private Data Commons. This Web extra is the audio part of a video in which author Marcus Wigan expands on his article "Big Data's Big Unintended Consequences" and discusses these issues.
Abstract:
This document does NOT address the issue of oxygen data quality control (either real-time or delayed mode). As a preliminary step towards that goal, it seeks to ensure that all countries deploying floats equipped with oxygen sensors properly document the data and metadata related to these floats. We produced this document in response to action item 14 from the AST-10 meeting in Hangzhou (March 22-23, 2009). Action item 14: Denis Gilbert to work with Taiyo Kobayashi and Virginie Thierry to ensure DACs are processing oxygen data according to recommendations. If the recommendations contained herein are followed, we will end up with a more uniform set of oxygen data within the Argo data system, allowing users to begin analysing not only their own oxygen data, but also those of others, in the true spirit of Argo data sharing. The indications provided in this document are valid as of the date of writing. It is very likely that changes in sensors, calibrations and conversion equations will occur in the future. Please contact V. Thierry (vthierry@ifremer.fr) to report any inconsistencies or missing information. A dedicated webpage on the Argo Data Management website (www) contains all information regarding Argo oxygen data management: current and previous versions of this cookbook, oxygen sensor manuals, calibration sheet examples, examples of MATLAB code to process oxygen data, test data, etc.
Abstract:
Ensemble stream modeling and data-cleaning are sensor information processing systems that have different training and testing methods by which their goals are cross-validated. This research examines a mechanism that seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process the sensed events so as to eliminate uncorrelated noise and choose the most likely model without overfitting, thus obtaining higher model confidence. Higher quality streams can be realized by combining many short streams into an ensemble that has the desired quality. The framework for the investigation is an existing data mining tool. First, to accommodate feature extraction for an event such as a bush or natural forest fire, we take the burnt area (BA*), a sensed ground truth obtained from logs, as our target variable. Even though this is an obvious model choice, the results are disappointing, for two reasons: first, the histogram of fire activity is highly skewed; second, the measured sensor parameters are highly correlated. Since using non-descriptive features does not yield good results, we resort to temporal features. By doing so we carefully eliminate the averaging effects; the resulting histogram is more satisfactory and conceptual knowledge is learned from the sensor streams. Second is the process of feature induction by cross-validating attributes with single or multi-target variables to minimize training error. We use the F-measure score, which combines precision and recall, to determine the false alarm rate of fire events. The multi-target data-cleaning trees use the information purity of the target leaf nodes to learn higher-order features. A sensitive variance measure such as the F-test is performed during each node’s split to select the best attribute. The ensemble stream model approach was found to improve when complicated features were used with a simpler tree classifier.
The ensemble framework for data-cleaning, together with the enhancements to quantify the quality of fitness of sensors (30% spatial, 10% temporal, and 90% mobility reduction), led to the formation of streams for sensor-enabled applications. This further motivates the novelty of stream quality labeling and its importance in handling the vast amounts of real-time mobile streams generated today.
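For reference, the F-measure mentioned in this abstract is conventionally the weighted harmonic mean of precision and recall. The sketch below shows that standard definition for binary event labels; the function name, the `beta` parameter and the label encoding (1 for a fire event, 0 otherwise) are assumptions for illustration and are not taken from the study.

```python
def f_measure(y_true, y_pred, beta=1.0):
    """Return precision, recall and F-beta for binary labels (1 = event, 0 = no event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # missed events

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return {"precision": precision, "recall": recall, "f": 0.0}

    b2 = beta * beta
    f = (1.0 + b2) * precision * recall / (b2 * precision + recall)
    return {"precision": precision, "recall": recall, "f": f}
```

With `beta=1.0` this is the usual F1 score; a low precision corresponds to many false alarms among the predicted events.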
Abstract:
Recent marine long-offset transient electromagnetic (LOTEM) measurements yielded the offshore delineation of a fresh groundwater body beneath the seafloor in the region of Bat Yam, Israel. The LOTEM application was effective in detecting this freshwater body underneath the Mediterranean Sea and allowed an estimation of its seaward extent. However, the measured data set was insufficient to understand the hydrogeological configuration and the mechanism controlling the occurrence of this fresh groundwater discovery. In particular, the lateral geometry of the freshwater boundary, which is important for hydrogeological modelling, could not be resolved. Without such an understanding, rational management of this unexploited groundwater reservoir is not possible. Two new high-resolution marine time-domain electromagnetic methods are theoretically developed to derive the hydrogeological structure of the western aquifer boundary. The first is called Circular Electric Dipole (CED). It is the land-based analogue of the Vertical Electric Dipole (VED), which is commonly applied to detect resistive structures in the subsurface. Although the CED shows exceptional detectability characteristics in the step-off signal towards the sub-seafloor freshwater body, an actual application was not carried out within the scope of this study. It was found that the method suffers from insufficient signal strength to adequately delineate the resistive aquifer under realistic noise conditions. Moreover, modelling studies demonstrated that severe signal distortions are caused by the slightest geometrical inaccuracies. As a result, a successful application of CED in Israel appeared rather doubtful. A second method called Differential Electric Dipole (DED) is developed as an alternative to the intended CED method.
Compared to the conventional marine time-domain electromagnetic system, which commonly applies a horizontal electric dipole transmitter, the DED is composed of two horizontal electric dipoles in an in-line configuration that share a common central electrode. Theoretically, DED has detectability/resolution characteristics similar to those of the conventional LOTEM system. However, its superior lateral resolution towards multi-dimensional resistivity structures makes an application desirable. Furthermore, the method is less susceptible to geometrical errors, making an application in Israel feasible. Within the scope of this thesis, the novel marine DED method is substantiated using several one-dimensional (1D) and multi-dimensional (2D/3D) modelling studies. The main emphasis lies on the application in Israel. Preliminary resistivity models are derived from the previous marine LOTEM measurement and tested for a DED application. The DED method is effective in locating the two-dimensional resistivity structure at the western aquifer boundary. Moreover, a prediction regarding the hydrogeological boundary conditions is feasible, provided a brackish water zone exists at the head of the interface. A seafloor-based DED transmitter/receiver system is designed and built at the Institute of Geophysics and Meteorology at the University of Cologne. The first DED measurements were carried out in Israel in April 2016. The acquired data set is the first of its kind. The measured data are processed and subsequently interpreted using 1D inversion. The intended aim of interpreting both step-on and step-off signals was not achieved because of the insufficient data quality of the latter. Yet, the 1D inversion models of the DED step-on signals clearly detect the freshwater body for receivers located close to the Israeli coast. Additionally, a lateral resistivity contrast is observable in the 1D inversion models, which allows the seaward extent of this freshwater body to be constrained.
A large-scale 2D modelling study followed the 1D interpretation. In total, 425 600 forward calculations are conducted to find a sub-seafloor resistivity distribution that adequately explains the measured data. The results indicate that the western aquifer boundary is located 3600 m to 3700 m from the coast. Moreover, a brackish water zone of 3 Ωm to 5 Ωm with a lateral extent of less than 300 m is likely located at the head of the freshwater aquifer. Based on these results, it is predicted that the sub-seafloor freshwater body is indeed open to the sea and may be vulnerable to seawater intrusion.
Abstract:
Collecting ground truth data is an important step that must be completed before performing a supervised classification. However, its quality depends on human, financial and time resources. It is therefore important to apply a validation process to assess the reliability of the acquired data. In this study, agricultural information was collected in the Brazilian Amazonian State of Mato Grosso in order to map crop expansion based on MODIS EVI temporal profiles. The field work was carried out through interviews for the years 2005-2006 and 2006-2007. This work presents a methodology to validate the training data quality and determine the optimal sample to be used according to the classifier employed. The technique is based on the detection of outlier pixels for each class and is carried out by computing the Mahalanobis distance of each pixel. The higher the distance, the further the pixel is from the class centre. Preliminary observations based on the coefficient of variation validate the efficiency of the technique in detecting outliers. Then, various subsamples are defined by applying different thresholds to exclude outlier pixels from the classification process. The classification results demonstrate the robustness of the Maximum Likelihood and Spectral Angle Mapper classifiers: those classifiers were insensitive to outlier exclusion. In contrast, the decision tree classifier showed better results when 7.5% of the pixels in the training data were deleted. The technique managed to detect outliers for all classes. In this study, few outliers were present in the training data, so the classification quality was not deeply affected by them.
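The outlier screening described in this abstract can be sketched as follows for the training pixels of one class. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, each row is assumed to hold one pixel's feature vector (for example an EVI temporal profile), and the default `drop_fraction` of 0.075 simply reuses the 7.5% figure reported for the decision tree classifier.

```python
import numpy as np


def mahalanobis_distances(samples):
    """Mahalanobis distance of each row from the centre of its own class."""
    x = np.asarray(samples, dtype=float)
    centred = x - x.mean(axis=0)
    # Class covariance; the pseudo-inverse tolerates strongly correlated features.
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    inv_cov = np.linalg.pinv(cov)
    # Quadratic form centred[i] @ inv_cov @ centred[i] for every pixel i.
    d2 = np.einsum("ij,jk,ik->i", centred, inv_cov, centred)
    return np.sqrt(np.maximum(d2, 0.0))


def keep_mask(samples, drop_fraction=0.075):
    """Boolean mask that excludes roughly the most distant `drop_fraction` of pixels."""
    d = mahalanobis_distances(samples)
    cutoff = np.quantile(d, 1.0 - drop_fraction)
    return d <= cutoff
```

Applying `keep_mask` separately to each class and sweeping `drop_fraction` over several values would produce the kind of subsamples the abstract compares across classifiers.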
Abstract:
Objective: To assess extent of coder agreement for external causes of injury using ICD-10-AM for injury-related hospitalisations in Australian public hospitals. Methods: A random sample of 4850 discharges from 2002 to 2004 was obtained from a stratified random sample of 50 hospitals across four states in Australia. On-site medical record reviews were conducted and external cause codes were assigned blinded to the original coded data. Code agreement levels were grouped into the following agreement categories: block level, 3-character level, 4-character level, 5th-character level, and complete code level. Results: At a broad block level, code agreement was found in over 90% of cases for most mechanisms (eg, transport, fall). Percentage disagreement was 26.0% at the 3-character level; agreement for the complete external cause code was 67.6%. For activity codes, the percentage of disagreement at the 3-character level was 7.3% and agreement for the complete activity code was 68.0%. For place of occurrence codes, the percentage of disagreement at the 4-character level was 22.0%; agreement for the complete place code was 75.4%. Conclusions: With 68% agreement for complete codes and 74% agreement for 3-character codes, as well as variability in agreement levels across different code blocks, place and activity codes, researchers need to be aware of the reliability of their specific data of interest when they wish to undertake trend analyses or case selection for specific causes of interest.
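The character-level agreement figures in this abstract amount to comparing code prefixes of increasing length. A minimal sketch, assuming codes are plain strings with an optional decimal point, is given below; the function name is hypothetical, and the block level is left out because it would need a lookup table of code blocks that the abstract does not provide.

```python
def agreement_by_level(original, recoded, levels=(3, 4, 5)):
    """Percentage agreement between two code lists at several prefix lengths."""
    if len(original) != len(recoded) or not original:
        raise ValueError("code lists must be non-empty and of equal length")

    def normalise(code):
        # Drop the decimal point so that character positions can be counted directly.
        return code.replace(".", "").strip().upper()

    pairs = [(normalise(a), normalise(b)) for a, b in zip(original, recoded)]
    result = {}
    for n in levels:
        hits = sum(1 for a, b in pairs if a[:n] == b[:n])
        result[f"{n}-character"] = 100.0 * hits / len(pairs)
    result["complete"] = 100.0 * sum(1 for a, b in pairs if a == b) / len(pairs)
    return result
```

For example, with made-up codes in the ICD letter-digit shape, `agreement_by_level(["W19", "V43.5"], ["W19", "V43.6"])` reports full agreement at the 3-character level and 50% agreement for the complete code.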
Abstract:
Catheter-related bloodstream infections are a serious problem. Many interventions reduce risk, and some have been evaluated in cost-effectiveness studies. We review the usefulness and quality of these economic studies. Evidence is incomplete, and data required to inform a coherent policy are missing. The cost-effectiveness studies are characterized by a lack of transparency, short time-horizons, and narrow economic perspectives. Data quality is low for some important model parameters. Authors of future economic evaluations should aim to model the complete policy and not just single interventions. They should be rigorous in developing the structure of the economic model, include all relevant economic outcomes, use a systematic approach for selecting data sources for model parameters, and propagate the effect of uncertainty in model parameters on conclusions. This will inform future data collection and improve our understanding of the economics of preventing these infections.
Abstract:
“SOH see significant benefit in digitising its drawings and operation and maintenance manuals. Since SOH do not currently have digital models of the Opera House structure or other components, there is an opportunity for this national case study to promote the application of Digital Facility Modelling using standardized Building Information Models (BIM)”. The digital modelling element of this project examined the potential of building information models for Facility Management, focusing on the following areas:
• The re-usability of building information for FM purposes
• BIM as an integrated information model for facility management
• Extendibility of the BIM to cope with business-specific requirements
• Commercial facility management software using standardised building information models
• The ability to add (organisation-specific) intelligence to the model
• A roadmap for SOH to adopt BIM for FM
The project has established that BIM (building information modelling) is an appropriate and potentially beneficial technology for the storage of integrated building, maintenance and management data for SOH. Based on the attributes of a BIM, several advantages can be envisioned: consistency in the data, intelligence in the model, multiple representations, source of information for intelligent programs and intelligent queries. The IFC (open building exchange standard) specification provides comprehensive support for asset and facility management functions, and offers new management, collaboration and procurement relationships based on sharing of intelligent building data.
The major advantages of using an open standard are: information can be read and manipulated by any compliant software; user “lock in” to proprietary solutions is reduced; third party software can be the “best of breed” to suit the process and scope at hand; standardised BIM solutions consider the wider implications of information exchange outside the scope of any particular vendor; information can be archived as ASCII files; and data quality can be enhanced, as the now single source of users’ information has improved accuracy, correctness, currency, completeness and relevance. SOH’s current building standards have been successfully drafted for a BIM environment and are confidently expected to be fully developed when BIM is adopted operationally by SOH. There have been remarkably few technical difficulties in converting the House’s existing conventions and standards to the new model-based environment. This demonstrates that the IFC model represents world practice for building data representation and management (see Sydney Opera House – FM Exemplar Project Report Number 2005-001-C-3, Open Specification for BIM: Sydney Opera House Case Study). The availability of FM applications based on BIM is in its infancy, but focussed systems are already in operation internationally and show excellent prospects for implementation at SOH. In addition to the generic benefits of standardised BIM described above, the following FM-specific advantages can be expected from this new integrated facilities management environment: faster and more effective processes, controlled whole-life costs and environmental data, better customer service, a common operational picture for current and strategic planning, visual decision-making and a total ownership cost model.
Tests with partial BIM data – provided by several of SOH’s current consultants – show that the creation of a SOH complete model is realistic, but subject to resolution of compliance and detailed functional support by participating software applications. The showcase has demonstrated successfully that IFC based exchange is possible with several common BIM based applications through the creation of a new partial model of the building. Data exchanged has been geometrically accurate (the SOH building structure represents some of the most complex building elements) and supports rich information describing the types of objects, with their properties and relationships.
Abstract:
Introduction: Some types of antimicrobial-coated central venous catheters (A-CVC) have been shown to be cost-effective in preventing catheter-related bloodstream infection (CR-BSI). However, not all types have been evaluated, and there are concerns over the quality and usefulness of these earlier studies. There is uncertainty amongst clinicians over which, if any, antimicrobial-coated central venous catheters to use. We re-evaluated the cost-effectiveness of all commercially available antimicrobial-coated central venous catheters for prevention of catheter-related bloodstream infection in adult intensive care unit (ICU) patients. Methods: We used a Markov decision model to compare the cost-effectiveness of antimicrobial-coated central venous catheters relative to uncoated catheters. Four catheter types were evaluated: minocycline and rifampicin (MR)-coated catheters; silver, platinum and carbon (SPC)-impregnated catheters; and two chlorhexidine and silver sulfadiazine-coated catheters, one coated on the external surface (CH/SSD (ext)) and the other coated on both surfaces (CH/SSD (int/ext)). The incremental cost per quality-adjusted life-year gained and the expected net monetary benefits were estimated for each. Uncertainty arising from data estimates, data quality and heterogeneity was explored in sensitivity analyses. Results: The baseline analysis, with no consideration of uncertainty, indicated all four types of antimicrobial-coated central venous catheters were cost-saving relative to uncoated catheters. Minocycline and rifampicin-coated catheters prevented 15 infections per 1,000 catheters and generated the greatest health benefits, 1.6 quality-adjusted life-years, and cost-savings, AUD $130,289. After considering uncertainty in the current evidence, the minocycline and rifampicin-coated catheters returned the highest incremental monetary net benefits of $948 per catheter, but there was a 62% probability of error in this conclusion.
Although the minocycline and rifampicin-coated catheters had the highest monetary net benefits across multiple scenarios, the decision was always associated with high uncertainty. Conclusions: Current evidence suggests that the cost-effectiveness of using antimicrobial-coated central venous catheters within the ICU is highly uncertain. Policies to prevent catheter-related bloodstream infection amongst ICU patients should consider the cost-effectiveness of competing interventions in the light of this uncertainty. Decision makers would do well to consider the current gaps in knowledge and the complexity of producing good quality evidence in this area.
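The two quantities reported in this abstract, net monetary benefit and the probability of error in a decision, can be illustrated with a short sketch. It uses the standard definition of incremental net monetary benefit (willingness to pay times QALYs gained, minus incremental cost) and one common way of estimating decision error from simulation draws. The willingness-to-pay threshold, the function names and the sample values are hypothetical; none of them come from the study, which does not state its threshold in the abstract.

```python
def incremental_nmb(delta_qaly, delta_cost, wtp_per_qaly):
    """Incremental net monetary benefit relative to a comparator.

    delta_cost is negative when the intervention saves money.
    """
    return wtp_per_qaly * delta_qaly - delta_cost


def probability_of_error(nmb_samples_by_option, chosen):
    """Share of simulation draws in which `chosen` does not have the highest NMB."""
    n_draws = len(nmb_samples_by_option[chosen])
    wrong = 0
    for i in range(n_draws):
        best = max(nmb_samples_by_option, key=lambda k: nmb_samples_by_option[k][i])
        if best != chosen:
            wrong += 1
    return wrong / n_draws


# Hypothetical draws from a probabilistic sensitivity analysis (values are made up).
samples = {
    "uncoated": [0.0, 0.0, 0.0, 0.0],
    "MR": [100.0, -50.0, 200.0, -10.0],
    "SPC": [50.0, 20.0, 100.0, -20.0],
}
error_for_mr = probability_of_error(samples, "MR")
```

The sketch shows why an option can have the highest expected benefit and still carry a large probability of error: it only needs to lose to some alternative in a sizeable share of the draws.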
Abstract:
Nature Refuges encompass the second largest extent of protected area estate in Queensland. Major problems exist in the data capture, map presentation, data quality and integrity of these boundaries. The spatial accuracy or inaccuracy of the Nature Refuge administrative boundaries directly influences the ability to preserve valuable ecosystems by challenging negative environmental impacts on these properties. This research supports the Nature Refuge Program’s efforts to secure Queensland’s natural and cultural values on private land by utilising GIS and its advanced functionalities. The research design organizes and enters Queensland’s Nature Refuge boundaries into a spatial environment. Survey-quality data collection techniques such as the Global Positioning System (GPS) are investigated to capture Nature Refuge boundary information. Using the concepts of map communication, GIS cartography is utilised for the protected area plan design. New spatial datasets are generated, facilitating the effectiveness of investigative data analysis. The geodatabase model developed by this study adds rich GIS behaviour, providing the capability to store, query, and manipulate geographic information. It provides the ability to leverage data relationships and enforces topological integrity, creating savings in customization and productivity. The final phase of the research design incorporates the advanced functions of ArcGIS. These functions facilitate building spatial system models. The geodatabase and process models developed by this research can be easily modified, and the data relating to mining can be replaced by other negative environmental impacts affecting the Nature Refuges. Results of the research are presented as graphs and maps, providing visual evidence supporting the usefulness of GIS as a means for capturing, visualising and enhancing the spatial quality and integrity of Nature Refuge boundaries.
Abstract:
The following paper proposes a novel application of Skid-to-Turn maneuvers for fixed wing Unmanned Aerial Vehicles (UAVs) inspecting locally linear infrastructure. Fixed wing UAVs, following the design of manned aircraft, commonly employ Bank-to-Turn maneuvers to change heading and thus direction of travel. Whilst effective, banking an aircraft during the inspection of ground based features hinders data collection, with body fixed sensors angled away from the direction of turn and a panning motion induced through roll rate that can reduce data quality. By adopting Skid-to-Turn maneuvers, the aircraft can change heading whilst maintaining wings level flight, thus allowing body fixed sensors to maintain a downward facing orientation. An Image-Based Visual Servo controller is developed to directly control the position of features as captured by onboard inspection sensors. This improves on the indirect approach taken by other tracking controllers where a course over ground directly above the feature is assumed to capture it centered in the field of view. Performance of the proposed controller is compared against that of a Bank-to-Turn tracking controller driven by GPS derived cross track error in a simulation environment developed to replicate the field of view of a body fixed camera.
Abstract:
Post license advanced driver training programs in the US and early programs in Europe have often failed to accomplish their stated objectives because, it is suspected, drivers gain self-perceived driving skills that exceed their true skills, leading to increased post-training crashes. The consensus from the evaluation of countless advanced driver training programs is that these programs are a detriment to safety, especially for novice, young, male drivers. Some European countries, including Sweden, Finland, Austria, Luxembourg, and Norway, have continued to refine these programs, with an entirely new training philosophy emerging around 1990. These ‘post-renewal’ programs have shown considerable promise, despite various data quality and availability concerns. These programs share a focus on teaching drivers about self-assessment and anticipation of risk, as opposed to teaching drivers how to master driving at the limits of tire adhesion. The programs focus on factors such as self-actualization and driving discipline, rather than low-level mastery of skills. Drivers are meant to leave these renewed programs with a more realistic assessment of their driving abilities. These renewed programs require considerable specialized and costly infrastructure, including dedicated driver training facilities with driving modules engineered specifically for advanced driver training and highly structured curricula. They are conspicuously missing from both the US road safety toolbox and the academic literature. Given the considerable road safety concerns associated with US novice male drivers in particular, these programs warrant further attention. This paper reviews the predominant features of, and empirical evidence surrounding, post licensing advanced driver training programs focused on novice drivers. A clear articulation of the differences between the renewed and current US advanced driver training programs is provided.
While the individual quantitative evaluations range from marginally to significantly effective in reducing novice driver crash risk, they have been criticized for evaluation deficiencies ranging from small sample sizes to confounding variables to lack of exposure metrics. Collectively, however, the programs cited in the paper suggest at least a marginally positive effect that needs to be validated with further studies. If additional well-controlled studies can validate these programs, a pilot program in the US should be considered.
Abstract:
There has been an increasing interest by governments worldwide in the potential benefits of open access to public sector information (PSI). However, an important question remains: can a government incur tortious liability for incorrect information released online under an open content licence? This paper argues that the release of PSI online for free under an open content licence, specifically a Creative Commons licence, is within the bounds of an acceptable level of risk to government, especially where users are informed of the limitations of the data and appropriate information management policies and principles are in place to ensure accountability for data quality and accuracy.
Abstract:
The impact of urban development and climate change has created the impetus to monitor changes in the environment, particularly the behaviour, habitat and movement of fauna species. The aim of this chapter is to present the design and development of a sensor network based on smart phones to automatically collect and analyse acoustic and visual data for environmental monitoring purposes. Due to the communication and sophisticated programming facilities offered by smart phones, software tools can be developed to allow data to be collected, partially processed and sent to a remote server over the network for storage and further processing. This sensor network, which employs a client-server architecture, has been deployed in three applications: monitoring a rare bird species near Brisbane Airport, studying koala behaviour at St Bees Island, and detecting fruit flies. The users of this system include scientists (e.g. ecologists, ornithologists, computer scientists) and community groups participating in data collection or reporting on the environment (e.g. students, bird watchers). The chapter focuses on the following aspects of our research: issues involved in using smart phones as sensors; the overall framework for data acquisition, data quality control, data management and analysis; current and future applications of the smart phone-based sensor network; and our future research directions.