878 resultados para data validation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Technology-supported citizen science has created huge volumes of data with increasing potential to facilitate scientific progress, however, verifying data quality is still a substantial hurdle due to the limitations of existing data quality mechanisms. In this study, we adopted a mixed methods approach to investigate community-based data validation practices and the characteristics of records of wildlife species observations that affected the outcomes of collaborative data quality management in an online community where people record what they see in the nature. The findings describe the processes that both relied upon and added to information provenance through information stewardship behaviors, which led to improved reliability and informativity. The likelihood of community-based validation interactions were predicted by several factors, including the types of organisms observed and whether the data were submitted from a mobile device. We conclude with implications for technology design, citizen science practices, and research.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We describe a new methodology for comparing satellite radiation budget data with a numerical weather prediction (NWP) model. This is applied to data from the Geostationary Earth Radiation Budget (GERB) instrument on Meteosat-8. The methodology brings together, in near-real time, GERB broadband shortwave and longwave fluxes with simulations based on analyses produced by the Met Office global NWP model. Results for the period May 2003 to February 2005 illustrate the progressive improvements in the data products as various initial problems were resolved. In most areas the comparisons reveal systematic errors in the model's representation of surface properties and clouds, which are discussed elsewhere. However, for clear-sky regions over the oceans the model simulations are believed to be sufficiently accurate to allow the quality of the GERB fluxes themselves to be assessed and any changes in time of the performance of the instrument to be identified. Using model and radiosonde profiles of temperature and humidity as input to a single-column version of the model's radiation code, we conduct sensitivity experiments which provide estimates of the expected model errors over the ocean of about ±5–10 W m−2 in clear-sky outgoing longwave radiation (OLR) and ±0.01 in clear-sky albedo. For the more recent data the differences between the observed and modeled OLR and albedo are well within these error estimates. The close agreement between the observed and modeled values, particularly for the most recent period, illustrates the value of the methodology. It also contributes to the validation of the GERB products and increases confidence in the quality of the data, prior to their release.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Postmortem computed tomography (pmCT) is increasingly applied in forensic medicine as a documentation and diagnostic tool. The present study investigated if pmCT data can be used to estimate the corpse weight. In 50 forensic cases, pmCT examinations were performed prior to autopsy and the pmCT data were used to determine the body volume using an automated segmentation tool. PmCT was performed within 48 h postmortem. The body weights assessed prior to autopsy and the body volumes assessed using the pmCT data were used to calculate individual multiplication factors. The mean postmortem multiplication factor for the study cases was 1.07 g/ml. Using this factor, the body weight may be estimated retrospectively when necessary. Severe artifact causing foreign bodies within the corpses limit the use of pmCT data for body weight estimations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the advent of peer to peer networks, and more importantly sensor networks, the desire to extract useful information from continuous and unbounded streams of data has become more prominent. For example, in tele-health applications, sensor based data streaming systems are used to continuously and accurately monitor Alzheimer's patients and their surrounding environment. Typically, the requirements of such applications necessitate the cleaning and filtering of continuous, corrupted and incomplete data streams gathered wirelessly in dynamically varying conditions. Yet, existing data stream cleaning and filtering schemes are incapable of capturing the dynamics of the environment while simultaneously suppressing the losses and corruption introduced by uncertain environmental, hardware, and network conditions. Consequently, existing data cleaning and filtering paradigms are being challenged. This dissertation develops novel schemes for cleaning data streams received from a wireless sensor network operating under non-linear and dynamically varying conditions. The study establishes a paradigm for validating spatio-temporal associations among data sources to enhance data cleaning. To simplify the complexity of the validation process, the developed solution maps the requirements of the application on a geometrical space and identifies the potential sensor nodes of interest. Additionally, this dissertation models a wireless sensor network data reduction system by ascertaining that segregating data adaptation and prediction processes will augment the data reduction rates. The schemes presented in this study are evaluated using simulation and information theory concepts. The results demonstrate that dynamic conditions of the environment are better managed when validation is used for data cleaning. They also show that when a fast convergent adaptation process is deployed, data reduction rates are significantly improved. Targeted applications of the developed methodology include machine health monitoring, tele-health, environment and habitat monitoring, intermodal transportation and homeland security.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the advent of peer to peer networks, and more importantly sensor networks, the desire to extract useful information from continuous and unbounded streams of data has become more prominent. For example, in tele-health applications, sensor based data streaming systems are used to continuously and accurately monitor Alzheimer's patients and their surrounding environment. Typically, the requirements of such applications necessitate the cleaning and filtering of continuous, corrupted and incomplete data streams gathered wirelessly in dynamically varying conditions. Yet, existing data stream cleaning and filtering schemes are incapable of capturing the dynamics of the environment while simultaneously suppressing the losses and corruption introduced by uncertain environmental, hardware, and network conditions. Consequently, existing data cleaning and filtering paradigms are being challenged. This dissertation develops novel schemes for cleaning data streams received from a wireless sensor network operating under non-linear and dynamically varying conditions. The study establishes a paradigm for validating spatio-temporal associations among data sources to enhance data cleaning. To simplify the complexity of the validation process, the developed solution maps the requirements of the application on a geometrical space and identifies the potential sensor nodes of interest. Additionally, this dissertation models a wireless sensor network data reduction system by ascertaining that segregating data adaptation and prediction processes will augment the data reduction rates. The schemes presented in this study are evaluated using simulation and information theory concepts. The results demonstrate that dynamic conditions of the environment are better managed when validation is used for data cleaning. They also show that when a fast convergent adaptation process is deployed, data reduction rates are significantly improved. Targeted applications of the developed methodology include machine health monitoring, tele-health, environment and habitat monitoring, intermodal transportation and homeland security.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thèse numérisée par la Direction des bibliothèques de l'Université de Montréal.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thèse numérisée par la Direction des bibliothèques de l'Université de Montréal.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A complete workflow specification requires careful integration of many different process characteristics. Decisions must be made as to the definitions of individual activities, their scope, the order of execution that maintains the overall business process logic, the rules governing the discipline of work list scheduling to performers, identification of time constraints and more. The goal of this paper is to address an important issue in workflows modelling and specification, which is data flow, its modelling, specification and validation. Researchers have neglected this dimension of process analysis for some time, mainly focussing on structural considerations with limited verification checks. In this paper, we identify and justify the importance of data modelling in overall workflows specification and verification. We illustrate and define several potential data flow problems that, if not detected prior to workflow deployment may prevent the process from correct execution, execute process on inconsistent data or even lead to process suspension. A discussion on essential requirements of the workflow data model in order to support data validation is also given..

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This dissertation develops the model of a prototype system for the digital lodgement of spatial data sets with statutory bodies responsible for the registration and approval of land related actions under the Torrens Title system. Spatial data pertain to the location of geographical entities together with their spatial dimensions and are classified as point, line, area or surface. This dissertation deals with a sub-set of spatial data, land boundary data that result from the activities performed by surveying and mapping organisations for the development of land parcels. The prototype system has been developed, utilising an event-driven paradigm for the user-interface, to exploit the potential of digital spatial data being generated from the utilisation of electronic techniques. The system provides for the creation of a digital model of the cadastral network and dependent data sets for an area of interest from hard copy records. This initial model is calibrated on registered control and updated by field survey to produce an amended model. The field-calibrated model then is electronically validated to ensure it complies with standards of format and content. The prototype system was designed specifically to create a database of land boundary data for subsequent retrieval by land professionals for surveying, mapping and related activities. Data extracted from this database are utilised for subsequent field survey operations without the need to create an initial digital model of an area of interest. Statistical reporting of differences resulting when subsequent initial and calibrated models are compared, replaces the traditional checking operations of spatial data performed by a land registry office. Digital lodgement of survey data is fundamental to the creation of the database of accurate land boundary data. This creation of the database is fundamental also to the efficient integration of accurate spatial data about land being generated by modem technology such as global positioning systems, and remote sensing and imaging, with land boundary information and other information held in Government databases. The prototype system developed provides for the delivery of accurate, digital land boundary data for the land registration process to ensure the continued maintenance of the integrity of the cadastre. Such data should meet also the more general and encompassing requirements of, and prove to be of tangible, longer term benefit to the developing, electronic land information industry.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The objective of this chapter is to provide an overview of traffic data collection that can and should be used for the calibration and validation of traffic simulation models. There are big differences in availability of data from different sources. Some types of data such as loop detector data are widely available and used. Some can be measured with additional effort, for example, travel time data from GPS probe vehicles. Some types such as trajectory data are available only in rare situations such as research projects.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Data quality (DQ) assessment can be significantly enhanced with the use of the right DQ assessment methods, which provide automated solutions to assess DQ. The range of DQ assessment methods is very broad: from data profiling and semantic profiling to data matching and data validation. This paper gives an overview of current methods for DQ assessment and classifies the DQ assessment methods into an existing taxonomy of DQ problems. Specific examples of the placement of each DQ method in the taxonomy are provided and illustrate why the method is relevant to the particular taxonomy position. The gaps in the taxonomy, where no current DQ methods exist, show where new methods are required and can guide future research and DQ tool development.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The contract work has demonstrated that older data can be assessed and entered into the MR format. Older data has associated problems but is retrievable. The contract successfully imported all datasets as required. MNCR survey sheets fit well into the MR format. The data validation and verification process can be improved. A number of computerised short cuts can be suggested and the process made more intuitive. Such a move is vital if MR is to be adopted as a standard by the recording community both on a voluntary level and potentially by consultancies.