8 resultados para Data quality-aware mechanisms

em Aston University Research Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Indicators which summarise the characteristics of spatiotemporal data coverages significantly simplify quality evaluation, decision making and justification processes by providing a number of quality cues that are easy to manage and avoiding information overflow. Criteria which are commonly prioritised in evaluating spatial data quality and assessing a dataset’s fitness for use include lineage, completeness, logical consistency, positional accuracy, temporal and attribute accuracy. However, user requirements may go far beyond these broadlyaccepted spatial quality metrics, to incorporate specific and complex factors which are less easily measured. This paper discusses the results of a study of high level user requirements in geospatial data selection and data quality evaluation. It reports on the geospatial data quality indicators which were identified as user priorities, and which can potentially be standardised to enable intercomparison of datasets against user requirements. We briefly describe the implications for tools and standards to support the communication and intercomparison of data quality, and the ways in which these can contribute to the generation of a GEO label.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data quality is a difficult notion to define precisely, and different communities have different views and understandings of the subject. This causes confusion, a lack of harmonization of data across communities and omission of vital quality information. For some existing data infrastructures, data quality standards cannot address the problem adequately and cannot full all user needs or cover all concepts of data quality. In this paper we discuss some philosophical issues on data quality. We identify actual user needs on data quality, review existing standards and specification on data quality, and propose an integrated model for data quality in the eld of Earth observation. We also propose a practical mechanism for applying the integrated quality information model to large number of datasets through metadata inheritance. While our data quality management approach is in the domain of Earth observation, we believe the ideas and methodologies for data quality management can be applied to wider domains and disciplines to facilitate quality-enabled scientific research.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The accuracy of a map is dependent on the reference dataset used in its construction. Classification analyses used in thematic mapping can, for example, be sensitive to a range of sampling and data quality concerns. With particular focus on the latter, the effects of reference data quality on land cover classifications from airborne thematic mapper data are explored. Variations in sampling intensity and effort are highlighted in a dataset that is widely used in mapping and modelling studies; these may need accounting for in analyses. The quality of the labelling in the reference dataset was also a key variable influencing mapping accuracy. Accuracy varied with the amount and nature of mislabelled training cases with the nature of the effects varying between classifiers. The largest impacts on accuracy occurred when mislabelling involved confusion between similar classes. Accuracy was also typically negatively related to the magnitude of mislabelled cases and the support vector machine (SVM), which has been claimed to be relatively insensitive to training data error, was the most sensitive of the set of classifiers investigated, with overall classification accuracy declining by 8% (significant at 95% level of confidence) with the use of a training set containing 20% mislabelled cases.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The evaluation of geospatial data quality and trustworthiness presents a major challenge to geospatial data users when making a dataset selection decision. The research presented here therefore focused on defining and developing a GEO label – a decision support mechanism to assist data users in efficient and effective geospatial dataset selection on the basis of quality, trustworthiness and fitness for use. This thesis thus presents six phases of research and development conducted to: (a) identify the informational aspects upon which users rely when assessing geospatial dataset quality and trustworthiness; (2) elicit initial user views on the GEO label role in supporting dataset comparison and selection; (3) evaluate prototype label visualisations; (4) develop a Web service to support GEO label generation; (5) develop a prototype GEO label-based dataset discovery and intercomparison decision support tool; and (6) evaluate the prototype tool in a controlled human-subject study. The results of the studies revealed, and subsequently confirmed, eight geospatial data informational aspects that were considered important by users when evaluating geospatial dataset quality and trustworthiness, namely: producer information, producer comments, lineage information, compliance with standards, quantitative quality information, user feedback, expert reviews, and citations information. Following an iterative user-centred design (UCD) approach, it was established that the GEO label should visually summarise availability and allow interrogation of these key informational aspects. A Web service was developed to support generation of dynamic GEO label representations and integrated into a number of real-world GIS applications. The service was also utilised in the development of the GEO LINC tool – a GEO label-based dataset discovery and intercomparison decision support tool. The results of the final evaluation study indicated that (a) the GEO label effectively communicates the availability of dataset quality and trustworthiness information and (b) GEO LINC successfully facilitates ‘at a glance’ dataset intercomparison and fitness for purpose-based dataset selection.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this work we present a quality driven approach to DASH (Dynamic Adaptive Streaming over HTTP) for segment selection in varying network conditions. Current adaption algorithms focus largely on regulating data rates using network layer parameters by selecting the level of quality on offer that can eliminate buffer underrun without considering picture fidelity. In reality, viewers may accept a level of buffer underrun in order to achieve an improved level of picture fidelity. In this case, the conventional DASH algorithms can cause extreme degradation of the picture fidelity when attempting to eliminate buffer underrun with scarce bandwidth availability. Our work is concerned with a quality-aware rate adaption scheme that maximizes the client's quality of experience in terms of both continuity and fidelity (picture quality). Results show that the scheme proposed can maintain a high level of quality for streaming services, especially at low packet loss rates. It is also shown that by eliminating buffer underrun completely, the PSNR that reflects the picture quality of the video is greatly reduced. Our scheme offers the offset between continuity-based quality and resolution-based quality, which can be used to set threshold values for the level of quality desired by clients with different quality requirements. © 2013 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Substantial altimetry datasets collected by different satellites have only become available during the past five years, but the future will bring a variety of new altimetry missions, both parallel and consecutive in time. The characteristics of each produced dataset vary with the different orbital heights and inclinations of the spacecraft, as well as with the technical properties of the radar instrument. An integral analysis of datasets with different properties offers advantages both in terms of data quantity and data quality. This thesis is concerned with the development of the means for such integral analysis, in particular for dynamic solutions in which precise orbits for the satellites are computed simultaneously. The first half of the thesis discusses the theory and numerical implementation of dynamic multi-satellite altimetry analysis. The most important aspect of this analysis is the application of dual satellite altimetry crossover points as a bi-directional tracking data type in simultaneous orbit solutions. The central problem is that the spatial and temporal distributions of the crossovers are in conflict with the time-organised nature of traditional solution methods. Their application to the adjustment of the orbits of both satellites involved in a dual crossover therefore requires several fundamental changes of the classical least-squares prediction/correction methods. The second part of the thesis applies the developed numerical techniques to the problems of precise orbit computation and gravity field adjustment, using the altimetry datasets of ERS-1 and TOPEX/Poseidon. Although the two datasets can be considered less compatible that those of planned future satellite missions, the obtained results adequately illustrate the merits of a simultaneous solution technique. In particular, the geographically correlated orbit error is partially observable from a dataset consisting of crossover differences between two sufficiently different altimetry datasets, while being unobservable from the analysis of altimetry data of both satellites individually. This error signal, which has a substantial gravity-induced component, can be employed advantageously in simultaneous solutions for the two satellites in which also the harmonic coefficients of the gravity field model are estimated.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Geospatial data have become a crucial input for the scientific community for understanding the environment and developing environmental management policies. The Global Earth Observation System of Systems (GEOSS) Clearinghouse is a catalogue and search engine that provides access to the Earth Observation metadata. However, metadata are often not easily understood by users, especially when presented in ISO XML encoding. Data quality included in the metadata is basic for users to select datasets suitable for them. This work aims to help users to understand the quality information held in metadata records and to provide the results to geospatial users in an understandable and comparable way. Thus, we have developed an enhanced tool (Rubric-Q) for visually assessing the metadata quality information and quantifying the degree of metadata population. Rubric-Q is an extension of a previous NOAA Rubric tool used as a metadata training and improvement instrument. The paper also presents a thorough assessment of the quality information by applying the Rubric-Q to all dataset metadata records available in the GEOSS Clearinghouse. The results reveal that just 8.7% of the datasets have some quality element described in the metadata, 63.4% have some lineage element documented, and merely 1.2% has some usage element described. © 2013 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ageing is accompanied by many visible characteristics. Other biological and physiological markers are also well-described e.g. loss of circulating sex hormones and increased inflammatory cytokines. Biomarkers for healthy ageing studies are presently predicated on existing knowledge of ageing traits. The increasing availability of data-intensive methods enables deep-analysis of biological samples for novel biomarkers. We have adopted two discrete approaches in MARK-AGE Work Package 7 for biomarker discovery; (1) microarray analyses and/or proteomics in cell systems e.g. endothelial progenitor cells or T cell ageing including a stress model; and (2) investigation of cellular material and plasma directly from tightly-defined proband subsets of different ages using proteomic, transcriptomic and miR array. The first approach provided longitudinal insight into endothelial progenitor and T cell ageing.This review describes the strategy and use of hypothesis-free, data-intensive approaches to explore cellular proteins, miR, mRNA and plasma proteins as healthy ageing biomarkers, using ageing models and directly within samples from adults of different ages. It considers the challenges associated with integrating multiple models and pilot studies as rational biomarkers for a large cohort study. From this approach, a number of high-throughput methods were developed to evaluate novel, putative biomarkers of ageing in the MARK-AGE cohort.