957 results for Metadata
Abstract:
Traditionally, the formal scientific output in most fields of natural science has been limited to peer-reviewed academic journal publications, with less attention paid to the chain of intermediate data results and their associated metadata, including provenance. In effect, this has constrained the representation and verification of the data provenance to the confines of the related publications. Detailed knowledge of a dataset’s provenance is essential to establish the pedigree of the data for its effective re-use, and to avoid redundant re-enactment of the experiment or computation involved. For open-access data, it is increasingly important to determine authenticity and quality, especially considering the growing volumes of datasets appearing in the public domain. To address these issues, we present an approach that combines the Digital Object Identifier (DOI) – a widely adopted citation technique – with existing, widely adopted climate science data standards to formally publish detailed provenance of a climate research dataset as an associated scientific workflow. This is integrated with linked-data compliant data re-use standards (e.g. OAI-ORE) to enable a seamless link between a publication and the complete trail of lineage of the corresponding dataset, including the dataset itself.
Abstract:
Social tagging has become very popular around the Internet as well as in research. The main idea behind tagging is to allow users to provide metadata to the web content from their perspective to facilitate categorization and retrieval. There are many factors that influence users' tag choice. Many studies have been conducted to reveal these factors by analysing tagging data. This paper uses two theories to identify these factors, namely the semiotics theory and activity theory. The former treats tags as signs and the latter treats tagging as an activity. The paper uses both theories to analyse tagging behaviour by explaining all aspects of a tagging system, including tags, tagging system components and the tagging activity. The theoretical analysis produced a framework that was used to identify a number of factors. These factors can be considered as categories that can be consulted to redirect user tagging choice in order to support particular tagging behaviour, such as cross-lingual tagging.
Abstract:
There are three key components for developing a metadata system: a container structure laying out the key semantic issues of interest and their relationships; an extensible controlled vocabulary providing possible content; and tools to create and manipulate that content. While metadata systems must allow users to enter their own information, the use of a controlled vocabulary both imposes consistency of definition and ensures comparability of the objects described. Here we describe the controlled vocabulary (CV) and metadata creation tool built by the METAFOR project for use in the context of describing the climate models, simulations and experiments of the fifth Coupled Model Intercomparison Project (CMIP5). The CV and resulting tool chain introduced here are designed for extensibility and reuse and should find applicability in many more projects.
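The three components named in this abstract (container structure, extensible controlled vocabulary, tooling) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the property names, the vocabulary terms, and the `ComponentDescription`, `validate` and `extend_vocabulary` helpers are hypothetical and are not taken from the METAFOR CV or its tool chain.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: property name -> allowed terms.
# These names and terms are illustrative only, not the METAFOR CV.
CONTROLLED_VOCABULARY = {
    "component_type": {"atmosphere", "ocean", "sea_ice", "land_surface"},
    "time_stepping": {"leapfrog", "semi_implicit", "runge_kutta"},
}


@dataclass
class ComponentDescription:
    """Container structure: CV-backed properties plus user free text."""
    name: str
    properties: dict = field(default_factory=dict)
    free_text_notes: str = ""


def validate(description, vocabulary=CONTROLLED_VOCABULARY):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for prop, value in description.properties.items():
        allowed = vocabulary.get(prop)
        if allowed is None:
            problems.append(f"unknown property: {prop}")
        elif value not in allowed:
            problems.append(f"{prop}: '{value}' is not a controlled term")
    return problems


def extend_vocabulary(vocabulary, prop, term):
    """Extensibility: add a term (and the property, if new) to the CV."""
    vocabulary.setdefault(prop, set()).add(term)
```

Keeping free text in a separate field from the vocabulary-backed properties is one way to let users enter their own information while the controlled terms stay comparable across records.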
Abstract:
For users of climate services, the ability to quickly determine the datasets that best fit one's needs would be invaluable. The volume, variety and complexity of climate data make this judgment difficult. The ambition of CHARMe ("Characterization of metadata to enable high-quality climate services") is to give a wider interdisciplinary community access to a range of supporting information, such as journal articles, technical reports or feedback on previous applications of the data. The capture and discovery of this "commentary" information, often created by data users rather than data providers, and currently not linked to the data themselves, has not been significantly addressed previously. CHARMe applies the principles of Linked Data and open web standards to associate, record, search and publish user-derived annotations in a way that can be read both by users and automated systems. Tools have been developed within the CHARMe project that enable annotation capability for data delivery systems already in wide use for discovering climate data. In addition, the project has developed advanced tools for exploring data and commentary in innovative ways, including an interactive data explorer and comparator ("CHARMe Maps") and a tool for correlating climate time series with external "significant events" (e.g. instrument failures or large volcanic eruptions) that affect the data quality. Although the project focuses on climate science, the concepts are general and could be applied to other fields. All CHARMe system software is open-source, released under a liberal licence, permitting future projects to re-use the source code as they wish.
Abstract:
Existing urban meteorological networks have an important role to play as test beds for inexpensive and more sustainable measurement techniques that are now becoming possible in our increasingly smart cities. The Birmingham Urban Climate Laboratory (BUCL) is a near-real-time, high-resolution urban meteorological network (UMN) of automatic weather stations and inexpensive, nonstandard air temperature sensors. The network has recently been implemented with an initial focus on monitoring urban heat, infrastructure, and health applications. A number of UMNs exist worldwide; however, BUCL is novel in its density, the low-cost nature of the sensors, and the use of proprietary Wi-Fi networks. This paper provides an overview of the logistical aspects of implementing a UMN test bed at such a density, including selecting appropriate urban sites; testing and calibrating low-cost, nonstandard equipment; implementing strict quality-assurance/quality-control mechanisms (including metadata); and utilizing preexisting Wi-Fi networks to transmit data. Also included are visualizations of data collected by the network, including data from the July 2013 U.K. heatwave as well as highlighting potential applications. The paper is an open invitation to use the facility as a test bed for evaluating models and/or other nonstandard observation techniques such as those generated via crowdsourcing techniques.
Abstract:
ISO19156 Observations and Measurements (O&M) provides a standardised framework for organising information about how environmental observations are collected. Here we describe the implementation of a specialisation of O&M for environmental data, the Metadata Objects for Linking Environmental Sciences (MOLES3). MOLES3 provides support for organising information about data, and for user navigation around data holdings. The implementation described here, “CEDA-MOLES”, also supports data management functions for the Centre for Environmental Data Archival, CEDA. The previous iteration of MOLES (MOLES2) saw active use over five years, being replaced by CEDA-MOLES in late 2014. During that period important lessons were learnt both about the information needed and about how to design and maintain the necessary information systems. In this paper we review the problems encountered in MOLES2; how and why CEDA-MOLES was developed and engineered; the migration of information holdings from MOLES2 to CEDA-MOLES; and, finally, provide an early assessment of MOLES3 (as implemented in CEDA-MOLES) and its limitations. Key drivers for the MOLES3 development included the necessity for improved data provenance, for further structured information to support ISO19115 discovery metadata export (for EU INSPIRE compliance), and to provide appropriate fixed landing pages for Digital Object Identifiers (DOIs) in the presence of evolving datasets. Key lessons learned included the importance of minimising information structure in free text fields, and the necessity to support as much agility in the information infrastructure as possible without compromising on maintainability, both for those using the systems internally and externally (e.g. citing into the information infrastructure) and for those responsible for the systems themselves. The migration itself needed to ensure continuity of service and traceability of archived assets.
Abstract:
Banverket is the authority responsible for the railway transport system in Sweden and is a producer of railway information. A large part of the infrastructure information is stored in the track information system BIS, and Banverket is obliged to make it available. At present the information is easily accessible mainly internally. In addition, today's society requires that information be delivered according to standards. Standardisation bodies have developed standards for geographic information in response to the need to reuse and share geographic data. Our thesis work was carried out at the IT department of Banverket Verksamhetsstöd and focused on surveying standards for geographic information and developing a web service for BIS to increase the accessibility of the information. The report's contribution to knowledge is an analysis of the standards for geographic information that are relevant to the railway network. The result was produced using a reflexive method of analysis, with abduction as the approach and qualitative data collection methods. As a practical part of our thesis work we developed a web service that provides functionality for accessing data in BIS about network-related features, and a framework for restructuring the internal BIS format in accordance with standards. We propose that Banverket follow the Swedish application standards SS 63 70 04, SS 63 70 06 and SS 63 70 07.
According to our proposal, Banverket should take the following measures to deliver basic data on the railway network and network-related features in accordance with these standards:
• Analysis of information needs based on new real-world and business requirements.
• Documentation of the conceptual model and publication of an object type catalogue.
• Quality assurance of the information in BIS.
• Collection of coordinates for nodes and geometries for links.
• Development of a new data model for BIS that supports temporal aspects.
• Review of the BIS architecture and design.
• Upgrade of the development platform.
• Development of transformations for restructuring report results from the internal BIS format to standards.
• Extension of the web service with methods for delivering metadata. The user needs metadata in order to select object types, attributes and selection criteria when report orders are created.
• Extension of the web service with methods for delivering information about the infrastructure of the railway network.
Abstract:
The authors take a broad view that ultimately Grid- or Web-services must be located via personalised, semantic-rich discovery processes. They argue that such processes must rely on the storage of arbitrary metadata about services that originates from both service providers and service users. Examples of such metadata are reliability metrics, quality of service data, or semantic service description markup. This paper presents UDDI-MT, an extension to the standard UDDI service directory approach that supports the storage of such metadata via a tunnelling technique that ties the metadata store to the original UDDI directory. They also discuss the use of a rich, graph-based RDF query language for syntactic queries on this data. Finally, they analyse the performance of each of these contributions in their implementation.
Abstract:
We take a broad view that ultimately Grid- or Web-services must be located via personalised, semantic-rich discovery processes. We argue that such processes must rely on the storage of arbitrary metadata about services that originates from both service providers and service users. Examples of such metadata are reliability metrics, quality of service data, or semantic service description markup. This paper presents UDDI-MT, an extension to the standard UDDI service directory approach that supports the storage of such metadata via a tunnelling technique that ties the metadata store to the original UDDI directory. We also discuss the use of a rich, graph-based RDF query language for syntactic queries on this data. Finally, we analyse the performance of each of these contributions in our implementation.
Abstract:
Service discovery in large scale, open distributed systems is difficult because of the need to filter out services suitable to the task at hand from a potentially huge pool of possibilities. Semantic descriptions have been advocated as the key to expressive service discovery, but the most commonly used service descriptions and registry protocols do not support such descriptions in a general manner. In this paper, we present a protocol, its implementation and an API for registering semantic service descriptions and other task/user-specific metadata, and for discovering services according to these. Our approach is based on a mechanism for attaching structured and unstructured metadata, which we show to be applicable to multiple registry technologies. The result is an extremely flexible service registry that can be the basis of a sophisticated semantically-enhanced service discovery engine, an essential component of a Semantic Grid.
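The idea of attaching metadata to registry entries and discovering services through it can be sketched in a few lines. This is a toy in-memory illustration under stated assumptions, not the protocol or API the abstract describes: `ServiceEntry`, `MetadataRegistry` and the method names `register`, `attach` and `discover` are hypothetical, and a real registry would persist entries and use a proper query language.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceEntry:
    """A registry entity that can carry any number of metadata attachments."""
    key: str
    name: str
    # Each attachment is a (predicate, value, author) triple, so both
    # providers and users can annotate the same entry.
    metadata: list = field(default_factory=list)


class MetadataRegistry:
    """Hypothetical in-memory registry; all names here are illustrative."""

    def __init__(self):
        self._entries = {}

    def register(self, key, name):
        self._entries[key] = ServiceEntry(key, name)

    def attach(self, key, predicate, value, author):
        """Attach one piece of metadata to an already registered service."""
        self._entries[key].metadata.append((predicate, value, author))

    def discover(self, predicate, test):
        """Return sorted keys of services with a matching attachment."""
        return sorted(
            entry.key
            for entry in self._entries.values()
            if any(p == predicate and test(v) for p, v, _ in entry.metadata)
        )


registry = MetadataRegistry()
registry.register("svc-a", "Alignment service A")
registry.register("svc-b", "Alignment service B")
registry.attach("svc-a", "reliability", 0.99, author="user:alice")
registry.attach("svc-b", "reliability", 0.80, author="provider:b")
reliable = registry.discover("reliability", lambda value: value >= 0.9)
```

Because attachments live alongside the entry rather than inside its description, the same mechanism could in principle sit on top of different registry technologies.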
Abstract:
The Grid is a large-scale computer system that is capable of coordinating resources that are not subject to centralised control, whilst using standard, open, general-purpose protocols and interfaces, and delivering non-trivial qualities of service. In this chapter, we argue that Grid applications very strongly suggest the use of agent-based computing, and we review key uses of agent technologies in Grids: user agents, able to customize and personalise data; agent communication languages offering a generic and portable communication medium; and negotiation allowing multiple distributed entities to reach service level agreements. In the second part of the chapter, we focus on Grid service discovery, which we have identified as a prime candidate for use of agent technologies: we show that Grid-services need to be located via personalised, semantic-rich discovery processes, which must rely on the storage of arbitrary metadata about services that originates from both service providers and service users. We present UDDI-MT, an extension to the standard UDDI service directory approach that supports the storage of such metadata via a tunnelling technique that ties the metadata store to the original UDDI directory. The outcome is a flexible service registry which is compatible with existing standards and also provides metadata-enhanced service discovery.
Abstract:
Existing registry technologies such as UDDI can be enhanced to support capabilities for semantic reasoning and inquiry, which in turn broadens their range of use. The Grimoires registry was developed to provide such support through the use of metadata attachments to registry entities. These attachments give service operators a way to specify security assertions pertaining to the registry entities they own. These assertions may, however, have to be reconciled with existing registry policies. A security architecture based on the XACML standard and deployed in the OMII framework is outlined to demonstrate how this goal is achieved in the registry.
Abstract:
HydroShare is an online, collaborative system being developed for open sharing of hydrologic data and models. The goal of HydroShare is to enable scientists to easily discover and access hydrologic data and models, retrieve them to their desktop or perform analyses in a distributed computing environment that may include grid, cloud or high performance computing model instances as necessary. Scientists may also publish outcomes (data, results or models) into HydroShare, using the system as a collaboration platform for sharing data, models and analyses. HydroShare is expanding the data sharing capability of the CUAHSI Hydrologic Information System by broadening the classes of data accommodated, creating new capability to share models and model components, and taking advantage of emerging social media functionality to enhance information about and collaboration around hydrologic data and models. One of the fundamental concepts in HydroShare is that of a Resource. All content is represented using a Resource Data Model that separates system and science metadata and has elements common to all resources as well as elements specific to the types of resources HydroShare will support. These will include different data types used in the hydrology community and models and workflows that require metadata on execution functionality. The HydroShare web interface and social media functions are being developed using the Drupal content management system. A geospatial visualization and analysis component enables searching, visualizing, and analyzing geographic datasets. The integrated Rule-Oriented Data System (iRODS) is being used to manage federated data content and perform rule-based background actions on data and model resources, including parsing to generate metadata catalog information and the execution of models and workflows. 
This presentation will introduce the HydroShare functionality developed to date, describe key elements of the Resource Data Model and outline the roadmap for future development.
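The separation of system and science metadata described for the Resource Data Model can be sketched as follows. This is only an illustration under assumptions: the class names, field names and the `make_resource` helper are hypothetical and do not reproduce the actual HydroShare data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SystemMetadata:
    """Bookkeeping maintained by the platform, not by the scientist."""
    resource_id: str
    owner: str
    created: datetime


@dataclass
class ScienceMetadata:
    """Descriptive metadata: common elements plus type-specific ones."""
    title: str
    keywords: list = field(default_factory=list)
    type_specific: dict = field(default_factory=dict)


@dataclass
class Resource:
    resource_type: str  # e.g. "time_series" or "model_program" (hypothetical)
    system: SystemMetadata
    science: ScienceMetadata


def make_resource(resource_id, owner, resource_type, title, **type_specific):
    """Build a resource, keeping the two kinds of metadata apart."""
    system = SystemMetadata(resource_id, owner, datetime.now(timezone.utc))
    science = ScienceMetadata(title=title, type_specific=dict(type_specific))
    return Resource(resource_type, system, science)


model = make_resource(
    "res-001", "alice", "model_program", "Snowmelt model",
    execution_command="run_model --config cfg.yml",  # illustrative value
)
```

Holding type-specific elements in their own mapping lets new resource types add fields (such as execution details for models) without changing the elements shared by all resources.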
Abstract:
The Problem/Opportunity: To define, identify, and guide design-based materials collections in academic settings and foster community among those with existing collections and/or those considering creating and supporting one.
Contents and topics:
• What is a materials collection?
• Why have a materials collection?
• Acquisition strategies
• Organizational approaches
• Programming possibilities
• Symposium summary
• Resources