983 results for Data Standards


Relevance:

70.00%

Publisher:

Abstract:

This lecture introduces an array of data sources that can be used to create new applications and visualisations, and gives many examples of both. It also includes a number of slides on open data standards, freedom of information requests, and how to influence the future of open data.

Relevance:

70.00%

Publisher:

Abstract:

Traditionally, the formal scientific output in most fields of natural science has been limited to peer-reviewed academic journal publications, with less attention paid to the chain of intermediate data results and their associated metadata, including provenance. In effect, this has constrained the representation and verification of data provenance to the confines of the related publications. Detailed knowledge of a dataset’s provenance is essential to establish the pedigree of the data for effective re-use and to avoid redundant re-enactment of the experiment or computation involved. It is increasingly important to determine the authenticity and quality of open-access data, especially given the growing volumes of datasets appearing in the public domain. To address these issues, we present an approach that combines the Digital Object Identifier (DOI) – a widely adopted citation technique – with existing, widely adopted climate science data standards to formally publish the detailed provenance of a climate research dataset as an associated scientific workflow. This is integrated with linked-data-compliant data re-use standards (e.g. OAI-ORE) to enable a seamless link between a publication and the complete lineage trail of the corresponding dataset, including the dataset itself.
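As a rough illustration of the kind of linkage this abstract describes, the sketch below uses Python's rdflib to build an OAI-ORE aggregation that groups a DOI-identified dataset, its provenance workflow and the related publication. All identifiers are hypothetical placeholders; this is not the authors' actual implementation.

```python
# A rough sketch, not the authors' implementation: an OAI-ORE aggregation built
# with rdflib that groups a DOI-identified dataset, its provenance workflow and
# the related publication. All identifiers are hypothetical placeholders.
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import PROV, RDF

ORE = Namespace("http://www.openarchives.org/ore/terms/")

g = Graph()
g.bind("ore", ORE)
g.bind("prov", PROV)

rem = URIRef("http://example.org/dataset/resource-map")      # ORE resource map
agg = URIRef("http://example.org/dataset/aggregation")       # ORE aggregation
dataset = URIRef("https://doi.org/10.xxxx/example-dataset")  # placeholder DOI
workflow = URIRef("http://example.org/dataset/provenance-workflow")
paper = URIRef("https://doi.org/10.xxxx/example-paper")      # placeholder DOI

g.add((rem, RDF.type, ORE.ResourceMap))
g.add((rem, ORE.describes, agg))
g.add((agg, RDF.type, ORE.Aggregation))

# The aggregation groups the dataset, the workflow that records its lineage,
# and the publication, so each can be reached from the others.
for resource in (dataset, workflow, paper):
    g.add((agg, ORE.aggregates, resource))
g.add((dataset, PROV.wasGeneratedBy, workflow))

print(g.serialize(format="turtle"))
```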

Relevance:

70.00%

Publisher:

Abstract:

The Short-term Water Information and Forecasting Tools (SWIFT) is a suite of tools for flood and short-term streamflow forecasting, consisting of a collection of hydrologic model components and utilities. Catchments are modelled using conceptual subareas and a node-link structure for channel routing. The tools comprise modules for calibration, model state updating, output error correction, ensemble runs and data assimilation. Given the combinatorial nature of the modelling experiments and the sub-daily time steps typically used for simulations, the volume of model configurations and time series data is substantial and its management is not trivial. SWIFT is currently used mostly for research purposes but has also been used operationally, with intersecting but significantly different requirements. Early versions of SWIFT used mostly ad hoc text files handled via Fortran code, with limited use of netCDF for time series data. The configuration and data handling modules have since been redesigned. The model configuration now follows a design where the data model is decoupled from the on-disk persistence mechanism. For research purposes the preferred on-disk format is JSON, to leverage numerous software libraries in a variety of languages, while retaining the legacy option of custom tab-separated text formats where that remains the preferred access arrangement for the researcher. By decoupling the data model and data persistence, it becomes much easier to swap in, for instance, relational databases to provide stricter provenance and audit trail capabilities in an operational flood forecasting context. For the time series data, given the volume and required throughput, text-based formats are usually inadequate; a schema derived from the CF conventions has been designed to handle time series for SWIFT efficiently.
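The decoupling of the configuration data model from on-disk persistence described above could look roughly like the following Python sketch, in which the same configuration object can be saved either as JSON or as legacy tab-separated text. The class and field names are invented for illustration and are not SWIFT's actual code.

```python
# Illustrative sketch only (not SWIFT's actual code) of decoupling a model
# configuration data model from its on-disk persistence, so JSON, legacy
# tab-separated text, or a database can be swapped in behind one interface.
import json
from dataclasses import dataclass, asdict

@dataclass
class SubareaConfig:
    name: str
    area_km2: float
    routing_node: str

class JsonStore:
    def save(self, cfg: SubareaConfig, path: str) -> None:
        with open(path, "w") as f:
            json.dump(asdict(cfg), f, indent=2)

    def load(self, path: str) -> SubareaConfig:
        with open(path) as f:
            return SubareaConfig(**json.load(f))

class TabSeparatedStore:
    """Legacy-style persistence: one 'key<TAB>value' pair per line."""
    def save(self, cfg: SubareaConfig, path: str) -> None:
        with open(path, "w") as f:
            for key, value in asdict(cfg).items():
                f.write(f"{key}\t{value}\n")

    def load(self, path: str) -> SubareaConfig:
        fields = dict(line.rstrip("\n").split("\t", 1) for line in open(path))
        return SubareaConfig(fields["name"], float(fields["area_km2"]), fields["routing_node"])

# Callers depend only on save/load, not on the underlying format.
store = JsonStore()
store.save(SubareaConfig("upper_catchment", 42.5, "node_1"), "subarea.json")
```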

Relevance:

70.00%

Publisher:

Abstract:

Observational data encode values of properties associated with a feature of interest, estimated by a specified procedure. For water, the properties are physical parameters such as level, volume, flow and pressure, and concentrations and counts of chemicals, substances and organisms. Water property vocabularies have been assembled at project, agency and jurisdictional level. Organizations such as EPA, USGS, CEH, GA and BoM maintain vocabularies for internal use, and may make them available externally as text files. BODC and MMI have harvested many water vocabularies alongside others of interest in their domain, formalized the content using SKOS, and published them through web interfaces. Scope is highly variable both within and between vocabularies. Individual items may conflate multiple concerns (e.g. property, instrument, statistical procedure, units). There is significant duplication between vocabularies. Semantic web technologies provide the opportunity both to publish vocabularies more effectively and to achieve harmonization that supports greater interoperability between datasets:
- Models for vocabulary items (property, substance/taxon, process, unit-of-measure, etc.) may be formalized as OWL ontologies, supporting semantic relations between items in related vocabularies;
- By specializing the ontology elements from SKOS concepts and properties, diverse vocabularies may be published through a common interface;
- Properties from standard vocabularies (e.g. OWL, SKOS, PROV-O and VAEM) support mappings between vocabularies having a similar scope;
- Existing items from various sources may be assembled into new virtual vocabularies.
However, there are a number of challenges:
- use of standard properties such as sameAs/exactMatch/equivalentClass requires reasoning support;
- items have been conceptualised as both classes and individuals, complicating the mapping mechanics;
- re-use of items across vocabularies may conflict with expectations concerning URI patterns;
- versioning complicates cross-references and re-use.
This presentation will discuss ways to harness semantic web technologies to publish harmonized vocabularies, and will summarise how many of these challenges may be addressed.
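To make the publishing and harmonization idea concrete, here is a minimal Python/rdflib sketch that publishes two hypothetical water-property concepts from different agency vocabularies as SKOS and maps them with skos:exactMatch. The vocabulary URIs and labels are placeholders, not items from the vocabularies named above.

```python
# A minimal sketch, using hypothetical vocabulary URIs: two water-property
# concepts from different agency vocabularies are published as SKOS and
# harmonized with skos:exactMatch so they can be queried through one interface.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

AGENCY_A = Namespace("http://vocab.example.org/agency-a/property/")
AGENCY_B = Namespace("http://vocab.example.org/agency-b/obsprop/")

g = Graph()
g.bind("skos", SKOS)

level_a = AGENCY_A["waterLevel"]
level_b = AGENCY_B["gauge_height"]

for concept, label in ((level_a, "water level"), (level_b, "gauge height")):
    g.add((concept, RDF.type, SKOS.Concept))
    g.add((concept, SKOS.prefLabel, Literal(label, lang="en")))

# Cross-vocabulary harmonization: assert the two items denote the same property.
g.add((level_a, SKOS.exactMatch, level_b))

print(g.serialize(format="turtle"))
```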

Relevance:

70.00%

Publisher:

Abstract:

Government agencies use information technology extensively to collect business data for regulatory purposes. Data communication standards form part of the infrastructure with which businesses must conform to survive. We examine the development of, and emerging competition between, two open business reporting data standards adopted by government bodies in France: EDIFACT (the incumbent) and XBRL (the challenger). The research explores whether an incumbent may be displaced in a setting in which the contention is unresolved. We apply Latour’s (1992) translation map to trace the enrolments and detours in the battle. We find that regulators play an important role as allies in the development of the standards. The antecedent networks in which the standards are located embed strong beliefs that become barriers to collaboration and fuel the battle. One of the key differentiating attitudes is whether speed is more important than legitimacy. The failure of collaboration encourages competition. The newness of XBRL’s technology, arriving just as regulators need to respond to an economic crisis, and its adoption by French regulators not already using EDIFACT, create an opportunity for the challenger to make significant network gains over the longer term. ANT also highlights the importance of the preservation of key components of EDIFACT in ebXML.

Relevance:

70.00%

Publisher:

Abstract:

Learning analytics is an emerging field focused on analyzing learners’ interactions with educational content. One of the key open issues in learning analytics is the standardization of the data collected. This is a particularly challenging issue in serious games, which generate a diverse range of data. This paper reviews the current state of learning analytics, data standards and serious games, examining how serious games track their players’ interactions and the metrics that can be distilled from them. Based on this review, we propose an interaction model that establishes a basis for applying learning analytics to serious games. The paper then analyzes the current standards and specifications used in the field. Finally, it presents an implementation of the model with one of the most promising specifications: the Experience API (xAPI). The Experience API relies on communities of practice developing profiles that cover different use cases in specific domains. This paper presents the Serious Games xAPI Profile: a profile developed to align with the most common use cases in the serious games domain. The profile is applied to a case study (a demo game), which explores the technical practicalities of standardizing data acquisition in serious games. In summary, the paper presents a new interaction model to track serious games and its implementation with the xAPI specification.
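For a sense of what standardized tracking data looks like in practice, the sketch below assembles a hypothetical xAPI statement such as a serious game might emit when a player completes a level. The verb IRI is a standard ADL verb; the activity type and all other identifiers are placeholders and do not necessarily match the vocabulary defined by the Serious Games xAPI Profile.

```python
# Hypothetical example of an xAPI statement a serious game might emit when a
# player completes a level. The verb IRI is a standard ADL verb; the activity
# type and other identifiers are placeholders, not the Serious Games xAPI
# Profile's actual vocabulary.
import json
import uuid
from datetime import datetime, timezone

statement = {
    "id": str(uuid.uuid4()),
    "actor": {
        "objectType": "Agent",
        "account": {"homePage": "http://example.org/lrs", "name": "player-42"},
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",
        "display": {"en-US": "completed"},
    },
    "object": {
        "id": "http://example.org/games/demo-game/level-1",
        "definition": {
            "type": "http://example.org/xapi/activity-types/level",  # placeholder IRI
            "name": {"en-US": "Level 1"},
        },
    },
    "result": {"score": {"scaled": 0.85}, "completion": True, "success": True},
    "timestamp": datetime.now(timezone.utc).isoformat(),
}

# A game client would normally POST this JSON to a Learning Record Store.
print(json.dumps(statement, indent=2))
```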

Relevance:

70.00%

Publisher:

Abstract:

Every Argo data file submitted by a DAC for distribution on the GDAC has its format and data consistency checked by the Argo FileChecker. Two types of checks are applied:
1. Format checks: ensure the file formats match the Argo standards precisely.
2. Data consistency checks: additional checks performed on a file after it passes the format checks. These checks do not duplicate any of the quality control checks performed elsewhere; they can be thought of as “sanity checks” to ensure that the data are consistent with each other. The data consistency checks enforce data standards and ensure that certain data values are reasonable and/or consistent with other information in the files. Examples of the “data standard” checks are the “mandatory parameters” defined for meta-data files and the technical parameter names in technical data files.
Files with format or consistency errors are rejected by the GDAC and are not distributed. Less serious problems will generate warnings and the file will still be distributed on the GDAC.
Reference Tables and Data Standards: many of the consistency checks involve comparing the data to the published reference tables and data standards. These tables are documented in the User’s Manual. (The FileChecker implements “text versions” of these tables.)
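A much-simplified illustration of a “mandatory parameters” consistency check is sketched below in Python using the netCDF4 library. The variable list and file name are placeholders, and this is not the actual Argo FileChecker.

```python
# A simplified, illustrative sanity check in the spirit of the Argo FileChecker
# (not the actual tool): verify that a metadata file contains its mandatory
# parameters. The variable names and mandatory list below are placeholders,
# not the official Argo reference tables.
from netCDF4 import Dataset

MANDATORY_META_PARAMS = ["PLATFORM_NUMBER", "PROJECT_NAME", "DATA_CENTRE"]  # placeholder subset

def check_meta_file(path: str) -> list:
    """Return a list of error strings; an empty list means the file passed."""
    errors = []
    with Dataset(path) as nc:
        for name in MANDATORY_META_PARAMS:
            if name not in nc.variables:
                errors.append(f"mandatory parameter missing: {name}")
    return errors

errors = check_meta_file("example_meta.nc")  # placeholder file name
if errors:
    print("File rejected:", *errors, sep="\n  ")
else:
    print("File passed the consistency checks")
```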

Relevance:

60.00%

Publisher:

Abstract:

Dissertation submitted to obtain the Master’s degree in Electrical Engineering and Computer Science.

Relevance:

60.00%

Publisher:

Abstract:

In a sign that researchers are grappling with therapy development, the 4th annual conference on Clinical Trials in Alzheimer's Disease (CTAD) was filled beyond its venue's capacity, drawing 522 researchers from around the globe. Held 3-5 November 2011 in San Diego, CTAD is the brainchild of Paul Aisen, Jacques Touchon, Bruno Vellas, and Michael Weiner. The conference posted no ringing trial successes. Instead, scientists worked on methodological aspects they hope will improve future trials' chances. They discussed Bayesian models, simulated placebos, and biomarker data standards. They presented alternative outcome measures to the ADAS-cog, ranging widely from composite scales that are sensitive early on to continuous measures that encompass a patient's day-to-day variability. They focused on EEG, and on a collective effort to develop patient-reported outcomes. Highlights include:
- Whence and Where To: History and Future of AD Therapy Trials
- Webinar: Evolution of AD Trials
- Nutrient Formulation Appears to Grease Memory Function
- Door Slams on RAGE
- Clinical Trials: Making "Protocols From Hell" Less Burdensome
- EEG: Coming in From the Margins of Alzheimer's Research?
- EEG: Old Method to Lend New Help in AD Drug Development?

Relevance:

60.00%

Publisher:

Abstract:

Web address for the meeting presentations: http://www.geoinfo.tuwien.ac.at/events/Euresco2000/gdgis.htm

Relevance:

60.00%

Publisher:

Abstract:

Many online services access a large number of autonomous data sources and at the same time need to meet different user requirements. It is essential for these services to achieve semantic interoperability among the entities exchanging information. In the presence of an increasing number of proprietary business processes, heterogeneous data standards, and diverse user requirements, it is critical that the services are implemented using adaptable, extensible, and scalable technology. The COntext INterchange (COIN) approach, inspired by similar goals of the Semantic Web, provides a robust solution. In this paper, we describe how COIN can be used to implement dynamic online services where semantic differences are reconciled on the fly. We show that COIN is flexible and scalable by comparing it with several conventional approaches. With a given ontology, the number of conversions in COIN is at most quadratic in the number of distinctions of the semantic aspect with the largest number of distinctions. These semantic aspects are modeled as modifiers in a conceptual ontology; in most cases the number of conversions is linear in the number of modifiers, which is significantly smaller than in the traditional hard-wired middleware approach, where the number of conversion programs is quadratic in the number of sources and data receivers. In the example scenario in the paper, the COIN approach needs only 5 conversions to be defined, while traditional approaches require 20,000 to 100 million. COIN achieves this scalability by automatically composing all the comprehensive conversions from a small number of declaratively defined sub-conversions.
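The scaling argument can be illustrated with a back-of-the-envelope calculation; the numbers below are invented for illustration and do not reproduce the paper's exact scenario.

```python
# Back-of-the-envelope illustration of the scaling argument; the numbers are
# invented and do not reproduce the paper's example scenario. The point is that
# hard-wired middleware needs a conversion program per (source, receiver) pair,
# while a mediation approach only declares sub-conversions per semantic modifier
# and composes the full conversions automatically.
sources, receivers = 150, 150                                   # hypothetical counts
modifiers = {"currency": 3, "scaleFactor": 2, "dateFormat": 2}  # distinctions per modifier

# One dedicated conversion program for every source/receiver pair.
hard_wired = sources * receivers

# Declared sub-conversions: pairwise per modifier, so at most quadratic in the
# number of distinctions of any one modifier, and independent of sources/receivers.
declared = sum(d * (d - 1) for d in modifiers.values())

print(f"hard-wired conversion programs: {hard_wired:,}")  # 22,500
print(f"declared sub-conversions:       {declared}")      # 10
```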

Relevance:

60.00%

Publisher:

Abstract:

The FunFOLD2 server is a new independent server that integrates our novel protein–ligand binding site and quality assessment protocols for the prediction of protein function (FN) from sequence via structure. Our guiding principles were, first, to provide a simple unified resource to make our function prediction software easily accessible to all via a simple web interface and, second, to produce integrated output for predictions that can be easily interpreted. The server provides a clean web interface so that results can be viewed on a single page and interpreted by non-experts at a glance. The output for the prediction is an image of the top predicted tertiary structure annotated to indicate putative ligand-binding site residues. The results page also includes a list of the most likely binding site residues and the types of predicted ligands and their frequencies in similar structures. The protein–ligand interactions can also be interactively visualized in 3D using the Jmol plug-in. The raw machine-readable data are provided for developers and comply with the Critical Assessment of Techniques for Protein Structure Prediction data standards for FN predictions. The FunFOLD2 webserver is freely available to all at the following web site: http://www.reading.ac.uk/bioinf/FunFOLD/FunFOLD_form_2_0.html.

Relevance:

60.00%

Publisher:

Abstract:

The semantic web adds richer meaning to data so that they can be processed by machines. This is made possible by standards such as the Resource Description Framework (RDF), which provides a framework for representing information in a way that machines can more readily interpret. Often, however, information is not encoded in RDF, yet it is still desirable to take advantage of RDF's capabilities. This motivates the need for a tool that enables queries across different data sources by relying on the RDF standard, regardless of the original format of the data. In this way, queries can be run across diverse sources which, without unification under a common semantic standard, would be much harder to combine.
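A minimal sketch of the idea, assuming a hypothetical CSV file and schema: non-RDF data are lifted into an rdflib graph and then queried with SPARQL, just as native RDF sources would be.

```python
# A minimal sketch (hypothetical schema and file) of the idea described above:
# lift non-RDF data into RDF so it can be queried with SPARQL alongside other sources.
import csv
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

EX = Namespace("http://example.org/schema/")

g = Graph()
with open("people.csv", newline="") as f:  # assumed CSV columns: id, name, city
    for row in csv.DictReader(f):
        person = URIRef(f"http://example.org/person/{row['id']}")
        g.add((person, RDF.type, EX.Person))
        g.add((person, EX.name, Literal(row["name"])))
        g.add((person, EX.city, Literal(row["city"])))

# Once lifted to RDF, the data can be queried together with native RDF sources.
query = """
    PREFIX ex: <http://example.org/schema/>
    SELECT ?name WHERE { ?p a ex:Person ; ex:name ?name ; ex:city "Madrid" . }
"""
for row in g.query(query):
    print(row.name)
```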

Relevance:

60.00%

Publisher:

Abstract:

Alistair Milne argues in this ECRI Commentary that ‘FinTech’ (newly emerging Financial Technologies) can play a crucial role in achieving European policy objectives in the area of financial markets. These notably include increasing access by smaller firms to trade credit and other forms of external finance and completing the banking and capital markets unions. He points out, however, that accomplishing these objectives will require a coordinated European policy response, focused especially on promoting common business processes and the adoption of shared technology and data standards.

Relevance:

60.00%

Publisher:

Abstract:

We describe the creation process of the Minimum Information Specification for In Situ Hybridization and Immunohistochemistry Experiments (MISFISHIE). Modeled after the existing minimum information specification for microarray data, we created a new specification for gene expression localization experiments, initially to facilitate data sharing within a consortium. After successful use within the consortium, the specification was circulated to members of the wider biomedical research community for comment and refinement. After a period in which many new requirements were suggested, a final phase was needed to exclude those deemed inappropriate as a minimum requirement for all experiments. The full specification will soon be published as a version 1.0 proposal to the community, upon which a fuller discussion must take place so that the final specification can be achieved with the involvement of the whole community. This paper is part of the special issue of OMICS on data standards.