985 resultados para Open Science Data Cloud
Resumo:
Workshop at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Resumo:
Traditionally, the formal scientific output in most fields of natural science has been limited to peer- reviewed academic journal publications, with less attention paid to the chain of intermediate data results and their associated metadata, including provenance. In effect, this has constrained the representation and verification of the data provenance to the confines of the related publications. Detailed knowledge of a dataset’s provenance is essential to establish the pedigree of the data for its effective re-use, and to avoid redundant re-enactment of the experiment or computation involved. It is increasingly important for open-access data to determine their authenticity and quality, especially considering the growing volumes of datasets appearing in the public domain. To address these issues, we present an approach that combines the Digital Object Identifier (DOI) – a widely adopted citation technique – with existing, widely adopted climate science data standards to formally publish detailed provenance of a climate research dataset as an associated scientific workflow. This is integrated with linked-data compliant data re-use standards (e.g. OAI-ORE) to enable a seamless link between a publication and the complete trail of lineage of the corresponding dataset, including the dataset itself.
Resumo:
La tesi si propone di mettere in evidenza alcuni esempi pratici di utilizzo degli Open Data sia in ambito delle pubbliche amministrazioni sia in quello più ampio della sensibilità ambientale, con esempi anche di utilizzi particolarmente interessanti.
Resumo:
L'elaborato introduce i concetti di Big Data, Cloud Computing e le tipologie di paradigmi basati sul calcolo parallelo. Trasformando questi concetti in pratica tramite un caso di studio sui Big Data. Nell'elaborato si spiega l'architettura proposta per l'elaborazione di report in formato pdf. Analaizando in fine i risultati ottenuti.
Resumo:
We compiled a database of bacterial abundance of 39 766 data points. After gridding with 1° spacing, the database covers 1.3% of the ocean surface. There is data covering all ocean basins and depth except the Southern Hemisphere below 350 m or from April until June. The average bacterial biomass is 3.9 ± 3.6 µg l-1 with a 20-fold decrease between the surface and the deep sea. We estimate a total ocean inventory of about 1.3 - 1029 bacteria. Using an average of published open ocean measurements for the conversion from abundance to carbon biomass of 9.1 fg cell-1, we calculate a bacterial carbon inventory of about 1.2 Pg C. The main source of uncertainty in this inventory is the conversion factor from abundance to biomass.
Resumo:
We present the data structures and algorithms used in the approach for building domain ontologies from folksonomies and linked data. In this approach we extracts domain terms from folksonomies and enrich them with semantic information from the Linked Open Data cloud. As a result, we obtain a domain ontology that combines the emergent knowledge of social tagging systems with formal knowledge from Ontologies.
Resumo:
We present a methodology for legacy language resource adaptation that generates domain-specific sentiment lexicons organized around domain entities described with lexical information and sentiment words described in the context of these entities. We explain the steps of the methodology and we give a working example of our initial results. The resulting lexicons are modelled as Linked Data resources by use of established formats for Linguistic Linked Data (lemon, NIF) and for linked sentiment expressions (Marl), thereby contributing and linking to existing Language Resources in the Linguistic Linked Open Data cloud.
Resumo:
The application of Linked Data technology to the publication of linguistic data promises to facilitate interoperability of these data and has lead to the emergence of the so called Linguistic Linked Data Cloud (LLD) in which linguistic data is published following the Linked Data principles. Three essential issues need to be addressed for such data to be easily exploitable by language technologies: i) appropriate machine-readable licensing information is needed for each dataset, ii) minimum quality standards for Linguistic Linked Data need to be defined, and iii) appropriate vocabularies for publishing Linguistic Linked Data resources are needed. We propose the notion of Licensed Linguistic Linked Data (3LD) in which different licensing models might co-exist, from totally open to more restrictive licenses through to completely closed datasets.
Resumo:
Recently, experts and practitioners in language resources have started recognizing the benefits of the linked data (LD) paradigm for the representation and exploitation of linguistic data on the Web. The adoption of the LD principles is leading to an emerging ecosystem of multilingual open resources that conform to the Linguistic Linked Open Data Cloud, in which datasets of linguistic data are interconnected and represented following common vocabularies, which facilitates linguistic information discovery, integration and access. In order to contribute to this initiative, this paper summarizes several key aspects of the representation of linguistic information as linked data from a practical perspective. The main goal of this document is to provide the basic ideas and tools for migrating language resources (lexicons, corpora, etc.) as LD on the Web and to develop some useful NLP tasks with them (e.g., word sense disambiguation). Such material was the basis of a tutorial imparted at the EKAW’14 conference, which is also reported in the paper.
Resumo:
We describe a domain ontology development approach that extracts domain terms from folksonomies and enrich them with data and vocabularies from the Linked Open Data cloud. As a result, we obtain lightweight domain ontologies that combine the emergent knowledge of social tagging systems with formal knowledge from Ontologies. In order to illustrate the feasibility of our approach, we have produced an ontology in the financial domain from tags available in Delicious, using DBpedia, OpenCyc and UMBEL as additional knowledge sources.
Resumo:
The revelation of the top-secret US intelligence-led PRISM Programme has triggered wide-ranging debates across Europe. Press reports have shed new light on the electronic surveillance ‘fishing expeditions’ of the US National Security Agency and the FBI into the world’s largest electronic communications companies. This Policy Brief by a team of legal specialists and political scientists addresses the main controversies raised by the PRISM affair and the policy challenges that it poses for the EU. Two main arguments are presented: First, the leaks over the PRISM programme have undermined the trust that EU citizens have in their governments and the European institutions to safeguard and protect their privacy; and second, the PRISM affair raises questions regarding the capacity of EU institutions to draw lessons from the past and to protect the data of its citizens and residents in the context of transatlantic relations. The Policy Brief puts forward a set of policy recommendations for the EU to follow and implement a robust data protection strategy in response to the affair.
Resumo:
The Tara Oceans Expedition (2009-2013) sampled the world oceans on board a 36 m long schooner, collecting environmental data and organisms from viruses to planktonic metazoans for later analyses using modern sequencing and state-of-the-art imaging technologies. Tara Oceans Data are particularly suited to study the genetic, morphological and functional diversity of plankton. The present dataset contains navigation and meteorological data measured during one campaign of the Tara Oceans Expedition. Latitude and Longitude were obtained from TSG data.
Resumo:
The Tara Oceans Expedition (2009-2013) sampled the world oceans on board a 36 m long schooner, collecting environmental data and organisms from viruses to planktonic metazoans for later analyses using modern sequencing and state-of-the-art imaging technologies. Tara Oceans Data are particularly suited to study the genetic, morphological and functional diversity of plankton. The present dataset contains navigation and meteorological data measured during one campaign of the Tara Oceans Expedition. Latitude and Longitude were obtained from TSG data.
Resumo:
PRELIDA (PREserving LInked DAta) is an FP7 Coordination Action funded by the European Commission under the Digital Preservation Theme. PRELIDA targets the particular stakeholders of the Linked Data community, including data providers, service providers, technology providers and end user communities. These stakeholders have not been traditionally targeted by the Digital Preservation community, and are typically not aware of the digital preservation solutions already available. So an important task of PRELIDA is to raise awareness of existing preservation solutions and to facilitate their uptake. At the same time, the Linked Data cloud has specific characteristics in terms of structuring, interlinkage, dynamicity and distribution that pose new challenges to the preservation community. PRELIDA organises in-depth discussions among the two communities to identify which of these characteristics require novel solutions, and to develop road maps for addressing the new challenges. PRELIDA will complete its lifecycle at the end of this year, and the talk will report about the major findings.