37 results for RDF
Abstract:
OGOLOD is a Linked Open Data dataset derived from different biomedical resources by an automated pipeline, using a tailored ontology as a scaffold. The key contribution of OGOLOD is that it links, in new RDF triples, genetic human diseases and orthologous genes, paving the way for a more efficient translational biomedical research exploiting the Linked Open Data cloud.
Abstract:
Recently we have seen a large increase in the amount of geospatial data that is being published using RDF and Linked Data principles. Efforts such as the W3C Geo XG, and most recently the GeoSPARQL initiative, are providing the necessary vocabularies to publish this kind of information on the Web of Data. In this context it is necessary to develop applications that consume and take advantage of these geospatial datasets. In this paper we present map4rdf, a faceted browsing tool for exploring and visualizing RDF datasets enhanced with geospatial information.
Abstract:
This PhD thesis contributes to the problem of resource and service discovery in the context of the composable web. In the current web, mashup technologies allow developers to reuse services and contents to build new web applications. However, developers face a problem of information overload when searching for appropriate services or resources to combine. To help overcome this problem, a framework is defined for the discovery of services and resources. In this framework, three levels are defined for performing discovery: the content, service and agent levels. The content level involves the information available in web resources. The web follows the Representational State Transfer (REST) architectural style, in which resources are returned as representations from servers to clients. These representations usually employ the HyperText Markup Language (HTML), which, along with Cascading Style Sheets (CSS), describes the markup employed to render representations in a web browser. Although the use of Semantic Web standards such as the Resource Description Framework (RDF) makes this architecture suitable for automatic processes to use the information present in web resources, these standards are too often not employed, so automation must rely on processing HTML. This process, often referred to as screen scraping in the literature, is the content discovery according to the proposed framework. At this level, discovery rules indicate how the different pieces of data in resources' representations are mapped onto semantic entities. By processing discovery rules on web resources, semantically described contents can be obtained from them. The service level involves the operations that can be performed on the web. The current web allows users to perform different tasks such as search, blogging, e-commerce, or social networking. To describe the possible services in RESTful architectures, a high-level feature-oriented service methodology is proposed at this level. 
This lightweight description framework allows defining service discovery rules to identify operations in interactions with REST resources. The discovery is thus performed by applying discovery rules to contents discovered in REST interactions, in a novel process called service probing. Also, service discovery can be performed by modelling services as contents, i.e., by retrieving Application Programming Interface (API) documentation and API listings in service registries such as ProgrammableWeb. For this, a unified model for composable components in Mashup-Driven Development (MDD) has been defined after the analysis of service repositories from the web. The agent level involves the orchestration of the discovery of services and contents. At this level, agent rules allow specifying behaviours for crawling and executing services, which results in the fulfilment of a high-level goal. Agent rules are plans that allow introspecting the discovered data and services from the web, as well as the knowledge present in service and content discovery rules, to anticipate the contents and services to be found on specific resources from the web. By defining plans, an agent can be configured to target specific resources. The discovery framework has been evaluated on different scenarios, each one covering different levels of the framework. The Contenidos a la Carta project deals with the mashing-up of news from electronic newspapers, and the framework was used for the discovery and extraction of pieces of news from the web. Similarly, the Resulta and VulneraNET projects cover the discovery of ideas and of security knowledge on the web, respectively. The service level is covered in the OMELETTE project, where mashup components such as services and widgets are discovered from component repositories from the web. 
The agent level is applied to the crawling of services and news in these scenarios, highlighting how the semantic description of rules and extracted data can provide complex behaviours and orchestrations of tasks on the web. The main contribution of the thesis is the unified framework for discovery, which allows configuring agents to perform automated tasks. In addition, a scraping ontology has been defined for the construction of mappings for scraping web resources. A novel first-order logic rule induction algorithm is defined for the automated construction and maintenance of these mappings from the visual information in web resources. Additionally, a common unified model for the discovery of services is defined, which allows sharing service descriptions. Future work comprises the further extension of service probing, resource ranking, the extension of the Scraping Ontology, extensions of the agent model, and constructing a base of discovery rules.
Abstract:
RDB2RDF systems generate RDF from relational databases, operating in two different manners: materializing the database content into RDF or acting as virtual RDF datastores that transform SPARQL queries into SQL. In the former, inferences on the RDF data (taking into account the ontologies that they are related to) are normally done by the RDF triple store where the RDF data is materialised and hence the results of the query answering process depend on the store. In the latter, existing RDB2RDF systems do not normally perform such inferences at query time. This paper shows how the algorithm used in the REQUIEM system, focused on handling run-time inferences for query answering, can be adapted to handle such inferences for query answering in combination with RDB2RDF systems.
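As a rough illustration of the materialization mode mentioned in this abstract, the Python sketch below dumps the rows of a relational table as N-Triples lines. It is a naive column-to-predicate mapping written for this listing, not the mapping language or algorithm of any particular RDB2RDF system; the base IRI, the `materialize` helper, and the `station` table are all hypothetical.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base IRI, not taken from the paper


def materialize(conn, table, key):
    """Yield one N-Triples line per non-null, non-key column value of each row."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key]}>"
        for column, value in record.items():
            if column == key or value is None:
                continue
            predicate = f"<{BASE}{table}#{column}>"
            # Escape backslashes, quotes and newlines for a plain literal.
            literal = (
                str(value)
                .replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
            )
            yield f'{subject} {predicate} "{literal}" .'


# Hypothetical demo data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE station (id INTEGER PRIMARY KEY, name TEXT, altitude INTEGER)"
)
conn.execute("INSERT INTO station VALUES (1, 'Madrid', 667)")
triples = list(materialize(conn, "station", "id"))
```

In the virtual mode, no such dump would exist; a SPARQL query would instead be rewritten into SQL over the same table at query time.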
Abstract:
Linked data offers a promising setting to encode, publish and share metadata of resources. As a matter of fact, it has already been adopted by data producers such as the European Environment Agency and the US and some EU governments, whose first ambition is to share (meta)data to make their processes more effective and transparent. Such increasing interest and involvement of data providers is surely a genuine sign of the success of the web of data, but in the longer term, frameworks supporting linked data consumers in their decision-making processes will be a compelling need. In this respect, the talk introduces SSONDE, a framework enabling detailed comparison, ranking and selection of linked data resources through the analysis of their RDF ontology-driven metadata. SSONDE implements an instance similarity especially designed to support resource selection, namely the process stakeholders engage in to choose a set of resources suitable for a given analysis purpose: (i) it deploys an asymmetric similarity assessment to emphasize information about the gains and losses stakeholders incur when adopting one resource in place of another; (ii) it relies on an explicit formalization of contexts to tailor the similarity assessment to specific user-defined selection goals. The talk aims to provide insight into SSONDE instance similarity and will briefly describe some examples of SSONDE deployment in the context of linked data consumption.
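To make the idea of an asymmetric, context-dependent similarity more concrete, the sketch below uses a simple set-containment ratio as a stand-in. This is not SSONDE's actual measure, which the abstract does not define; the `containment` function, the (property, value) feature encoding, and the sample resources are assumptions made for illustration only.

```python
def containment(reference, candidate, context=None):
    """Fraction of the reference's features that the candidate also has.

    Features are (property, value) pairs. If `context` is given, only
    features whose property is in that set are taken into account.
    """
    if context is not None:
        reference = {(p, v) for p, v in reference if p in context}
        candidate = {(p, v) for p, v in candidate if p in context}
    if not reference:
        return 1.0  # nothing is required, so nothing can be lost
    return len(reference & candidate) / len(reference)


# Hypothetical metadata of two linked data resources.
a = {("topic", "air"), ("topic", "water"), ("format", "csv")}
b = {("topic", "air"), ("format", "csv")}

loss_if_b_replaces_a = 1 - containment(a, b)  # b lacks one of a's features
loss_if_a_replaces_b = 1 - containment(b, a)  # a covers everything b offers
same_format = containment(a, b, context={"format"})
```

The two directions give different scores, which is the point of an asymmetric assessment: replacing `a` with `b` loses information, while replacing `b` with `a` does not. Restricting the comparison to a context such as `{"format"}` changes the outcome again.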
Abstract:
This paper describes the process followed in order to make some of the public meteorological data from the Agencia Estatal de Meteorología (AEMET, Spanish Meteorological Office) available as Linked Data. The method followed has already been used to publish geographical, statistical, and leisure data. The data selected for publication are generated every ten minutes by the 250 automatic stations that belong to AEMET and that are deployed across Spain. These data are available as spreadsheets in the AEMET data catalog, and contain more than twenty types of measurements per station. Spreadsheets are retrieved from the website, processed with Python scripts, transformed to RDF according to an ontology network about meteorology that reuses the W3C SSN Ontology, published in a triple store, and visualized in maps with Map4rdf.
Abstract:
Linked Data is not always published with a license. Sometimes the wrong type of license is used, such as a software license, or the license is not expressed in a standard, machine-readable manner. Yet Linked Data resources may be subject to intellectual property and database laws, may contain personal data subject to privacy restrictions, or may even contain important trade secrets. The proper declaration of which rights are held, waived or licensed is a must for the lawful use of Linked Data at its different granularity levels, from the simple RDF statement to a dataset or a mapping. After comparing the current practice with the actual needs, six research questions are posed.
Abstract:
The use of semantic and Linked Data technologies for Enterprise Application Integration (EAI) has been increasing in recent years. Linked Data and Semantic Web technologies such as the Resource Description Framework (RDF) data model provide several key advantages over the current de facto Web Service and XML-based integration approaches. The flexibility provided by representing the data in a more versatile RDF model using ontologies enables avoiding complex schema transformations and makes data more accessible using Web standards, preventing the formation of data silos. These three benefits represent an edge for Linked Data-based EAI. However, work still has to be performed so that these technologies can cope with the particularities of EAI scenarios in terms such as data control, ownership, consistency, or accuracy. The first part of the paper provides an introduction to Enterprise Application Integration using Linked Data and the requirements imposed by EAI on Linked Data technologies. It then focuses on one of the problems that arise in this scenario, the coreference problem, and presents a coreference service that supports the use of Linked Data in EAI systems. The proposed solution introduces the use of a context that aggregates a set of related identities and mappings from the identities to different resources that reside in distinct applications and provide different views or aspects of the same entity. A detailed architecture of the Coreference Service is presented, explaining how it can be used to manage the contexts, identities, resources, and applications to which they relate. The paper shows how the proposed service can be utilized in an EAI scenario using an example involving a dashboard that integrates data from different systems and the proposed workflow for registering and resolving identities. 
As most enterprise applications are driven by business processes and involve legacy data, the proposed approach can be easily incorporated into enterprise applications.
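The context described in this abstract, which aggregates identities and maps each one to resources held by different applications, can be pictured with a small data structure. The sketch below is only one possible reading of that description: the `Context` class, its `register` and `resolve` methods, and the example IRIs are hypothetical and are not the Coreference Service's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A named group of identities with per-application resource mappings."""

    name: str
    # identity -> {application name -> resource IRI}
    mappings: dict = field(default_factory=dict)

    def register(self, identity, application, resource):
        """Record that `application` holds `resource` as a view of `identity`."""
        self.mappings.setdefault(identity, {})[application] = resource

    def resolve(self, identity, application=None):
        """Return all known resources for an identity, or one application's."""
        resources = self.mappings.get(identity, {})
        if application is None:
            return dict(resources)
        return resources.get(application)


# Hypothetical example: one customer seen by two enterprise systems.
ctx = Context("customers")
ctx.register("customer:42", "crm", "http://crm.example.org/contacts/42")
ctx.register("customer:42", "billing", "http://billing.example.org/accounts/A-17")
billing_view = ctx.resolve("customer:42", "billing")
all_views = ctx.resolve("customer:42")
```

A dashboard of the kind mentioned in the abstract could call `resolve` once per identity and then fetch each application's view of the same entity.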
Abstract:
In this demo paper we describe an iOS-based application that visualizes live bus transport data in Madrid from static and streaming RDF endpoints, reusing the Web services provided by the city's bus transport authority and wrapping them using SPARQLStream.
Abstract:
In this article, we argue that there is a growing number of linked datasets in different natural languages, and that there is a need for guidelines and mechanisms to ensure the quality and organic growth of this emerging multilingual data network. However, we have little knowledge regarding the actual state of this data network, its current practices, and the open challenges that it poses. Questions regarding the distribution of natural languages, the links that are established across data in different languages, or how linguistic features are represented, remain mostly unanswered. Addressing these and other language-related issues can help to identify existing problems, propose new mechanisms and guidelines or adapt the ones in use for publishing linked data including language-related features, and, ultimately, provide metrics to evaluate quality aspects. In this article we review, discuss, and extend current guidelines for publishing linked data by focusing on those methods, techniques and tools that can help RDF publishers to cope with language barriers. Whenever possible, we will illustrate and discuss each of these guidelines, methods, and tools on the basis of practical examples that we have encountered in the publication of the datos.bne.es dataset.
Abstract:
In this paper we describe the specification of a model for the semantically interoperable representation of language resources for sentiment analysis. The model integrates "lemon", an RDF-based model for the specification of ontology-lexica (Buitelaar et al. 2009), which is used increasingly for the representation of language resources as Linked Data, with Marl, an RDF-based model for the representation of sentiment annotations (Westerski et al., 2011; Sánchez-Rada et al., 2013).
Abstract:
Within the European Union, member states are setting up official data catalogues as entry points to access PSI (Public Sector Information). In this context, it is important to describe the metadata of these data portals, i.e., of data catalogs, and allow for interoperability among them. To tackle these issues, the Government Linked Data Working Group developed DCAT (Data Catalog Vocabulary), an RDF vocabulary for describing the metadata of data catalogs. This topic report analyzes the current use of the DCAT vocabulary in several European data catalogs and proposes some recommendations to deal with an inconsistent use of the metadata across countries. The enrichment of such metadata vocabularies with multilingual descriptions, as well as an account for cultural divergences, is seen as a necessary step to guarantee interoperability and ensure wider adoption.
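As a small illustration of what a DCAT description with multilingual titles might look like, the sketch below emits N-Triples for one catalog and one dataset. The DCAT and Dublin Core Terms IRIs are the published vocabulary terms; the `describe_dataset` helper, the example IRIs, and the titles are hypothetical and do not come from the report.

```python
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe_dataset(catalog, dataset, titles):
    """Return N-Triples lines linking a dataset to a catalog.

    `titles` maps a language tag (e.g. "en") to the title in that language.
    """
    lines = [
        f"<{catalog}> <{RDF_TYPE}> <{DCAT}Catalog> .",
        f"<{catalog}> <{DCAT}dataset> <{dataset}> .",
        f"<{dataset}> <{RDF_TYPE}> <{DCAT}Dataset> .",
    ]
    # One language-tagged title per language, in a stable order.
    for lang, title in sorted(titles.items()):
        escaped = title.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{dataset}> <{DCT}title> "{escaped}"@{lang} .')
    return lines


# Hypothetical catalog entry with titles in two languages.
lines = describe_dataset(
    "http://example.org/catalog",
    "http://example.org/dataset/1",
    {"es": "Calidad del aire", "en": "Air quality"},
)
```

Attaching one language-tagged literal per language to the same property is one common way to provide the multilingual descriptions that the report recommends.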
Abstract:
The W3C Linked Data Platform (LDP) candidate recommendation defines a standard HTTP-based protocol for read/write Linked Data. The W3C R2RML recommendation defines a language to map relational databases (RDBs) and RDF. This paper presents morph-LDP, a novel system that combines these two W3C standardization initiatives to expose relational data as read/write Linked Data for LDP-aware applications, whilst allowing legacy applications to continue using their relational databases.
Abstract:
The Web of Data currently comprises approximately 62 billion triples from more than 2,000 different datasets covering many fields of knowledge. This volume of structured Linked Data can be seen as a particular case of Big Data, referred to as Big Semantic Data [4]. Obviously, powerful computational configurations are traditionally required to deal with the scalability problems arising from Big Semantic Data. It is not surprising that this "data revolution" has competed in parallel with the growth of mobile computing. Smartphones and tablets are massively used at the expense of traditional computers but, to date, mobile devices have more limited computational resources. Therefore, one question that we may ask ourselves is: can (potentially large) semantic datasets be consumed natively on mobile devices? Currently, only a few mobile apps (e.g., [1, 9, 2, 8]) make use of semantic data that they store on the mobile device, while many others access existing SPARQL endpoints or Linked Data directly. Two main reasons can be considered for this fact. On the one hand, in spite of some initial approaches [6, 3], there are no well-established triplestores for mobile devices. This is an important limitation because any potential app must take on both RDF storage and SPARQL resolution. On the other hand, the particular features of these devices (little storage space, less computational power or more limited bandwidth) limit the adoption of semantic data for different uses and purposes. This paper introduces our HDTourist mobile application prototype. It consumes urban data from DBpedia to help tourists visiting a foreign city. Although it is a simple app, its functionality illustrates how semantic data can be stored and queried with limited resources. Our prototype is implemented for Android, but its foundations, explained in Section 2, can be deployed on any other platform. 
The app is described in Section 3, and Section 4 concludes with our current achievements and outlines future work.