241 resultados para RDF Reification
Resumo:
The use of semantic and Linked Data technologies for Enterprise Application Integration (EAI) is increasing in recent years. Linked Data and Semantic Web technologies such as the Resource Description Framework (RDF) data model provide several key advantages over the current de-facto Web Service and XML based integration approaches. The flexibility provided by representing the data in a more versatile RDF model using ontologies enables avoiding complex schema transformations and makes data more accessible using Web standards, preventing the formation of data silos. These three benefits represent an edge for Linked Data-based EAI. However, work still has to be performed so that these technologies can cope with the particularities of the EAI scenarios in different terms, such as data control, ownership, consistency, or accuracy. The first part of the paper provides an introduction to Enterprise Application Integration using Linked Data and the requirements imposed by EAI to Linked Data technologies focusing on one of the problems that arise in this scenario, the coreference problem, and presents a coreference service that supports the use of Linked Data in EAI systems. The proposed solution introduces the use of a context that aggregates a set of related identities and mappings from the identities to different resources that reside in distinct applications and provide different views or aspects of the same entity. A detailed architecture of the Coreference Service is presented explaining how it can be used to manage the contexts, identities, resources, and applications which they relate to. The paper shows how the proposed service can be utilized in an EAI scenario using an example involving a dashboard that integrates data from different systems and the proposed workflow for registering and resolving identities. As most enterprise applications are driven by business processes and involve legacy data, the proposed approach can be easily incorporated into enterprise applications.
Resumo:
Los sistemas técnicos son cada vez más complejos, incorporan funciones más avanzadas, están más integrados con otros sistemas y trabajan en entornos menos controlados. Todo esto supone unas condiciones más exigentes y con mayor incertidumbre para los sistemas de control, a los que además se demanda un comportamiento más autónomo y fiable. La adaptabilidad de manera autónoma es un reto para tecnologías de control actualmente. El proyecto de investigación ASys propone abordarlo trasladando la responsabilidad de la capacidad de adaptación del sistema de los ingenieros en tiempo de diseño al propio sistema en operación. Esta tesis pretende avanzar en la formulación y materialización técnica de los principios de ASys de cognición y auto-consciencia basadas en modelos y autogestión de los sistemas en tiempo de operación para una autonomía robusta. Para ello el trabajo se ha centrado en la capacidad de auto-conciencia, inspirada en los sistemas biológicos, y se ha explorado la posibilidad de integrarla en la arquitectura de los sistemas de control. Además de la auto-consciencia, se han explorado otros temas relevantes: modelado funcional, modelado de software, tecnología de los patrones, tecnología de componentes, tolerancia a fallos. Se ha analizado el estado de la técnica en los ámbitos pertinentes para las cuestiones de la auto-consciencia y la adaptabilidad en sistemas técnicos: arquitecturas cognitivas, control tolerante a fallos, y arquitecturas software dinámicas y computación autonómica. El marco teórico de ASys existente de sistemas autónomos cognitivos ha sido adaptado para servir de base para este análisis de autoconsciencia y adaptación y para dar sustento conceptual al posterior desarrollo de la solución. La tesis propone una solución general de diseño para la construcción de sistemas autónomos auto-conscientes. La idea central es la integración de un meta-controlador en la arquitectura de control del sistema autónomo, capaz de percibir la estado funcional del sistema de control y, si es necesario, reconfigurarlo en tiempo de operación. Esta solución de metacontrol se ha formalizado en cuatro patrones de diseño: i) el Patrón Metacontrol, que define la integración de un subsistema de metacontrol, responsable de controlar al propio sistema de control a través de la interfaz proporcionada por su plataforma de componentes, ii) el patrón Bucle de Control Epistémico, que define un bucle de control cognitivo basado en el modelos y que se puede aplicar al diseño del metacontrol, iii) el patrón de Reflexión basada en Modelo Profundo propone una solución para construir el modelo ejecutable utilizado por el meta-controlador mediante una transformación de modelo a modelo a partir del modelo de ingeniería del sistema, y, finalmente, iv) el Patrón Metacontrol Funcional, que estructura el meta-controlador en dos bucles, uno para el control de la configuración de los componentes del sistema de control, y otro sobre éste, controlando las funciones que realiza dicha configuración de componentes; de esta manera las consideraciones funcionales y estructurales se desacoplan. La Arquitectura OM y el metamodelo TOMASys son las piezas centrales del marco arquitectónico desarrollado para materializar la solución compuesta de los patrones anteriores. El metamodelo TOMASys ha sido desarrollado para la representación de la estructura y su relación con los requisitos funcionales de cualquier sistema autónomo. La Arquitectura OM es un patrón de referencia para la construcción de una metacontrolador integrando los patrones de diseño propuestos. Este meta-controlador se puede integrar en la arquitectura de cualquier sistema control basado en componentes. El elemento clave de su funcionamiento es un modelo TOMASys del sistema decontrol, que el meta-controlador usa para monitorizarlo y calcular las acciones de reconfiguración necesarias para adaptarlo a las circunstancias en cada momento. Un proceso de ingeniería, complementado con otros recursos, ha sido elaborado para guiar la aplicación del marco arquitectónico OM. Dicho Proceso de Ingeniería OM define la metodología a seguir para construir el subsistema de metacontrol para un sistema autónomo a partir del modelo funcional del mismo. La librería OMJava proporciona una implementación del meta-controlador OM que se puede integrar en el control de cualquier sistema autónomo, independientemente del dominio de la aplicación o de su tecnología de implementación. Para concluir, la solución completa ha sido validada con el desarrollo de un robot móvil autónomo que incorpora un meta-controlador con la Arquitectura OM. Las propiedades de auto-consciencia y adaptación proporcionadas por el meta-controlador han sido validadas en diferentes escenarios de operación del robot, en los que el sistema era capaz de sobreponerse a fallos en el sistema de control mediante reconfiguraciones orquestadas por el metacontrolador. ABSTRACT Technical systems are becoming more complex, they incorporate more advanced functionalities, they are more integrated with other systems and they are deployed in less controlled environments. All this supposes a more demanding and uncertain scenario for control systems, which are also required to be more autonomous and dependable. Autonomous adaptivity is a current challenge for extant control technologies. The ASys research project proposes to address it by moving the responsibility for adaptivity from the engineers at design time to the system at run-time. This thesis has intended to advance in the formulation and technical reification of ASys principles of model-based self-cognition and having systems self-handle at runtime for robust autonomy. For that it has focused on the biologically inspired capability of self-awareness, and explored the possibilities to embed it into the very architecture of control systems. Besides self-awareness, other themes related to the envisioned solution have been explored: functional modeling, software modeling, patterns technology, components technology, fault tolerance. The state of the art in fields relevant for the issues of self-awareness and adaptivity has been analysed: cognitive architectures, fault-tolerant control, and software architectural reflection and autonomic computing. The extant and evolving ASys Theoretical Framework for cognitive autonomous systems has been adapted to provide a basement for this selfhood-centred analysis and to conceptually support the subsequent development of our solution. The thesis proposes a general design solution for building self-aware autonomous systems. Its central idea is the integration of a metacontroller in the control architecture of the autonomous system, capable of perceiving the functional state of the control system and reconfiguring it if necessary at run-time. This metacontrol solution has been formalised into four design patterns: i) the Metacontrol Pattern, which defines the integration of a metacontrol subsystem, controlling the domain control system through an interface provided by its implementation component platform, ii) the Epistemic Control Loop pattern, which defines a modelbased cognitive control loop that can be applied to the design of such a metacontroller, iii) the Deep Model Reflection pattern proposes a solution to produce the online executable model used by the metacontroller by model-to-model transformation from the engineering model, and, finally, iv) the Functional Metacontrol pattern, which proposes to structure the metacontroller in two loops, one for controlling the configuration of components of the controller, and another one on top of the former, controlling the functions being realised by that configuration; this way the functional and structural concerns become decoupled. The OM Architecture and the TOMASys metamodel are the core pieces of the architectural framework developed to reify this patterned solution. The TOMASys metamodel has been developed for representing the structure and its relation to the functional requirements of any autonomous system. The OM architecture is a blueprint for building a metacontroller according to the patterns. This metacontroller can be integrated on top of any component-based control architecture. At the core of its operation lies a TOMASys model of the control system. An engineering process and accompanying assets have been constructed to complete and exploit the architectural framework. The OM Engineering Process defines the process to follow to develop the metacontrol subsystem from the functional model of the controller of the autonomous system. The OMJava library provides a domain and application-independent implementation of an OM Metacontroller than can be used in the implementation phase of OMEP. Finally, the complete solution has been validated in the development of an autonomous mobile robot that incorporates an OM metacontroller. The functional selfawareness and adaptivity properties achieved thanks to the metacontrol system have been validated in different scenarios. In these scenarios the robot was able to overcome failures in the control system thanks to reconfigurations performed by the metacontroller.
Resumo:
In this demo paper we describe an iOS-based application that allows visualizing live bus transport data in Madrid from static and streaming RDF endpoints, reusing the Web services provided by the bus transport authority in the city and wrapping them using SPARQLStream
Resumo:
In this article, we argue that there is a growing number of linked datasets in different natural languages, and that there is a need for guidelines and mechanisms to ensure the quality and organic growth of this emerging multilingual data network. However, we have little knowledge regarding the actual state of this data network, its current practices, and the open challenges that it poses. Questions regarding the distribution of natural languages, the links that are established across data in different languages, or how linguistic features are represented, remain mostly unanswered. Addressing these and other language-related issues can help to identify existing problems, propose new mechanisms and guidelines or adapt the ones in use for publishing linked data including language-related features, and, ultimately, provide metrics to evaluate quality aspects. In this article we review, discuss, and extend current guidelines for publishing linked data by focusing on those methods, techniques and tools that can help RDF publishers to cope with language barriers. Whenever possible, we will illustrate and discuss each of these guidelines, methods, and tools on the basis of practical examples that we have encountered in the publication of the datos.bne.es dataset.
Resumo:
In this paper we describe the specification of amodel for the semantically interoperable representation of language resources for sentiment analysis. The model integrates "lemon", an RDF-based model for the specification of ontology-lexica (Buitelaar et al. 2009), which is used increasinglyfor the representation of language resources asLinked Data, with Marl, an RDF-based model for the representation of sentiment annotations (West-erski et al., 2011; Sánchez-Rada et al., 2013)
Resumo:
Within the European Union, member states are setting up official data catalogues as entry points to access PSI (Public Sector Information). In this context, it is important to describe the metadata of these data portals, i.e., of data catalogs, and allow for interoperability among them. To tackle these issues, the Government Linked Data Working Group developed DCAT (Data Catalog Vocabulary), an RDF vocabulary for describing the metadata of data catalogs. This topic report analyzes the current use of the DCAT vocabulary in several European data catalogs and proposes some recommendations to deal with an inconsistent use of the metadata across countries. The enrichment of such metadata vocabularies with multilingual descriptions, as well as an account for cultural divergences, is seen as a necessary step to guarantee interoperability and ensure wider adoption.
Resumo:
The W3C Linked Data Platform (LDP) candidate recom- mendation defines a standard HTTP-based protocol for read/write Linked Data. The W3C R2RML recommendation defines a language to map re- lational databases (RDBs) and RDF. This paper presents morph-LDP, a novel system that combines these two W3C standardization initiatives to expose relational data as read/write Linked Data for LDP-aware ap- plications, whilst allowing legacy applications to continue using their relational databases.
Resumo:
The Web of Data currently comprises ? 62 billion triples from more than 2,000 different datasets covering many fields of knowledge3. This volume of structured Linked Data can be seen as a particular case of Big Data, referred to as Big Semantic Data [4]. Obviously, powerful computational configurations are tradi- tionally required to deal with the scalability problems arising to Big Semantic Data. It is not surprising that this ?data revolution? has competed in parallel with the growth of mobile computing. Smartphones and tablets are massively used at the expense of traditional computers but, to date, mobile devices have more limited computation resources. Therefore, one question that we may ask ourselves would be: can (potentially large) semantic datasets be consumed natively on mobile devices? Currently, only a few mobile apps (e.g., [1, 9, 2, 8]) make use of semantic data that they store in the mobile devices, while many others access existing SPARQL endpoints or Linked Data directly. Two main reasons can be considered for this fact. On the one hand, in spite of some initial approaches [6, 3], there are no well-established triplestores for mobile devices. This is an important limitation because any po- tential app must assume both RDF storage and SPARQL resolution. On the other hand, the particular features of these devices (little storage space, less computational power or more limited bandwidths) limit the adoption of seman- tic data for different uses and purposes. This paper introduces our HDTourist mobile application prototype. It con- sumes urban data from DBpedia4 to help tourists visiting a foreign city. Although it is a simple app, its functionality allows illustrating how semantic data can be stored and queried with limited resources. Our prototype is implemented for An- droid, but its foundations, explained in Section 2, can be deployed in any other platform. The app is described in Section 3, and Section 4 concludes about our current achievements and devises the future work.
Resumo:
Linked Data assets (RDF triples, graphs, datasets, mappings...) can be object of protection by the intellectual property law, the database law or its access or publication be restricted by other legal reasons (personal data pro- tection, security reasons, etc.). Publishing a rights expression along with the digital asset, allows the rightsholder waiving some or all of the IP and database rights (leaving the work in the public domain), permitting some operations if certain conditions are satisfied (like giving attribution to the author) or simply reminding the audience that some rights are reserved.
Resumo:
R2RML is used to specify transformations of data available in relational databases into materialised or virtual RDF datasets. SPARQL queries evaluated against virtual datasets are translated into SQL queries according to the R2RML mappings, so that they can be evaluated over the underlying relational database engines. In this paper we describe an extension of a well-known algorithm for SPARQL to SQL translation, originally formalised for RDBMS-backed triple stores, that takes into account R2RML mappings. We present the result of our implementation using queries from a synthetic benchmark and from three real use cases, and show that SPARQL queries can be in general evaluated as fast as the SQL queries that would have been generated by SQL experts if no R2RML mappings had been used.
Resumo:
This paper presents a Focused Crawler in order to Get Semantic Web Resources (CSR). Structured data web are available in formats such as Extensible Markup Language (XML), Resource Description Framework (RDF) and Ontology Web Language (OWL) that can be used for processing. One of the main challenges for performing a manual search and download semantic web resources is that this task consumes a lot of time. Our research work propose a focused crawler which allow to download these resources automatically and store them on disk in order to have a collection that will be used for data processing. CRS consists of three layers: (a) The User Interface Layer, (b) The Focus Crawler Layer and (c) The Base Crawler Layer. CSR uses as a selection policie the Shark-Search method. CSR was conducted with two experiments. The first one starts on December 15 2012 at 7:11 am and ends on December 16 2012 at 4:01 were obtained 448,123,537 bytes of data. The CSR ends by itself after to analyze 80,4375 seeds with an unlimited depth. CSR got 16,576 semantic resources files where the 89 % was RDF, the 10 % was XML and the 1% was OWL. The second one was based on the Web Data Commons work of the Research Group Data and Web Science at the University of Mannheim and the Institute AIFB at the Karlsruhe Institute of Technology. This began at 4:46 am of June 2 2013 and 1:37 am June 9 2013. After 162.51 hours of execution the result was 285,279 semantic resources where predominated the XML resources with 99 % and OWL and RDF with 1 % each one.
Resumo:
In this paper we present a dataset componsed of domain-specific sentiment lexicons in six languages for two domains. We used existing collections of reviews from Trip Advisor, Amazon, the Stanford Network Analysis Project and the OpinRank Review Dataset. We use an RDF model based on the lemon and Marl formats to represent the lexicons. We describe the methodology that we applied to generate the domain-specific lexicons and we provide access information to our datasets.
Resumo:
Extracting opinions and emotions from text is becoming increasingly important, especially since the advent of micro-blogging and social networking. Opinion mining is particularly popular and now gathers many public services, datasets and lexical resources. Unfortunately, there are few available lexical and semantic resources for emotion recognition that could foster the development of new emotion aware services and applications. The diversity of theories of emotion and the absence of a common vocabulary are two of the main barriers to the development of such resources. This situation motivated the creation of Onyx, a semantic vocabulary of emotions with a focus on lexical resources and emotion analysis services. It follows a linguistic Linked Data approach, it is aligned with the Provenance Ontology, and it has been integrated with the Lexicon Model for Ontologies (lemon), a popular RDF model for representing lexical entries. This approach also means a new and interesting way to work with different theories of emotion. As part of this work, Onyx has been aligned with EmotionML and WordNet-Affect.
Resumo:
En los últimos años ha habido un gran aumento de fuentes de datos biomédicos. La aparición de nuevas técnicas de extracción de datos genómicos y generación de bases de datos que contienen esta información ha creado la necesidad de guardarla para poder acceder a ella y trabajar con los datos que esta contiene. La información contenida en las investigaciones del campo biomédico se guarda en bases de datos. Esto se debe a que las bases de datos permiten almacenar y manejar datos de una manera simple y rápida. Dentro de las bases de datos existen una gran variedad de formatos, como pueden ser bases de datos en Excel, CSV o RDF entre otros. Actualmente, estas investigaciones se basan en el análisis de datos, para a partir de ellos, buscar correlaciones que permitan inferir, por ejemplo, tratamientos nuevos o terapias más efectivas para una determinada enfermedad o dolencia. El volumen de datos que se maneja en ellas es muy grande y dispar, lo que hace que sea necesario el desarrollo de métodos automáticos de integración y homogeneización de los datos heterogéneos. El proyecto europeo p-medicine (FP7-ICT-2009-270089) tiene como objetivo asistir a los investigadores médicos, en este caso de investigaciones relacionadas con el cáncer, proveyéndoles con nuevas herramientas para el manejo de datos y generación de nuevo conocimiento a partir del análisis de los datos gestionados. La ingestión de datos en la plataforma de p-medicine, y el procesamiento de los mismos con los métodos proporcionados, buscan generar nuevos modelos para la toma de decisiones clínicas. Dentro de este proyecto existen diversas herramientas para integración de datos heterogéneos, diseño y gestión de ensayos clínicos, simulación y visualización de tumores y análisis estadístico de datos. Precisamente en el ámbito de la integración de datos heterogéneos surge la necesidad de añadir información externa al sistema proveniente de bases de datos públicas, así como relacionarla con la ya existente mediante técnicas de integración semántica. Para resolver esta necesidad se ha creado una herramienta, llamada Term Searcher, que permite hacer este proceso de una manera semiautomática. En el trabajo aquí expuesto se describe el desarrollo y los algoritmos creados para su correcto funcionamiento. Esta herramienta ofrece nuevas funcionalidades que no existían dentro del proyecto para la adición de nuevos datos provenientes de fuentes públicas y su integración semántica con datos privados.---ABSTRACT---Over the last few years, there has been a huge growth of biomedical data sources. The emergence of new techniques of genomic data generation and data base generation that contain this information, has created the need of storing it in order to access and work with its data. The information employed in the biomedical research field is stored in databases. This is due to the capability of databases to allow storing and managing data in a quick and simple way. Within databases there is a variety of formats, such as Excel, CSV or RDF. Currently, these biomedical investigations are based on data analysis, which lead to the discovery of correlations that allow inferring, for example, new treatments or more effective therapies for a specific disease or ailment. The volume of data handled in them is very large and dissimilar, which leads to the need of developing new methods for automatically integrating and homogenizing the heterogeneous data. The p-medicine (FP7-ICT-2009-270089) European project aims to assist medical researchers, in this case related to cancer research, providing them with new tools for managing and creating new knowledge from the analysis of the managed data. The ingestion of data into the platform and its subsequent processing with the provided tools aims to enable the generation of new models to assist in clinical decision support processes. Inside this project, there exist different tools related to areas such as the integration of heterogeneous data, the design and management of clinical trials, simulation and visualization of tumors and statistical data analysis. Particularly in the field of heterogeneous data integration, there is a need to add external information from public databases, and relate it to the existing ones through semantic integration methods. To solve this need a tool has been created: the term Searcher. This tool aims to make this process in a semiautomatic way. This work describes the development of this tool and the algorithms employed in its operation. This new tool provides new functionalities that did not exist inside the p-medicine project for adding new data from public databases and semantically integrate them with private data.
Resumo:
RDB to RDF Mapping Language (R2RML) es una recomendación del W3C que permite especificar reglas para transformar bases de datos relacionales a RDF. Estos datos en RDF se pueden materializar y almacenar en un sistema gestor de tripletas RDF (normalmente conocidos con el nombre triple store), en el cual se pueden evaluar consultas SPARQL. Sin embargo, hay casos en los cuales la materialización no es adecuada o posible, por ejemplo, cuando la base de datos se actualiza frecuentemente. En estos casos, lo mejor es considerar los datos en RDF como datos virtuales, de tal manera que las consultas SPARQL anteriormente mencionadas se traduzcan a consultas SQL que se pueden evaluar sobre los sistemas gestores de bases de datos relacionales (SGBD) originales. Para esta traducción se tienen en cuenta los mapeos R2RML. La primera parte de esta tesis se centra en la traducción de consultas. Se propone una formalización de la traducción de SPARQL a SQL utilizando mapeos R2RML. Además se proponen varias técnicas de optimización para generar consultas SQL que son más eficientes cuando son evaluadas en sistemas gestores de bases de datos relacionales. Este enfoque se evalúa mediante un benchmark sintético y varios casos reales. Otra recomendación relacionada con R2RML es la conocida como Direct Mapping (DM), que establece reglas fijas para la transformación de datos relacionales a RDF. A pesar de que ambas recomendaciones se publicaron al mismo tiempo, en septiembre de 2012, todavía no se ha realizado un estudio formal sobre la relación entre ellas. Por tanto, la segunda parte de esta tesis se centra en el estudio de la relación entre R2RML y DM. Se divide este estudio en dos partes: de R2RML a DM, y de DM a R2RML. En el primer caso, se estudia un fragmento de R2RML que tiene la misma expresividad que DM. En el segundo caso, se representan las reglas de DM como mapeos R2RML, y también se añade la semántica implícita (relaciones de subclase, 1-N y M-N) que se puede encontrar codificada en la base de datos. Esta tesis muestra que es posible usar R2RML en casos reales, sin necesidad de realizar materializaciones de los datos, puesto que las consultas SQL generadas son suficientemente eficientes cuando son evaluadas en el sistema gestor de base de datos relacional. Asimismo, esta tesis profundiza en el entendimiento de la relación existente entre las dos recomendaciones del W3C, algo que no había sido estudiado con anterioridad. ABSTRACT. RDB to RDF Mapping Language (R2RML) is a W3C recommendation that allows specifying rules for transforming relational databases into RDF. This RDF data can be materialized and stored in a triple store, so that SPARQL queries can be evaluated by the triple store. However, there are several cases where materialization is not adequate or possible, for example, if the underlying relational database is updated frequently. In those cases, RDF data is better kept virtual, and hence SPARQL queries over it have to be translated into SQL queries to the underlying relational database system considering that the translation process has to take into account the specified R2RML mappings. The first part of this thesis focuses on query translation. We discuss the formalization of the translation from SPARQL to SQL queries that takes into account R2RML mappings. Furthermore, we propose several optimization techniques so that the translation procedure generates SQL queries that can be evaluated more efficiently over the underlying databases. We evaluate our approach using a synthetic benchmark and several real cases, and show positive results that we obtained. Direct Mapping (DM) is another W3C recommendation for the generation of RDF data from relational databases. While R2RML allows users to specify their own transformation rules, DM establishes fixed transformation rules. Although both recommendations were published at the same time, September 2012, there has not been any study regarding the relationship between them. The second part of this thesis focuses on the study of the relationship between R2RML and DM. We divide this study into two directions: from R2RML to DM, and from DM to R2RML. From R2RML to DM, we study a fragment of R2RML having the same expressive power than DM. From DM to R2RML, we represent DM transformation rules as R2RML mappings, and also add the implicit semantics encoded in databases, such as subclass, 1-N and N-N relationships. This thesis shows that by formalizing and optimizing R2RML-based SPARQL to SQL query translation, it is possible to use R2RML engines in real cases as the resulting SQL is efficient enough to be evaluated by the underlying relational databases. In addition to that, this thesis facilitates the understanding of bidirectional relationship between the two W3C recommendations, something that had not been studied before.