10 resultados para SemanticWeb
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Informática
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Electrotécnica e de Computadores
Resumo:
Hybrid knowledge bases are knowledge bases that combine ontologies with non-monotonic rules, allowing to join the best of both open world ontologies and close world rules. Ontologies shape a good mechanism to share knowledge on theWeb that can be understood by both humans and machines, on the other hand rules can be used, e.g., to encode legal laws or to do a mapping between sources of information. Taking into account the dynamics present today on the Web, it is important for these hybrid knowledge bases to capture all these dynamics and thus adapt themselves. To achieve that, it is necessary to create mechanisms capable of monitoring the information flow present on theWeb. Up to today, there are no such mechanisms that allow for monitoring events and performing modifications of hybrid knowledge bases autonomously. The goal of this thesis is then to create a system that combine these hybrid knowledge bases with reactive rules, aiming to monitor events and perform actions over a knowledge base. To achieve this goal, a reactive system for the SemanticWeb is be developed in a logic-programming based approach accompanied with a language for heterogeneous rule base evolution having as its basis RIF Production Rule Dialect, which is a standard for exchanging rules over theWeb.
Resumo:
Lo scopo della tesi è definire un modello e identificare un sistema d’inferenza utile per l’analisi della qualità dei dati. Partendo da quanto descritto in ambito accademico e business, la definizione di un modello facilita l’analisi della qualità, fornendo una descrizione chiara delle tipologie di problemi a cui possono essere soggetti i dati. I diversi lavori in ambito accademico e business saranno confrontati per stabilire quali siano i problemi di qualità più diffusi, in modo da realizzare un modello che sia semplice e riutilizzabile. I sistemi d’inferenza saranno confrontati a livello teorico e pratico per individuare lo strumento più adatto nell’analisi della qualità dei dati in un caso applicativo. Il caso applicativo è caratterizzato da requisiti funzionali e non; il principale requisito funzionale è l’individuazione di problemi di qualità nei dati, mentre quello non funzionale è l’ usabilità dello strumento, per permettere ad un qualunque utente di esprimere dei controlli sulla qualità. Il caso applicativo considera dati di un’enterprise architecture reale ed è stato fornito dall’azienda Imola Informatica.
Resumo:
This PhD thesis contributes to the problem of resource and service discovery in the context of the composable web. In the current web, mashup technologies allow developers reusing services and contents to build new web applications. However, developers face a problem of information flood when searching for appropriate services or resources for their combination. To contribute to overcoming this problem, a framework is defined for the discovery of services and resources. In this framework, three levels are defined for performing discovery at content, discovery and agente levels. The content level involves the information available in web resources. The web follows the Representational Stateless Transfer (REST) architectural style, in which resources are returned as representations from servers to clients. These representations usually employ the HyperText Markup Language (HTML), which, along with Content Style Sheets (CSS), describes the markup employed to render representations in a web browser. Although the use of SemanticWeb standards such as Resource Description Framework (RDF) make this architecture suitable for automatic processes to use the information present in web resources, these standards are too often not employed, so automation must rely on processing HTML. This process, often referred as Screen Scraping in the literature, is the content discovery according to the proposed framework. At this level, discovery rules indicate how the different pieces of data in resources’ representations are mapped onto semantic entities. By processing discovery rules on web resources, semantically described contents can be obtained out of them. The service level involves the operations that can be performed on the web. The current web allows users to perform different tasks such as search, blogging, e-commerce, or social networking. To describe the possible services in RESTful architectures, a high-level feature-oriented service methodology is proposed at this level. This lightweight description framework allows defining service discovery rules to identify operations in interactions with REST resources. The discovery is thus performed by applying discovery rules to contents discovered in REST interactions, in a novel process called service probing. Also, service discovery can be performed by modelling services as contents, i.e., by retrieving Application Programming Interface (API) documentation and API listings in service registries such as ProgrammableWeb. For this, a unified model for composable components in Mashup-Driven Development (MDD) has been defined after the analysis of service repositories from the web. The agent level involves the orchestration of the discovery of services and contents. At this level, agent rules allow to specify behaviours for crawling and executing services, which results in the fulfilment of a high-level goal. Agent rules are plans that allow introspecting the discovered data and services from the web and the knowledge present in service and content discovery rules to anticipate the contents and services to be found on specific resources from the web. By the definition of plans, an agent can be configured to target specific resources. The discovery framework has been evaluated on different scenarios, each one covering different levels of the framework. Contenidos a la Carta project deals with the mashing-up of news from electronic newspapers, and the framework was used for the discovery and extraction of pieces of news from the web. Similarly, in Resulta and VulneraNET projects the discovery of ideas and security knowledge in the web is covered, respectively. The service level is covered in the OMELETTE project, where mashup components such as services and widgets are discovered from component repositories from the web. The agent level is applied to the crawling of services and news in these scenarios, highlighting how the semantic description of rules and extracted data can provide complex behaviours and orchestrations of tasks in the web. The main contributions of the thesis are the unified framework for discovery, which allows configuring agents to perform automated tasks. Also, a scraping ontology has been defined for the construction of mappings for scraping web resources. A novel first-order logic rule induction algorithm is defined for the automated construction and maintenance of these mappings out of the visual information in web resources. Additionally, a common unified model for the discovery of services is defined, which allows sharing service descriptions. Future work comprises the further extension of service probing, resource ranking, the extension of the Scraping Ontology, extensions of the agent model, and contructing a base of discovery rules. Resumen La presente tesis doctoral contribuye al problema de descubrimiento de servicios y recursos en el contexto de la web combinable. En la web actual, las tecnologías de combinación de aplicaciones permiten a los desarrolladores reutilizar servicios y contenidos para construir nuevas aplicaciones web. Pese a todo, los desarrolladores afrontan un problema de saturación de información a la hora de buscar servicios o recursos apropiados para su combinación. Para contribuir a la solución de este problema, se propone un marco de trabajo para el descubrimiento de servicios y recursos. En este marco, se definen tres capas sobre las que se realiza descubrimiento a nivel de contenido, servicio y agente. El nivel de contenido involucra a la información disponible en recursos web. La web sigue el estilo arquitectónico Representational Stateless Transfer (REST), en el que los recursos son devueltos como representaciones por parte de los servidores a los clientes. Estas representaciones normalmente emplean el lenguaje de marcado HyperText Markup Language (HTML), que, unido al estándar Content Style Sheets (CSS), describe el marcado empleado para mostrar representaciones en un navegador web. Aunque el uso de estándares de la web semántica como Resource Description Framework (RDF) hace apta esta arquitectura para su uso por procesos automatizados, estos estándares no son empleados en muchas ocasiones, por lo que cualquier automatización debe basarse en el procesado del marcado HTML. Este proceso, normalmente conocido como Screen Scraping en la literatura, es el descubrimiento de contenidos en el marco de trabajo propuesto. En este nivel, un conjunto de reglas de descubrimiento indican cómo los diferentes datos en las representaciones de recursos se corresponden con entidades semánticas. Al procesar estas reglas sobre recursos web, pueden obtenerse contenidos descritos semánticamente. El nivel de servicio involucra las operaciones que pueden ser llevadas a cabo en la web. Actualmente, los usuarios de la web pueden realizar diversas tareas como búsqueda, blogging, comercio electrónico o redes sociales. Para describir los posibles servicios en arquitecturas REST, se propone en este nivel una metodología de alto nivel para descubrimiento de servicios orientada a funcionalidades. Este marco de descubrimiento ligero permite definir reglas de descubrimiento de servicios para identificar operaciones en interacciones con recursos REST. Este descubrimiento es por tanto llevado a cabo al aplicar las reglas de descubrimiento sobre contenidos descubiertos en interacciones REST, en un nuevo procedimiento llamado sondeo de servicios. Además, el descubrimiento de servicios puede ser llevado a cabo mediante el modelado de servicios como contenidos. Es decir, mediante la recuperación de documentación de Application Programming Interfaces (APIs) y listas de APIs en registros de servicios como ProgrammableWeb. Para ello, se ha definido un modelo unificado de componentes combinables para Mashup-Driven Development (MDD) tras el análisis de repositorios de servicios de la web. El nivel de agente involucra la orquestación del descubrimiento de servicios y contenidos. En este nivel, las reglas de nivel de agente permiten especificar comportamientos para el rastreo y ejecución de servicios, lo que permite la consecución de metas de mayor nivel. Las reglas de los agentes son planes que permiten la introspección sobre los datos y servicios descubiertos, así como sobre el conocimiento presente en las reglas de descubrimiento de servicios y contenidos para anticipar contenidos y servicios por encontrar en recursos específicos de la web. Mediante la definición de planes, un agente puede ser configurado para descubrir recursos específicos. El marco de descubrimiento ha sido evaluado sobre diferentes escenarios, cada uno cubriendo distintos niveles del marco. El proyecto Contenidos a la Carta trata de la combinación de noticias de periódicos digitales, y en él el framework se ha empleado para el descubrimiento y extracción de noticias de la web. De manera análoga, en los proyectos Resulta y VulneraNET se ha llevado a cabo un descubrimiento de ideas y de conocimientos de seguridad, respectivamente. El nivel de servicio se cubre en el proyecto OMELETTE, en el que componentes combinables como servicios y widgets se descubren en repositorios de componentes de la web. El nivel de agente se aplica al rastreo de servicios y noticias en estos escenarios, mostrando cómo la descripción semántica de reglas y datos extraídos permiten proporcionar comportamientos complejos y orquestaciones de tareas en la web. Las principales contribuciones de la tesis son el marco de trabajo unificado para descubrimiento, que permite configurar agentes para realizar tareas automatizadas. Además, una ontología de extracción ha sido definida para la construcción de correspondencias y extraer información de recursos web. Asimismo, un algoritmo para la inducción de reglas de lógica de primer orden se ha definido para la construcción y el mantenimiento de estas correspondencias a partir de la información visual de recursos web. Adicionalmente, se ha definido un modelo común y unificado para el descubrimiento de servicios que permite la compartición de descripciones de servicios. Como trabajos futuros se considera la extensión del sondeo de servicios, clasificación de recursos, extensión de la ontología de extracción y la construcción de una base de reglas de descubrimiento.
Resumo:
Interoperability between semantic technologies is a must because they need to be in communication to interchange ontologies and use them in the distributed and open environment of the SemanticWeb. However, such interoperability is not straightforward due to the high heterogeneity in such technologies. This chapter describes the problem of semantic technology interoperability from two different perspectives. First, from a theoretical perspective by presenting an overview of the different factors that affect interoperability and, second, from a practical perspective by reusing evaluation methods and applying them to six current semantic technologies in order to assess their interoperability.
Resumo:
In this demo paper we describe an iOS-based application that allows visualizing live bus transport data in Madrid from static and streaming RDF endpoints, reusing the Web services provided by the bus transport authority in the city and wrapping them using SPARQLStream
Resumo:
Rights expression languages declare the permitted and prohibited actions to be performed on a resource. Along this work, six rights expression languages are compared, abstracting their commonalities and outlining their underlying pattern. Linked Data, which can be object of protection by the intellectual property laws or its access be restricted by an access control system, can be the asset in rights expressions. The requirements for a pattern for licensing Linked Data resources are listed.
Resumo:
Despite expectations being high, the industrial take-up of Semantic Web technologies in developing services and applications has been slower than expected. One of the main reasons is that many legacy systems have been developed without considering the potential of theWeb in integrating services and sharing resources.Without a systematic methodology and proper tool support, the migration from legacy systems to SemanticWeb Service-based systems can be a tedious and expensive process, which carries a significant risk of failure. There is an urgent need to provide strategies, allowing the migration of legacy systems to Semantic Web Services platforms, and also tools to support such strategies. In this paper we propose a methodology and its tool support for transitioning these applications to Semantic Web Services, which allow users to migrate their applications to Semantic Web Services platforms automatically or semi-automatically. The transition of the GATE system is used as a case study. © 2009 - IOS Press and the authors. All rights reserved.
Resumo:
The Semantic Web relies on carefully structured, well defined, data to allow machines to communicate and understand one another. In many domains (e.g. geospatial) the data being described contains some uncertainty, often due to incomplete knowledge; meaningful processing of this data requires these uncertainties to be carefully analysed and integrated into the process chain. Currently, within the SemanticWeb there is no standard mechanism for interoperable description and exchange of uncertain information, which renders the automated processing of such information implausible, particularly where error must be considered and captured as it propagates through a processing sequence. In particular we adopt a Bayesian perspective and focus on the case where the inputs / outputs are naturally treated as random variables. This paper discusses a solution to the problem in the form of the Uncertainty Markup Language (UncertML). UncertML is a conceptual model, realised as an XML schema, that allows uncertainty to be quantified in a variety of ways i.e. realisations, statistics and probability distributions. UncertML is based upon a soft-typed XML schema design that provides a generic framework from which any statistic or distribution may be created. Making extensive use of Geography Markup Language (GML) dictionaries, UncertML provides a collection of definitions for common uncertainty types. Containing both written descriptions and mathematical functions, encoded as MathML, the definitions within these dictionaries provide a robust mechanism for defining any statistic or distribution and can be easily extended. Universal Resource Identifiers (URIs) are used to introduce semantics to the soft-typed elements by linking to these dictionary definitions. The INTAMAP (INTeroperability and Automated MAPping) project provides a use case for UncertML. This paper demonstrates how observation errors can be quantified using UncertML and wrapped within an Observations & Measurements (O&M) Observation. The interpolation service uses the information within these observations to influence the prediction outcome. The output uncertainties may be encoded in a variety of UncertML types, e.g. a series of marginal Gaussian distributions, a set of statistics, such as the first three marginal moments, or a set of realisations from a Monte Carlo treatment. Quantifying and propagating uncertainty in this way allows such interpolation results to be consumed by other services. This could form part of a risk management chain or a decision support system, and ultimately paves the way for complex data processing chains in the Semantic Web.