283 results for RDF triples
Abstract:
This thesis investigates methods and software architectures for discovering the typical, frequently occurring structures used to organize knowledge on the Web. We identify these structures as Knowledge Patterns (KPs). KP discovery needs to address two main research problems: the heterogeneity of sources, formats and semantics on the Web (i.e., the knowledge soup problem) and the difficulty of drawing a relevant boundary around data so as to capture the knowledge that is meaningful with respect to a certain context (i.e., the knowledge boundary problem). Hence, we introduce two methods that provide different solutions to these two problems by tackling KP discovery from two different perspectives: (i) the transformation of KP-like artifacts to KPs formalized as OWL2 ontologies; (ii) the bottom-up extraction of KPs by analyzing how data are organized in Linked Data. The two methods address the knowledge soup and boundary problems in different ways. The first method is based on a purely syntactic transformation of the original source to RDF, followed by a refactoring step that adds semantics to the RDF by selecting meaningful RDF triples. The second method makes it possible to draw boundaries around RDF in Linked Data by analyzing type paths. A type path is a possible route through an RDF graph that takes into account the types associated with the nodes of the path. We then present K~ore, a software architecture conceived as the basis for developing KP discovery systems and designed according to two software architectural styles, i.e., Component-based and REST. Finally, we provide an example of KP reuse based on Aemoo, an exploratory search tool that exploits KPs to perform entity summarization.
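The notion of a type path described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes triples are plain `(subject, predicate, object)` tuples, it only handles paths of length one, and the `ex:` identifiers and the `type_paths` function are hypothetical names chosen for illustration.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # assumed shorthand for the rdf:type predicate


def type_paths(triples):
    """Count length-1 type paths: (subject type, property, object type)."""
    # First pass: collect the declared types of every node.
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)

    # Second pass: lift each non-typing triple to the types of its endpoints.
    counts = defaultdict(int)
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for s_type in types.get(s, ()):
            for o_type in types.get(o, ()):
                counts[(s_type, p, o_type)] += 1
    return dict(counts)


# Hypothetical toy data, not taken from any dataset named in the abstract.
triples = [
    ("ex:Rome", RDF_TYPE, "ex:City"),
    ("ex:Italy", RDF_TYPE, "ex:Country"),
    ("ex:Paris", RDF_TYPE, "ex:City"),
    ("ex:France", RDF_TYPE, "ex:Country"),
    ("ex:Rome", "ex:capitalOf", "ex:Italy"),
    ("ex:Paris", "ex:capitalOf", "ex:France"),
]
paths = type_paths(triples)
```

Counting how often each type path occurs is one plausible way to see which structures recur in the data; the boundary criteria actually used in the thesis are not stated in the abstract.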
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major areas, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitation stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic one) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another, higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
OGOLOD is a Linked Open Data dataset derived from different biomedical resources by an automated pipeline, using a tailored ontology as a scaffold. The key contribution of OGOLOD is that it links, in new RDF triples, human genetic diseases and orthologous genes, paving the way for more efficient translational biomedical research that exploits the Linked Open Data cloud.
Abstract:
Linked Data assets (RDF triples, graphs, datasets, mappings...) can be protected by intellectual property law or database law, or their access or publication may be restricted for other legal reasons (personal data protection, security reasons, etc.). Publishing a rights expression along with the digital asset allows the rightsholder to waive some or all of the IP and database rights (leaving the work in the public domain), to permit some operations if certain conditions are satisfied (like giving attribution to the author) or simply to remind the audience that some rights are reserved.
Abstract:
In 2005, the University of Maryland acquired over 70 digital videos spanning 35 years of Jim Henson’s groundbreaking work in television and film. To support in-house discovery and use, the collection was cataloged in detail using AACR2 and MARC21, and a web-based finding aid was also created. In the past year, I created an "r-ball" (a linked data set described using RDA) of these same resources. The presentation will compare and contrast these three ways of accessing the Jim Henson Works collection, with insights gleaned from providing resource discovery using RIMMF (RDA in Many Metadata Formats).
Abstract:
This paper introduces a semantic language developed for use in a semantic analyzer based on linguistic and world knowledge. Linguistic knowledge is provided by a Combinatorial Dictionary and several sets of rules. Extra-linguistic information is stored in an Ontology. The meaning of the text is represented by means of a series of RDF-type triples of the form predicate (subject, object). The semantic analyzer is one of the options of the multifunctional ETAP-3 linguistic processor. The analyzer can be used for Information Extraction and Question Answering. We describe the semantic representation of expressions that provide an assessment of the number of objects involved and/or give a quantitative evaluation of different types of attributes. We focus on the following aspects: 1) parametric and non-parametric attributes; 2) gradable and non-gradable attributes; 3) ontological representation of different classes of attributes; 4) absolute and relative quantitative assessment; 5) punctual and interval quantitative assessment; 6) intervals with precise and fuzzy boundaries.
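Abstract:
Refuse-derived fuel (RDF) has the advantages of a high heating value and easy combustion. One potential application of RDF is co-firing with coal, replacing part of the coal burned in boilers. Because RDF has a rather high volatile content, pollutant emissions during combustion are difficult to control. In this paper, co-firing tests of RDF and coal were carried out in a fluidized bed with non-uniform air distribution, and the emission characteristics of H2O, CO, CO2, NO, N2O, HCl, SO2 and other pollutants were measured. The results show that, compared with burning RDF alone, CO formation drops sharply during co-firing; the SO2 concentration is relatively low, while HCl formation increases markedly compared with burning coal alone.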
Abstract:
An internally circulating fluidized bed (ICFB) was applied to investigate the behavior of chlorine and sulfur during cofiring of RDF and coal. The pollutant emissions in the flue gas were measured by Fourier transform infrared (FTIR) spectrometry (Gasmet DX-3000). In the tests, the concentrations of the species CO, CO2, HCl, and SO2 were measured online. Results indicated that, when cofiring RDF and char, the formation of HCl increases significantly because of the higher chlorine content of RDF. The concentration of SO2 is relatively low because alkaline metal in the fuel ash can absorb SO2. The concentration of CO emitted when firing pure RDF is relatively high and fluctuates sharply. With the addition of CaO, the sulfur absorption by calcium quickly increases, and the desulfuration ratio is greater than the dechlorination ratio. The chemical equilibrium method is applied to predict the behavior of chlorine. Results show that gaseous HCl emission increases with increasing RDF fraction, and gaseous KCl and NaCl formation might occur.
Abstract:
The initial idea for this project arose from the need for a system through which the data of an application on a mobile system could be changed via a web page. However, on learning how scalable RDF can be, it ended up being a general content management system in RDF. This management system was built to be as simple as possible for the user, so that with just 2 clicks on the web page and by filling in a simple form, a user could have a database without much knowledge of database management. Although it is clearly not as powerful as, for example, an Oracle database, it makes it easy to add, modify and delete data. The system thus offers a very simple way to build your own databases. Moreover, although it uses an SQL engine for internal storage management, the data are output in RDF/XML, so it could be compatible with a broader system such as Oracle Database Semantic Technologies. This CMS will also have a security system based on user name and password, so that content editing is available only to users with access, while the exported data will be public and accessible to any user through a URI.
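Abstract:
As more and more information is represented in RDF format, distributing and filtering RDF information efficiently has become an important problem. In an information dissemination system in the Semantic Web environment, incoming RDF information must be matched against a large number of user subscriptions, and these subscriptions can be represented as RDF graph patterns. Based on the characteristics of RDF graphs, and with some added constraints, a new RDF graph pattern matching algorithm is designed. Experimental results show that the matching efficiency of this algorithm is far higher than that of traditional graph pattern matching algorithms.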
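To make the matching task concrete, the following Python sketch matches incoming RDF triples against subscriptions expressed as graph patterns. It is a naive backtracking baseline, not the algorithm proposed in the paper, whose details and constraints are not given in the abstract. Triples are assumed to be plain tuples, variables are assumed to be strings starting with `?`, and all function names and `ex:` identifiers are hypothetical.

```python
def is_var(term):
    """Pattern terms starting with '?' are treated as variables (an assumption)."""
    return term.startswith("?")


def match_pattern(pattern, triples, binding=None):
    """Yield each variable binding under which every pattern triple occurs in `triples`."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for p_term, t_term in zip(first, triple):
            if is_var(p_term):
                # Bind the variable, or check consistency with an earlier binding.
                if candidate.setdefault(p_term, t_term) != t_term:
                    ok = False
                    break
            elif p_term != t_term:
                ok = False
                break
        if ok:
            yield from match_pattern(rest, triples, candidate)


def matching_subscriptions(subscriptions, triples):
    """Return the names of subscriptions whose pattern has at least one match."""
    return [
        name
        for name, pattern in subscriptions.items()
        if next(match_pattern(pattern, triples), None) is not None
    ]


# Hypothetical incoming message and subscriptions.
message = [
    ("ex:doc1", "ex:topic", "ex:RDF"),
    ("ex:doc1", "ex:author", "ex:alice"),
]
subscriptions = {
    "alice_rdf": [("?d", "ex:topic", "ex:RDF"), ("?d", "ex:author", "ex:alice")],
    "bob_any": [("?d", "ex:author", "ex:bob")],
}
matched = matching_subscriptions(subscriptions, message)
```

This baseline evaluates every subscription independently against every message, which is the kind of cost a dedicated matching algorithm would aim to reduce when the number of subscriptions is large.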