885 resultados para Semantic Publishing, Linked Data, Bibliometrics, Informetrics, Data Retrieval, Citations
Resumo:
In this paper we study the intersection of Knowledge Organization with Information Technologies and the challenges and opportunities for Knowledge Organization experts that, in our view, are important to be studied and for them to be aware of. We start by giving some definitions necessary for providing the context for our work. Then we review the history of the Web, beginning with the Internet and continuing with the World Wide Web, the Semantic Web, problems of Artificial Intelligence, Web 2.0, and Linked Data. Finally, we conclude our paper with IT applications for Knowledge Organization in libraries, such as FRBR, BIBFRAME, and several OCLC initiatives, as well as with some of the challenges and opportunities in which Knowledge Organization experts and researchers might play a key role in relation to the Semantic Web.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
In this paper we have quantified the consistency of word usage in written texts represented by complex networks, where words were taken as nodes, by measuring the degree of preservation of the node neighborhood. Words were considered highly consistent if the authors used them with the same neighborhood. When ranked according to the consistency of use, the words obeyed a log-normal distribution, in contrast to Zipf's law that applies to the frequency of use. Consistency correlated positively with the familiarity and frequency of use, and negatively with ambiguity and age of acquisition. An inspection of some highly consistent words confirmed that they are used in very limited semantic contexts. A comparison of consistency indices for eight authors indicated that these indices may be employed for author recognition. Indeed, as expected, authors of novels could be distinguished from those who wrote scientific texts. Our analysis demonstrated the suitability of the consistency indices, which can now be applied in other tasks, such as emotion recognition.
Resumo:
La produzione ontologica è un processo fondamentale per la crescita del Web Semantico in quanto le ontologie rappresentano i vocabolari formali con cui strutturare il Web of Data. Le notazioni grafiche ontologiche costituiscono il mezzo ideale per progettare ontologie OWL sensate e ben strutturate. Tuttavia la successiva fase di generazione ontologica richiede all'utente un fastidioso cambio sia di prospettiva sia di strumentazione. Questa tesi propone dunque GraMOS, Graffoo to Manchester OWL Syntax, un motore di trasformazione da modelli Graffoo a ontologie formali in grado di fondere le due fasi di progettazione e generazione ontologica.
Resumo:
Lo scopo del progetto Bird-A è di mettere a disposizione uno strumento basato su ontologie per progettare un'interfaccia web collaborativa di creazione, visualizzazione, modifica e cancellazione di dati RDF e di fornirne una prima implementazione funzionante. La visione che sta muovendo la comunità del web semantico negli ultimi anni è quella di creare un Web basato su dati strutturati tra loro collegati, più che su documenti. Questo modello di architettura prende il nome di Linked Data ed è basata sulla possibilità di considerare cose, concetti, persone come risorse identificabili tramite URI e di poter fornire informazioni e descrivere collegamenti tra queste risorse attraverso l'uso di formati standard come RDF. Ciò che ha però frenato la diffusione di questi dati strutturati ed interconnessi sono stati gli alti requisiti di competenze tecniche necessarie sia alla loro creazione che alla loro fruizione. Il progetto Bird-A si prefigge di semplificare la creazione e la fruizione di dati RDF, favorendone la condivisione e la diffusione anche fra persone non dotate di conoscenze tecniche specifiche.
When that tune runs through your head: A PET investigation of auditory imagery for familiar melodies
Resumo:
The present study used positron emission tomography (PET) to examine the cerebral activity pattern associated with auditory imagery forfamiliar tunes. Subjects either imagined the continuation of nonverbaltunes cued by their first few notes, listened to a short sequence of notesas a control task, or listened and then reimagined that short sequence. Subtraction of the activation in the control task from that in the real-tune imagery task revealed primarily right-sided activation in frontal and superior temporal regions, plus supplementary motor area(SMA). Isolating retrieval of the real tunes by subtracting activation in the reimagine task from that in the real-tune imagery task revealedactivation primarily in right frontal areas and right superior temporal gyrus. Subtraction of activation in the control condition from that in the reimagine condition, intended to capture imagery of unfamiliarsequences, revealed activation in SMA, plus some left frontal regions. We conclude that areas of right auditory association cortex, together with right and left frontal cortices, are implicated in imagery for familiartunes, in accord with previous behavioral, lesion and PET data. Retrieval from musical semantic memory is mediated by structures in the right frontal lobe, in contrast to results from previous studies implicating left frontal areas for all semantic retrieval. The SMA seems to be involved specifically in image generation, implicating a motor code in this process.
When that tune runs through your head: a PET investigation of auditory imagery for familiar melodies
Resumo:
The present study used positron emission tomography (PET) to examine the cerebral activity pattern associated with auditory imagery for familiar tunes. Subjects either imagined the continuation of nonverbal tunes cued by their first few notes, listened to a short sequence of notes as a control task, or listened and then reimagined that short sequence. Subtraction of the activation in the control task from that in the real-tune imagery task revealed primarily right-sided activation in frontal and superior temporal regions, plus supplementary motor area (SMA). Isolating retrieval of the real tunes by subtracting activation in the reimagine task from that in the real-tune imagery task revealed activation primarily in right frontal areas and right superior temporal gyrus. Subtraction of activation in the control condition from that in the reimagine condition, intended to capture imagery of unfamiliar sequences, revealed activation in SMA, plus some left frontal regions. We conclude that areas of right auditory association cortex, together with right and left frontal cortices, are implicated in imagery for familiar tunes, in accord with previous behavioral, lesion and PET data. Retrieval from musical semantic memory is mediated by structures in the right frontal lobe, in contrast to results from previous studies implicating left frontal areas for all semantic retrieval. The SMA seems to be involved specifically in image generation, implicating a motor code in this process.
Resumo:
Enriching knowledge bases with multimedia information makes it possible to complement textual descriptions with visual and audio information. Such complementary information can help users to understand the meaning of assertions, and in general improve the user experience with the knowledge base. In this paper we address the problem of how to enrich ontology instances with candidate images retrieved from existing Web search engines. DBpedia has evolved into a major hub in the Linked Data cloud, interconnecting millions of entities organized under a consistent ontology. Our approach taps into the Wikipedia corpus to gather context information for DBpedia instances and takes advantage of image tagging information when this is available to calculate semantic relatedness between instances and candidate images. We performed experiments with focus on the particularly challenging problem of highly ambiguous names. Both methods presented in this work outperformed the baseline. Our best method leveraged context words from Wikipedia, tags from Flickr and type information from DBpedia to achieve an average precision of 80%.
Resumo:
The goal of the W3C's Media Annotation Working Group (MAWG) is to promote interoperability between multimedia metadata formats on the Web. As experienced by everybody, audiovisual data is omnipresent on today's Web. However, different interaction interfaces and especially diverse metadata formats prevent unified search, access, and navigation. MAWG has addressed this issue by developing an interlingua ontology and an associated API. This article discusses the rationale and core concepts of the ontology and API for media resources. The specifications developed by MAWG enable interoperable contextualized and semantic annotation and search, independent of the source metadata format, and connecting multimedia data to the Linked Data cloud. Some demonstrators of such applications are also presented in this article.
Resumo:
Twitter lists organise Twitter users into multiple, often overlapping, sets. We believe that these lists capture some form of emergent semantics, which may be useful to characterise. In this paper we describe an approach for such characterisation, which consists of deriving semantic relations between lists and users by analyzing the cooccurrence of keywords in list names. We use the vector space model and Latent Dirichlet Allocation to obtain similar keywords according to co-occurrence patterns. These results are then compared to similarity measures relying on WordNet and to existing Linked Data sets. Results show that co-occurrence of keywords based on members of the lists produce more synonyms and more correlated results to that of WordNet similarity measures.
Resumo:
Nowadays, there is a significant quantity of linguistic data available on the Web. However, linguistic resources are often published using proprietary formats and, as such, it can be difficult to interface with one another and they end up confined in “data silos”. The creation of web standards for the publishing of data on the Web and projects to create Linked Data have lead to interest in the creation of resources that can be published using Web principles. One of the most important aspects of “Lexical Linked Data” is the sharing of lexica and machine readable dictionaries. It is for this reason, that the lemon format has been proposed, which we briefly describe. We then consider two resources that seem ideal candidates for the Linked Data cloud, namely WordNet 3.0 and Wiktionary, a large document based dictionary. We discuss the challenges of converting both resources to lemon , and in particular for Wiktionary, the challenge of processing the mark-up, and handling inconsistencies and underspecification in the source material. Finally, we turn to the task of creating links between the two resources and present a novel algorithm for linking lexica as lexical Linked Data.
Resumo:
Estamos viviendo la era de la Internetificación. A día de hoy, las conexiones a Internet se asumen presentes en nuestro entorno como una necesidad más. La Web, se ha convertido en un lugar de generación de contenido por los usuarios. Una información generada, que sobrepasa la idea con la que surgió esta, ya que en la mayoría de casos, su contenido no se ha diseñado más que para ser consumido por humanos, y no por máquinas. Esto supone un cambio de mentalidad en la forma en que diseñamos sistemas capaces de soportar una carga computacional y de almacenamiento que crece sin un fin aparente. Al mismo tiempo, vivimos un momento de crisis de la educación superior: los altos costes de una educación de calidad suponen una amenaza para el mundo académico. Mediante el uso de la tecnología, se puede lograr un incremento de la productividad, y una reducción en dichos costes en un campo, en el que apenas se ha avanzado desde el Renacimiento. En CloudRoom se ha diseñado una plataforma MOOC con una arquitectura ajustada a las últimas convenciones en Cloud Computing, que implica el uso de Servicios REST, bases de datos NoSQL, y que hace uso de las últimas recomendaciones del W3C en materia de desarrollo web y Linked Data. Para su construcción, se ha hecho uso de métodos ágiles de Ingeniería del Software, técnicas de Interacción Persona-Ordenador, y tecnologías de última generación como Neo4j, Redis, Node.js, AngularJS, Bootstrap, HTML5, CSS3 o Amazon Web Services. Se ha realizado un trabajo integral de Ingeniería Informática, combinando prácticamente la totalidad de aquellas áreas de conocimiento fundamentales en Informática. En definitiva se han ideado las bases de un sistema distribuido robusto, mantenible, con características sociales y semánticas, que puede ser ejecutado en múltiples dispositivos, y que es capaz de responder ante millones de usuarios. We are living through an age of Internetification. Nowadays, Internet connections are a utility whose presence one can simply assume. The web has become a place of generation of content by users. The information generated surpasses the notion with which the World Wide Web emerged because, in most cases, this content has been designed to be consumed by humans and not by machines. This fact implies a change of mindset in the way that we design systems; these systems should be able to support a computational and storage capacity that apparently grows endlessly. At the same time, our education system is in a state of crisis: the high costs of high-quality education threaten the academic world. With the use of technology, we could achieve an increase of productivity and quality, and a reduction of these costs in this field, which has remained largely unchanged since the Renaissance. In CloudRoom, a MOOC platform has been designed with an architecture that satisfies the last conventions on Cloud Computing; which involves the use of REST services, NoSQL databases, and uses the last recommendations from W3C in terms of web development and Linked Data. For its building process, agile methods of Software Engineering, Human-Computer Interaction techniques, and state of the art technologies such as Neo4j, Redis, Node.js, AngularJS, Bootstrap, HTML5, CSS3 or Amazon Web Services have been used. Furthermore, a comprehensive Informatics Engineering work has been performed, by combining virtually all of the areas of knowledge in Computer Science. Summarizing, the pillars of a robust, maintainable, and distributed system have been devised; a system with social and semantic capabilities, which runs in multiple devices, and scales to millions of users.
Resumo:
Many progresses have been made since the Digital Earth notion was envisioned thirteen years ago. However, the mechanism for integrating geographic information into the Digital Earth is still quite limited. In this context, we have developed a process to generate, integrate and publish geospatial Linked Data from several Spanish National data-sets. These data-sets are related to four Infrastructure for Spatial Information in the European Community (INSPIRE) themes, specifically with Administrative units, Hydrography, Statistical units, and Meteorology. Our main goal is to combine different sources (heterogeneous, multidisciplinary, multitemporal, multiresolution, and multilingual) using Linked Data principles. This goal allows the overcoming of current problems of information integration and driving geographical information toward the next decade scenario, that is, ?Linked Digital Earth.?
Resumo:
One of the challenges facing the current web is the efficient use of all the available information. The Web 2.0 phenomenon has favored the creation of contents by average users, and thus the amount of information that can be found for diverse topics has grown exponentially in the last years. Initiatives such as linked data are helping to build the Semantic Web, in which a set of standards are proposed for the exchange of data among heterogeneous systems. However, these standards are sometimes not used, and there are still plenty of websites that require naive techniques to discover their contents and services. This paper proposes an integrated framework for content and service discovery and extraction. The framework is divided into several layers where the discovery of contents and services is made in a representational stateless transfer system such as the web. It employs several web mining techniques as well as feature-oriented modeling for the discovery of cross-cutting features in web resources. The framework is used in a scenario of electronic newspapers. An intelligent agent crawls the web for related news, and uses services and visits links automatically according to its goal. This scenario illustrates how the discovery is made at different levels and how the use of semantics helps implement an agent that performs high-level tasks.
Resumo:
Background: Semantic Web technologies have been widely applied in the life sciences, for example by data providers such as OpenLifeData and through web services frameworks such as SADI. The recently reported OpenLifeData2SADI project offers access to the vast OpenLifeData data store through SADI services. Findings: This article describes how to merge data retrieved from OpenLifeData2SADI with other SADI services using the Galaxy bioinformatics analysis platform, thus making this semantic data more amenable to complex analyses. This is demonstrated using a working example, which is made distributable and reproducible through a Docker image that includes SADI tools, along with the data and workflows that constitute the demonstration. Conclusions: The combination of Galaxy and Docker offers a solution for faithfully reproducing and sharing complex data retrieval and analysis workflows based on the SADI Semantic web service design patterns.