8 results for Electronic information resources
at Universidad Politécnica de Madrid
Abstract:
Cultural heritage is a complex and diverse concept that brings together a wide domain of information. Resources linked to a cultural heritage site may consist of physical artefacts, books, works of art, pictures, historical maps, aerial photographs, archaeological surveys and 3D models. Moreover, all these resources are listed and described using a variety of metadata specifications that allow users to search for them online and consult their most basic characteristics. Examples include ISO 19115, Dublin Core, AAT, CDWA, CCO, DACS, MARC, MoReq, MODS, MuseumDat, TGN, SPECTRUM, VRA Core and Z39.50. Gateways are in place to map these metadata standards onto those used in an SDI (ISO 19115 or INSPIRE), but substantial work remains to be done to fully incorporate cultural heritage information. The aim of this paper is therefore to demonstrate how the complexity of cultural heritage resources can be handled through visual exploration of their metadata within a 3D collaborative environment. 3D collaborative environments are promising tools that represent the new frontier of our capacity for learning, understanding, communicating and transmitting culture.
Abstract:
Web 2.0 applications enabled users to classify information resources using their own vocabularies. The bottom-up nature of these user-generated classification systems has turned them into interesting knowledge sources, since they provide a rich terminology generated by potentially large user communities. Previous research has shown that it is possible to elicit some emergent semantics from the aggregation of individual classifications in these systems. However, generating ontologies from them is still an open research problem. In this thesis we address the problem of how to tap into user-generated classification systems for building domain ontologies. Our objective is to design a method to develop domain ontologies from user-generated classification systems. To do so, we reuse existing conceptualizations from ontologies published in the Web of Data to formalize the semantics of the knowledge collected from the classification system. Current ontology development methodologies have recognized the importance of reusing knowledge from existing resources. Thus, our work is framed within the NeOn Methodology scenario for building ontologies by reusing and reengineering non-ontological resources. The main contributions of this work are: (1) an integrated method to develop domain ontologies from user-generated classification systems, in which we extract a domain terminology from the classification system and then formalize the semantics of this terminology by reusing ontologies in the Web of Data; (2) the identification and adaptation of a set of existing techniques for implementing the activities in the method so that they can automatically fulfill the requirements of each activity; and (3) a novel study of emergent semantics in user-generated lists on the Web.
Abstract:
The integration of heterogeneous information resources has been addressed in different ways, and for different types of sources, over the decades. One approach is to establish semantic relations that allow information from related resources to be linked. These links, which are key to the integration, are usually known as mappings. Mappings have been widely used in many applications, and different solutions for their discovery, storage, exploitation, etc. have been presented, in many cases in a practical rather than theoretical way. However, although mappings have been widely applied by many researchers, there is no generally accepted definition that covers all the aspects related to them. Moreover, for the mapping discovery process there is no theoretical framework that methodically defines the processes to be followed and their characteristics. Similarly, the current way of evaluating mapping discovery is insufficient for the full range of existing use cases. The contributions of this work are threefold: first, a general definition of "mapping" that covers all current systems; second, a detailed specification of the discovery process; and third, an analysis of, and a proposal for, a process to evaluate that discovery. The validity of these contributions has been checked by formulating hypotheses and verifying them in a quantitative study on a use case with heterogeneous geospatial resources.
Abstract:
Speech technologies can provide important benefits for the development of more usable and safer in-vehicle human-machine interactive systems (HMIs). However, mainly due to robustness issues, the use of spoken interaction can significantly distract the driver. In this challenging scenario, while speech technologies are evolving, further research is necessary to explore how they can be complemented both with other modalities (multimodality) and with information from the increasing number of available sensors (context-awareness). The perceived quality of speech technologies can be significantly increased by implementing such policies, which simply try to make the best use of all the available resources, and the in-vehicle scenario is an excellent test-bed for this kind of initiative. In this contribution we propose an event-based HMI design framework which combines context modelling and multimodal interaction using a W3C XML language known as SCXML. SCXML provides a general process control mechanism that is being considered by W3C to improve both voice interaction (VoiceXML) and multimodal interaction (MMI). In our approach we try to anticipate and extend these initiatives by presenting a flexible SCXML-based approach for the design of a wide range of multimodal, context-aware in-vehicle HMIs. The proposed framework for HMI design and specification has been implemented in an automotive OSGi service platform, and it is being used and tested in the Spanish research project MARTA for the development of several in-vehicle interactive applications.
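The event-based, context-aware control that this abstract describes can be illustrated with a minimal sketch. It is not SCXML and not the authors' framework: it is a plain Python state machine that imitates the SCXML idea of transitions triggered by events and filtered by guard conditions over a context. The state names, the `incoming_call` event, the `speed_kmh` context key and the 20 km/h threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Context = Dict[str, object]


@dataclass
class Transition:
    event: str                      # event name that triggers the transition
    target: str                     # state entered when the transition fires
    guard: Callable[[Context], bool] = lambda ctx: True  # context condition


@dataclass
class StateMachine:
    state: str
    transitions: Dict[str, List[Transition]]
    context: Context = field(default_factory=dict)

    def send(self, event: str) -> str:
        """Fire the first transition of the current state that matches the
        event and whose guard holds; unknown events leave the state as is."""
        for t in self.transitions.get(self.state, []):
            if t.event == event and t.guard(self.context):
                self.state = t.target
                break
        return self.state


def build_hmi(speed_kmh: float) -> StateMachine:
    """Hypothetical in-vehicle HMI: while the car is moving, an incoming
    call is announced by voice only; otherwise it is shown on the screen."""
    return StateMachine(
        state="idle",
        transitions={
            "idle": [
                Transition("incoming_call", "voice_prompt",
                           lambda ctx: ctx["speed_kmh"] > 20),
                Transition("incoming_call", "screen_prompt"),
            ],
            "voice_prompt": [Transition("done", "idle")],
            "screen_prompt": [Transition("done", "idle")],
        },
        context={"speed_kmh": speed_kmh},
    )
```

Calling `build_hmi(90).send("incoming_call")` selects the voice-only state, while the same event at `speed_kmh=0` falls through to the screen prompt. Sensor updates are modelled simply as writes to `context`.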
Abstract:
Access to medical literature collections such as PubMed, MedScape or Cochrane has increased notably in recent years thanks to web-based tools that provide instant access to the information. However, more sophisticated methodologies are needed to exploit all that information efficiently. Because advanced search methods are lacking in the clinical domain, clinicians receive too many results even when they use well-defined questions about a particular disease. Since no information analysis is applied afterwards, relevant results that do not appear at the top of the result list may be overlooked by the expert, causing a significant loss of information. In this work we present a new method to improve scientific article search using patient information for query generation. Using a federated search strategy, it can search different resources simultaneously and present a single collection of relevant literature. By applying NLP techniques, it also presents semantically similar publications together, making it easier for clinicians to identify relevant information. This method aims to be the foundation of a collaborative environment for sharing clinical knowledge related to patients and scientific publications.
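The two steps named in this abstract, federated search and grouping of similar publications, can be sketched as follows. This is only an illustration under stated assumptions: each source is modelled as a hypothetical callable that takes a query and returns a list of titles, and token-overlap (Jaccard) similarity stands in for the unspecified NLP techniques of the original work.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Set

Source = Callable[[str], List[str]]  # hypothetical: query -> list of titles


def tokens(text: str) -> Set[str]:
    """Lower-cased alphanumeric word set of a title."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity in [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def federated_search(query: str, sources: List[Source]) -> List[str]:
    """Query all sources concurrently and merge the results into a single
    list, dropping titles that differ only in case or outer whitespace."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda source: source(query), sources))
    merged, seen = [], set()
    for results in result_lists:
        for title in results:
            key = title.lower().strip()
            if key not in seen:
                seen.add(key)
                merged.append(title)
    return merged


def group_similar(titles: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedy grouping: a title joins the first group whose first member
    is at least `threshold` similar to it, else it starts a new group."""
    groups: List[List[str]] = []
    for title in titles:
        for group in groups:
            if jaccard(tokens(title), tokens(group[0])) >= threshold:
                group.append(title)
                break
        else:
            groups.append([title])
    return groups
```

In a real system each source would wrap a remote search service and the similarity measure would be semantic rather than lexical. The sketch only shows the merge-then-group shape of the pipeline.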
Abstract:
One of the challenges facing the current web is the efficient use of all the available information. The Web 2.0 phenomenon has favored the creation of content by ordinary users, and thus the amount of information that can be found on diverse topics has grown exponentially in recent years. Initiatives such as linked data are helping to build the Semantic Web, in which a set of standards is proposed for the exchange of data among heterogeneous systems. However, these standards are sometimes not used, and there are still plenty of websites that require naive techniques to discover their contents and services. This paper proposes an integrated framework for content and service discovery and extraction. The framework is divided into several layers in which the discovery of contents and services is performed in a representational state transfer (REST) system such as the web. It employs several web mining techniques as well as feature-oriented modeling for the discovery of cross-cutting features in web resources. The framework is applied in an electronic newspaper scenario: an intelligent agent crawls the web for related news, and uses services and visits links automatically according to its goal. This scenario illustrates how discovery is performed at different levels and how the use of semantics helps implement an agent that performs high-level tasks.
Abstract:
The availability of electronic health data favors scientific progress through the creation of repositories for secondary use. Data anonymization is a mandatory step to comply with current legislation. This work presents a service for the pseudonymization of electronic healthcare record (EHR) extracts, aimed at facilitating the exchange of clinical information for secondary use in compliance with data protection legislation. According to ISO/TS 25237, pseudonymization is a particular type of anonymization. The tool performs the anonymization while retaining three quasi-identifiers (gender, date of birth and place of residence) at a level of detail selected by the user. The system is based on the ISO/EN 13606 norm and takes advantage of those of its characteristics that favor anonymization. The service is made up of two independent modules: the demographic server and the pseudonymizing module. The demographic server supports the permanent storage of the demographic entities and the management of the identifiers. The pseudonymizing module anonymizes the ISO/EN 13606 extracts. The pseudonymizing process consists of four phases: the storage of the demographic information included in the extract, the substitution of the identifiers, the elimination of the demographic information from the extract, and the elimination of key data in free-text fields. The pseudonymizing system was used in three telemedicine research projects with satisfactory results. A problem was detected with the data type of one demographic data field, and a proposal for modification was prepared for the group in charge of drafting and revising the ISO/EN 13606 norm.
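The four phases listed in this abstract can be sketched in a few lines. This is a simplified illustration, not the described service: the extract is a plain dictionary rather than an ISO/EN 13606 extract, the demographic server is an in-memory dictionary, and the field names, the keyed-hash pseudonym and the `SECRET_KEY` value are all assumptions made for the example.

```python
import hashlib
import hmac
import re
from datetime import date

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key for the sketch


def generalize_birth_date(birth: date, level: str) -> str:
    """Keep the date of birth at a user-selected level of detail."""
    if level == "year":
        return str(birth.year)
    if level == "month":
        return f"{birth.year:04d}-{birth.month:02d}"
    return birth.isoformat()


def pseudonym(identifier: str) -> str:
    """Derive a stable pseudonym from an identifier with a keyed hash."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def pseudonymize(extract: dict, demographic_store: dict,
                 dob_level: str = "year") -> dict:
    demo = extract["demographics"]
    # Phase 1: store the demographic information found in the extract.
    demographic_store[extract["patient_id"]] = dict(demo)
    # Phase 2: substitute the identifier.
    new_id = pseudonym(extract["patient_id"])
    # Phase 3: drop the demographics, retaining only the quasi-identifiers.
    kept = {
        "gender": demo["gender"],
        "birth": generalize_birth_date(demo["birth_date"], dob_level),
        "residence": demo["residence"],
    }
    # Phase 4: remove key data (here, only the patient name) from free text.
    notes = re.sub(re.escape(demo["name"]), "[REDACTED]",
                   extract["notes"], flags=re.IGNORECASE)
    return {"patient_id": new_id, "demographics": kept, "notes": notes}
```

Because the pseudonym is derived deterministically from the identifier, extracts of the same patient remain linkable after processing, while re-identification requires access to the demographic store or the key.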
Abstract:
Language resources, such as multilingual lexica and multilingual electronic dictionaries, contain collections of lexical entries in several languages. Having access to the corresponding explicit or implicit translation relations between such entries might be of great interest for many NLP-based applications. By using Semantic Web-based techniques, translations can be made available on the Web to be consumed directly by other (semantically enabled) resources, without relying on application-specific formats. To that end, in this paper we propose a model for representing translations as linked data, as an extension of the lemon model. Our translation module represents some core information associated with term translations and does not commit to specific views or translation theories. As a proof of concept, we have extracted the translations of the terms contained in Terminesp, a multilingual terminological database, and represented them as linked data. We have made them accessible on the Web both for humans (via a Web interface) and for software agents (via a SPARQL endpoint).