967 resultados para Natural language techniques, Semantic spaces, Random projection, Documents
Resumo:
Researchers suggest that personalization on the Semantic Web adds up to a Web 3.0 eventually. In this Web, personalized agents process and thus generate the biggest share of information rather than humans. In the sense of emergent semantics, which supplements traditional formal semantics of the Semantic Web, this is well conceivable. An emergent Semantic Web underlying fuzzy grassroots ontology can be accomplished through inducing knowledge from users' common parlance in mutual Web 2.0 interactions [1]. These ontologies can also be matched against existing Semantic Web ontologies, to create comprehensive top-level ontologies. On the Web, if augmented with information in the form of restrictions andassociated reliability (Z-numbers) [2], this collection of fuzzy ontologies constitutes an important basis for an implementation of Zadeh's restriction-centered theory of reasoning and computation (RRC) [3]. By considering real world's fuzziness, RRC differs from traditional approaches because it can handle restrictions described in natural language. A restriction is an answer to a question of the value of a variable such as the duration of an appointment. In addition to mathematically well-defined answers, RRC can likewise deal with unprecisiated answers as "about one hour." Inspired by mental functions, it constitutes an important basis to leverage present-day Web efforts to a natural Web 3.0. Based on natural language information, RRC may be accomplished with Z-number calculation to achieve a personalized Web reasoning and computation. Finally, through Web agents' understanding of natural language, they can react to humans more intuitively and thus generate and process information.
Resumo:
Traditionally, ontologies describe knowledge representation in a denotational, formalized, and deductive way. In addition, in this paper, we propose a semiotic, inductive, and approximate approach to ontology creation. We define a conceptual framework, a semantics extraction algorithm, and a first proof of concept applying the algorithm to a small set of Wikipedia documents. Intended as an extension to the prevailing top-down ontologies, we introduce an inductive fuzzy grassroots ontology, which organizes itself organically from existing natural language Web content. Using inductive and approximate reasoning to reflect the natural way in which knowledge is processed, the ontology’s bottom-up build process creates emergent semantics learned from the Web. By this means, the ontology acts as a hub for computing with words described in natural language. For Web users, the structural semantics are visualized as inductive fuzzy cognitive maps, allowing an initial form of intelligence amplification. Eventually, we present an implementation of our inductive fuzzy grassroots ontology Thus,this paper contributes an algorithm for the extraction of fuzzy grassroots ontologies from Web data by inductive fuzzy classification.
Resumo:
Online reputation management deals with monitoring and influencing the online record of a person, an organization or a product. The Social Web offers increasingly simple ways to publish and disseminate personal or opinionated information, which can rapidly have a disastrous influence on the online reputation of some of the entities. This dissertation can be split into three parts: In the first part, possible fuzzy clustering applications for the Social Semantic Web are investigated. The second part explores promising Social Semantic Web elements for organizational applications,while in the third part the former two parts are brought together and a fuzzy online reputation analysis framework is introduced and evaluated. Theentire PhD thesis is based on literature reviews as well as on argumentative-deductive analyses.The possible applications of Social Semantic Web elements within organizations have been researched using a scenario and an additional case study together with two ancillary case studies—based on qualitative interviews. For the conception and implementation of the online reputation analysis application, a conceptual framework was developed. Employing test installations and prototyping, the essential parts of the framework have been implemented.By following a design sciences research approach, this PhD has created two artifacts: a frameworkand a prototype as proof of concept. Bothartifactshinge on twocoreelements: a (cluster analysis-based) translation of tags used in the Social Web to a computer-understandable fuzzy grassroots ontology for the Semantic Web, and a (Topic Maps-based) knowledge representation system, which facilitates a natural interaction with the fuzzy grassroots ontology. This is beneficial to the identification of unknown but essential Web data that could not be realized through conventional online reputation analysis. Theinherent structure of natural language supports humans not only in communication but also in the perception of the world. Fuzziness is a promising tool for transforming those human perceptions intocomputer artifacts. Through fuzzy grassroots ontologies, the Social Semantic Web becomes more naturally and thus can streamline online reputation management.
Resumo:
In the beginning of the 90s, ontology development was similar to an art: ontology developers did not have clear guidelines on how to build ontologies but only some design criteria to be followed. Work on principles, methods and methodologies, together with supporting technologies and languages, made ontology development become an engineering discipline, the so-called Ontology Engineering. Ontology Engineering refers to the set of activities that concern the ontology development process and the ontology life cycle, the methods and methodologies for building ontologies, and the tool suites and languages that support them. Thanks to the work done in the Ontology Engineering field, the development of ontologies within and between teams has increased and improved, as well as the possibility of reusing ontologies in other developments and in final applications. Currently, ontologies are widely used in (a) Knowledge Engineering, Artificial Intelligence and Computer Science, (b) applications related to knowledge management, natural language processing, e-commerce, intelligent information integration, information retrieval, database design and integration, bio-informatics, education, and (c) the Semantic Web, the Semantic Grid, and the Linked Data initiative. In this paper, we provide an overview of Ontology Engineering, mentioning the most outstanding and used methodologies, languages, and tools for building ontologies. In addition, we include some words on how all these elements can be used in the Linked Data initiative.
Resumo:
In the context of the Semantic Web, natural language descriptions associated with ontologies have proven to be of major importance not only to support ontology developers and adopters, but also to assist in tasks such as ontology mapping, information extraction, or natural language generation. In the state-of-the-art we find some attempts to provide guidelines for URI local names in English, and also some disagreement on the use of URIs for describing ontology elements. When trying to extrapolate these ideas to a multilingual scenario, some of these approaches fail to provide a valid solution. On the basis of some real experiences in the translation of ontologies from English into Spanish, we provide a preliminary set of guidelines for naming and labeling ontologies in a multilingual scenario.
Resumo:
This paper describes the design, development and field evaluation of a machine translation system from Spanish to Spanish Sign Language (LSE: Lengua de Signos Española). The developed system focuses on helping Deaf people when they want to renew their Driver’s License. The system is made up of a speech recognizer (for decoding the spoken utterance into a word sequence), a natural language translator (for converting a word sequence into a sequence of signs belonging to the sign language), and a 3D avatar animation module (for playing back the signs). For the natural language translator, three technological approaches have been implemented and evaluated: an example-based strategy, a rule-based translation method and a statistical translator. For the final version, the implemented language translator combines all the alternatives into a hierarchical structure. This paper includes a detailed description of the field evaluation. This evaluation was carried out in the Local Traffic Office in Toledo involving real government employees and Deaf people. The evaluation includes objective measurements from the system and subjective information from questionnaires. The paper details the main problems found and a discussion on how to solve them (some of them specific for LSE).
Resumo:
In the context of the Semantic Web, resources on the net can be enriched by well-defined, machine-understandable metadata describing their associated conceptual meaning. These metadata consisting of natural language descriptions of concepts are the focus of the activity we describe in this chapter, namely, ontology localization. In the framework of the NeOn Methodology, ontology localization is defined as the activity of adapting an ontology to a particular language and culture. This adaptation mainly involves the translation of the natural language descriptions of the ontology from a source natural language to a target natural language, with the final objective of obtaining a multilingual ontology, that is, an ontology documented in several natural languages. The purpose of this chapter is to provide detailed and prescriptive methodological guidelines to support the performance of this activity.
Resumo:
Effective data summarization methods that use AI techniques can help humans understand large sets of data. In this paper, we describe a knowledge-based method for automatically generating summaries of geospatial and temporal data, i.e. data with geographical and temporal references. The method is useful for summarizing data streams, such as GPS traces and traffic information, that are becoming more prevalent with the increasing use of sensors in computing devices. The method presented here is an initial architecture for our ongoing research in this domain. In this paper we describe the data representations we have designed for our method, our implementations of components to perform data abstraction and natural language generation. We also discuss evaluation results that show the ability of our method to generate certain types of geospatial and temporal descriptions.
Resumo:
Vivimos en una época en la que cada vez existe una mayor cantidad de información. En el dominio de la salud la historia clínica digital ha permitido digitalizar toda la información de los pacientes. Estas historias clínicas digitales contienen una gran cantidad de información valiosa escrita en forma narrativa que sólo podremos extraer recurriendo a técnicas de procesado de lenguaje natural. No obstante, si se quiere realizar búsquedas sobre estos textos es importante analizar que la información relativa a síntomas, enfermedades, tratamientos etc. se puede refererir al propio paciente o a sus antecentes familiares, y que ciertos términos pueden aparecer negados o ser hipotéticos. A pesar de que el español ocupa la segunda posición en el listado de idiomas más hablados con más de 500 millones de hispano hablantes, hasta donde tenemos de detección de la negación, probabilidad e histórico en textos clínicos en español. Por tanto, este Trabajo Fin de Grado presenta una implementación basada en el algoritmo ConText para la detección de la negación, probabilidad e histórico en textos clínicos escritos en español. El algoritmo se ha validado con 454 oraciones que incluían un total de 1897 disparadores obteniendo unos resultado de 83.5 %, 96.1 %, 96.9 %, 99.7% y 93.4% de exactitud con condiciones afirmados, negados, probable, probable negado e histórico respectivamente. ---ABSTRACT---We live in an era in which there is a huge amount of information. In the domain of health, the electronic health record has allowed to digitize all the information of the patients. These electronic health records contain valuable information written in narrative form that can only be extracted using techniques of natural language processing. However, if you want to search on these texts is important to analyze if the relative information about symptoms, diseases, treatments, etc. are referred to the patient or family casework, and that certain terms may appear negated or be hypothesis. Although Spanish is the second spoken language with more than 500 million speakers, there seems to be no method of detection of negation, hypothesis or historical in medical texts written in Spanish. Thus, this bachelor’s final degree presents an implementation based on the ConText algorithm for the detection of negation, hypothesis and historical in medical texts written in Spanish. The algorithm has been validated with 454 sentences that included a total of 1897 triggers getting a result of 83.5 %, 96.1 %, 96.9 %, 99.7% and 93.4% accuracy with affirmed, negated, hypothesis, negated hypothesis and historical respectively.
Resumo:
El presente trabajo desarrolla un servicio REST que transforma frases en lenguaje natural a grafos RDF. Los grafos generados son grafos dirigidos, donde los nodos se forman con los sustantivos o adjetivos de las frases, y los arcos se forman con los verbos. Se utiliza dentro del proyecto p-medicine para dar soporte a las siguientes funcionalidades: Búsquedas en lenguaje natural: actualmente la plataforma p-medicine proporciona un interfaz programático para realizar consultas en SPARQL. El servicio desarrollado permitiría generar esas consultas automáticamente a partir de frases en lenguaje natural. Anotaciones de bases de datos mediante lenguaje natural: la plataforma pmedicine incorpora una herramienta, desarrollada por el Grupo de Ingeniería Biomédica de la Universidad Politécnica de Madrid, para la anotación de bases de datos RDF. Estas anotaciones son necesarias para la posterior traducción de las bases de datos a un esquema central. El proceso de anotación requiere que el usuario construya de forma manual las vistas RDF que desea anotar, lo que requiere mostrar gráficamente el esquema RDF y que el usuario construya vistas RDF seleccionando las clases y relaciones necesarias. Este proceso es a menudo complejo y demasiado difícil para un usuario sin perfil técnico. El sistema se incorporará para permitir que la construcción de estas vistas se realice con lenguaje natural. ---ABSTRACT---The present work develops a REST service that transforms natural language sentences to RDF degrees. Generated graphs are directed graphs where nodes are formed with nouns or adjectives of phrases, and the arcs are formed with verbs. Used within the p-medicine project to support the following functionality: Natural language queries: currently the p-medicine platform provides a programmatic interface to query SPARQL. The developed service would automatically generate those queries from natural language sentences. Memos databases using natural language: the p-medicine platform incorporates a tool, developed by the Group of Biomedical Engineering at the Polytechnic University of Madrid, for the annotation of RDF data bases. Such annotations are necessary for the subsequent translation of databases to a central scheme. The annotation process requires the user to manually construct the RDF views that he wants annotate, requiring graphically display the RDF schema and the user to build RDF views by selecting classes and relationships. This process is often complex and too difficult for a user with no technical background. The system is incorporated to allow the construction of these views to be performed with natural language.
Resumo:
This paper describes the application of language translation technologies for generating bus information in Spanish Sign Language (LSE: Lengua de Signos Española). In this work, two main systems have been developed: the first for translating text messages from information panels and the second for translating spoken Spanish into natural conversations at the information point of the bus company. Both systems are made up of a natural language translator (for converting a word sentence into a sequence of LSE signs), and a 3D avatar animation module (for playing back the signs). For the natural language translator, two technological approaches have been analyzed and integrated: an example-based strategy and a statistical translator. When translating spoken utterances, it is also necessary to incorporate a speech recognizer for decoding the spoken utterance into a word sequence, prior to the language translation module. This paper includes a detailed description of the field evaluation carried out in this domain. This evaluation has been carried out at the customer information office in Madrid involving both real bus company employees and deaf people. The evaluation includes objective measurements from the system and information from questionnaires. In the field evaluation, the whole translation presents an SER (Sign Error Rate) of less than 10% and a BLEU greater than 90%.
Resumo:
An important part of human intelligence, both historically and operationally, is our ability to communicate. We learn how to communicate, and maintain our communicative skills, in a society of communicators – a highly effective way to reach and maintain proficiency in this complex skill. Principles that might allow artificial agents to learn language this way are in completely known at present – the multi-dimensional nature of socio-communicative skills are beyond every machine learning framework so far proposed. Our work begins to address the challenge of proposing a way for observation-based machine learning of natural language and communication. Our framework can learn complex communicative skills with minimal up-front knowledge. The system learns by incrementally producing predictive models of causal relationships in observed data, guided by goal-inference and reasoning using forward-inverse models. We present results from two experiments where our S1 agent learns human communication by observing two humans interacting in a realtime TV-style interview, using multimodal communicative gesture and situated language to talk about recycling of various materials and objects. S1 can learn multimodal complex language and multimodal communicative acts, a vocabulary of 100 words forming natural sentences with relatively complex sentence structure, including manual deictic reference and anaphora. S1 is seeded only with high-level information about goals of the interviewer and interviewee, and a small ontology; no grammar or other information is provided to S1 a priori. The agent learns the pragmatics, semantics, and syntax of complex utterances spoken and gestures from scratch, by observing the humans compare and contrast the cost and pollution related to recycling aluminum cans, glass bottles, newspaper, plastic, and wood. After 20 hours of observation S1 can perform an unscripted TV interview with a human, in the same style, without making mistakes.
Resumo:
This document will be divided into two main parts. The first one will be the classification of the authentication techniques. We will search the main electronic databases for papers related to authentication techniques. We will then summarize the related papers and show what classifications they use for the authentication techniques. After all of the documents have been read and summarized we will analyse them and group the authentication techniques into the classifications found. For the second part of the document we will focus on the study of usability attributes in the authentication techniques. This to know how authentications techniques compare to one another based on their usability attributes. We will search the main electronic databases for papers related to the usability attributes of authentication techniques based on the usability definition of ISO/IEC 25010 (SQuaRE) and its attributes. We will then summarize the related papers and show what authentication methods they describe and which usability attributes they measure. After all of the documents have been read and summarized we will analyse them depending on their usability attribute. At the end we will elaborate those results to show which authentication techniques have better usability in terms of a specific usability attribute. This will help practitioners who are interested in using authentication methods but want or need to focus on a specific usability attribute. They will be able to use this as a guide to help them chose the best option that fits their purpose.
Resumo:
The mobile apps market is a tremendous success, with millions of apps downloaded and used every day by users spread all around the world. For apps’ developers, having their apps published on one of the major app stores (e.g. Google Play market) is just the beginning of the apps lifecycle. Indeed, in order to successfully compete with the other apps in the market, an app has to be updated frequently by adding new attractive features and by fixing existing bugs. Clearly, any developer interested in increasing the success of her app should try to implement features desired by the app’s users and to fix bugs affecting the user experience of many of them. A precious source of information to decide how to collect users’ opinions and wishes is represented by the reviews left by users on the store from which they downloaded the app. However, to exploit such information the app’s developer should manually read each user review and verify if it contains useful information (e.g. suggestions for new features). This is something not doable if the app receives hundreds of reviews per day, as happens for the very popular apps on the market. In this work, our aim is to provide support to mobile apps developers by proposing a novel approach exploiting data mining, natural language processing, machine learning, and clustering techniques in order to classify the user reviews on the basis of the information they contain (e.g. useless, suggestion for new features, bugs reporting). Such an approach has been empirically evaluated and made available in a web-‐based tool publicly available to all apps’ developers. The achieved results showed that the developed tool: (i) is able to correctly categorise user reviews on the basis of their content (e.g. isolating those reporting bugs) with 78% of accuracy, (ii) produces clusters of reviews (e.g. groups together reviews indicating exactly the same bug to be fixed) that are meaningful from a developer’s point-‐of-‐view, and (iii) is considered useful by a software company working in the mobile apps’ development market.
Resumo:
Tradicionalmente, los entornos virtuales se han relacionado o vinculado de forma muy estrecha con campos como el diseño de escenarios tridimensionales o los videojuegos; dejando poco margen a poder pensar en sus aplicaciones en otros ámbitos. Sin embargo, estas tendencias pueden cambiar en tanto se demuestre que las aplicaciones y ventajas de estas facilidades software, se pueden extrapolar a su uso en el ámbito de la enseñanza y el aprendizaje. Estas aplicaciones son los conocidos como Entornos Virtuales Inteligentes (EVI); los cuales, tratan de usar un entorno virtual para llevar a cabo labores de enseñanza y tutoría, aportando ventajas como simulación de entornos peligrosos o tutorización personalizada; cosa que no podemos encontrar en la mayoría de los casos de las situaciones de enseñanza reales. Este trabajo trata de dar solución a una de las problemáticas que se plantean a la hora de trabajar con cualquier entorno virtual con el que nos encontremos y prepararlo para su cometido, sobre todo en aquellos enfocados a la enseñanza: dotar de forma automática e inteligente de una semántica propia a cada uno de los objetos que se encuentran en un entorno virtual y almacenar esta información para su posterior consulta o uso para otras tareas. Esto quiere decir que el objetivo principal de este trabajo, es el proceso de recolección de información que se considera importante de los objetos de los entornos virtuales, como pueden ser sus aspectos de la forma, tamaño o color. Aspectos que, por otra parte, son realmente importantes para poder caracterizar los objetos y hacerlos únicos en un entorno virtual donde, a priori, todos los objetos son los mismos a ojos de un ordenador. Este trabajo que puede parecer trivial en un principio, no lo es tanto; y servirá de sustento fundamental para que otras aplicaciones futuras o ya existentes puedan realizar sus tareas. Una de estas tareas pudiera ser la generación de indicaciones en lenguaje natural para guiar a usuarios a localizar objetos en un entorno virtual, como es el caso del proyecto LORO sobre el que se engloba este trabajo. Algunos ejemplos de uso de esta tarea pueden ser desde ayudar a cualquier usuario a encontrar sus llaves en su propia casa a ayudar a un cirujano a localizar cierta herramienta en un quirófano. Para ello, es indispensable conocer la semántica e información relevante de cada uno de los objetos que se presentan en la escena y diferenciarlos claramente del resto. La solución propuesta se trata de una completa aplicación integrada en el motor de videojuegos y escenarios 3D de mayor soporte del mundo como es Unity 3D, el cual se interrelaciona con ontologías para poder guardar la información de los objetos de cada escena. Esto hace que la aplicación tenga una potencial difusión, gracias a las herramientas antes mencionadas para su desarrollo y a que está pensada para tanto el usuario experto como el usuario común.---ABSTRACT---Traditionally, virtual environments have been related to tridimensional design and videogames; leaving a little margin to think about its applications in other fields. However, this tendencies can be changed as soon as it is proven that the applications and advantages of this software can be taken to the learning and teaching environment. This applications are known as intelligent virtual environments, these use the virtual environment to perform teaching and tutoring tasks; tasks we cannot find in most real life teaching situations. This project aims to give a solution to one of the problematics that appears when someone works with any virtual environments we may encounter and prepare it for its duty, mainly those environments dedicated to teaching: automatically and intelligently give its own semantic to the objects that are in any virtual environment and save this information for its posterior query or use in other tasks. The main purpose of this project is the information recollection process that considers the different important facts about the objects that are in the virtual environments, such as their shape, size or color. Facts that are very important for characterizing the objects; to make them unique in the environment where the objects are all the same to the computer’s eye. This project may seem banal in the beginning, but it is not, it will be the fundamental base for future applications. One of this applications may be a natural language indicator generator for guiding users to locate objects in a virtual environment, such as the LORO project, where this project is included. Some examples of the use of this task are: helping any user to find the keys of his house, helping a surgeon to find a tool in an operation room… For this goals, it is very important to know the semantics and the relevant information of each object of the scenario and differentiate each one of them from the rest. The solution for this proposal is a fully integrated application in the videogame and Unity 3D engine that is related to ontologies so it can save the object’s information in every scenario. The previously mentioned tools, as well as the idea that this application is made for an expert user as well as for a common user, make the application more spreadable.