939 results for Open Information Extraction
Abstract:
A selection of MeO-BDE and BDE congeners were analyzed in pooled blubber samples of pilot whale (Globicephala melas), ringed seal (Phoca hispida), minke whale (Balaenoptera acutorostrata), fin whale (Balaenoptera physalus), harbor porpoise (Phocoena phocoena), hooded seal (Cystophora cristata), and Atlantic white-sided dolphin (Lagenorhynchus acutus), covering a time period of more than 20 years (1986-2009). The analytes were extracted and cleaned up using open column extraction and multi-layer silica gel column chromatography. The analysis was performed using both low-resolution and high-resolution GC-MS. MeO-PBDE concentrations relative to total PBDE concentrations varied greatly between sampling periods and species. The highest MeO-PBDE levels were found in the toothed whale species pilot whale and white-sided dolphin, often exceeding the concentration of the most abundant PBDE, BDE-47. The lowest MeO-PBDE levels were found in fin whales and ringed seals. The main MeO-BDE congeners were 6-MeO-BDE47 and 2'-MeO-BDE68. A weak correlation was found only between BDE-47 and its methoxylated analog 6-MeO-BDE47, which is indicative of a natural source for MeO-PBDEs.
Abstract:
In the context of the Semantic Web, natural language descriptions associated with ontologies have proven to be of major importance not only to support ontology developers and adopters, but also to assist in tasks such as ontology mapping, information extraction, or natural language generation. In the state of the art, we find some attempts to provide guidelines for URI local names in English, and also some disagreement on the use of URIs for describing ontology elements. When these ideas are extrapolated to a multilingual scenario, some of these approaches fail to provide a valid solution. On the basis of some real experiences in the translation of ontologies from English into Spanish, we provide a preliminary set of guidelines for naming and labeling ontologies in a multilingual scenario.
Abstract:
This paper introduces a semantic language developed for use in a semantic analyzer based on linguistic and world knowledge. Linguistic knowledge is provided by a Combinatorial Dictionary and several sets of rules. Extra-linguistic information is stored in an Ontology. The meaning of the text is represented by a series of RDF-type triples of the form predicate(subject, object). The semantic analyzer is one of the options of the multifunctional ETAP-3 linguistic processor. The analyzer can be used for Information Extraction and Question Answering. We describe the semantic representation of expressions that provide an assessment of the number of objects involved and/or give a quantitative evaluation of different types of attributes. We focus on the following aspects: 1) parametric and non-parametric attributes; 2) gradable and non-gradable attributes; 3) ontological representation of different classes of attributes; 4) absolute and relative quantitative assessment; 5) punctual and interval quantitative assessment; 6) intervals with precise and fuzzy boundaries.
Abstract:
The identification of malnourished children living under extreme poverty conditions in isolated areas is crucial to trigger urgent interventions such as supplementary or therapeutic feeding. This work aims to strengthen the follow-up of the malnourished maternal-child population in rural areas of developing countries such as Nicaragua. The solution facilitates low-cost remote monitoring of health and nutrition to support rural communities at the point of care. The system allows medical staff to communicate with brigades, who transmit anthropometric measurements, such as the weight and height of the children, from communities located about 12 km away. A hybrid WiMAX/WiFi architecture was deployed to provide affordable communications between the isolated communities and the health center. Furthermore, free PBX software and an open information system, installed at the health center, support WiFi-based mobile communications and information management to meet the care needs of the maternal-child population at risk.
Abstract:
Information and Communication Technologies have driven advances in healthcare, both in the effective electronic management of social and health information and in the provision of e-health and telemedicine services. Published research in this area corroborates this, reporting improvements in the care of the population and in the provision of health services. Early intervention, whose scientific principles draw on pediatrics, neurology, psychology, psychiatry, pedagogy, physiatry, and linguistics, among other fields, aims to offer children with deficits, or at risk of developing them, a set of optimizing and compensatory actions that facilitate their adequate maturation in all areas and allow them to reach the highest level of personal development and social integration. Detecting possible alterations in child development is a key aspect of early intervention, insofar as it can trigger the various mechanisms of action available in the entities involved, which are valuable for the person's quality of life. The earlier the detection, the greater the guarantees of preventing additional pathologies, achieving functional improvements, and enabling a more adaptive fit between the child and the environment. The aim of the research presented in this doctoral thesis is to analyze, design, verify, and validate an open, knowledge-based information system that effectively helps professionals working with children between 0 and 6 years of age in the early detection of possible language disorders.
From a methodological point of view, Knowledge Engineering offers a solid conceptual framework for developing and validating distributed and scalable decision support systems capable of assisting the primary care pediatrician and the early childhood educator in the early detection of possible language disorders in children. The system was evaluated incrementally through the design and validation of experimental field tests consisting of the assessment of children in two different settings: the nursery school and the early intervention center. Experiments conducted in different populations, with around 344 children over 2 years, confirmed that the proposed system is well suited to the detection needs of professionals working with children between 0 and 6 years of age. The thesis has made it possible to characterize the use of the system in real environments and to determine its acceptance among users and its impact on the provision of an early intervention service, such as the one described, for properly monitoring language development in children. It also proposes a new model of cooperative care and evaluation that makes it possible to increase the existing experimental knowledge on the subject.
Abstract:
The scientific method is a methodological approach to the process of inquiry in which an empirically grounded theory of nature is constructed and verified [14]. It is a hard, exhaustive and dedicated multi-stage procedure that a researcher must perform to achieve valuable knowledge. To help researchers during this process, a recommender system, intended as a researcher assistant, is designed to provide them with useful tools and information for each stage of the procedure. A new similarity measure between research objects and a representational model based on domain spaces, which handles them at different levels, are created, as well as a system to build them from OAI-PMH (and RSS) resources. It tries to represent a sound balance between scientific insight into individual scientific creative processes and technical implementation using innovative technologies in information extraction, document summarization and semantic analysis at a large scale.
Abstract:
The purpose of this thesis is the automatic construction of ontologies from texts, within the area known as Ontology Learning. This discipline aims to automate the construction of domain models from structured or unstructured information sources. It originated at the beginning of the millennium, as a result of the exponential growth in the volume of information accessible on the Internet. Since most information on the web is presented as text, automatic ontology learning has focused on analyzing this type of source, drawing over the years on very diverse techniques from areas such as Information Retrieval, Information Extraction, Summarization and, in general, areas related to natural language processing. The main contribution of this thesis is that, unlike most current techniques, the proposed method does not analyze the surface syntactic structure of language but studies its deep semantic level. Its objective, therefore, is to infer the domain model from the way the meanings of natural language sentences are articulated. Since the deep semantic level is language-independent, the method can operate in multilingual scenarios, where information from texts in different languages must be combined. To access this level of language, the method uses the interlingua model. These formalisms, which come from the area of machine translation, represent the meaning of sentences independently of the language. Specifically, UNL (Universal Networking Language) is used, which is considered the only standardized general-purpose interlingua.
The approach taken in this thesis continues previous work by both its author and the research group to which he belongs, which studied how to use the interlingua model in multilingual information extraction and retrieval. Basically, the procedure defined in the method tries to identify, in the UNL representation of the texts, certain regularities from which the pieces of the domain ontology can be deduced. Since UNL is a formalism based on semantic networks, these regularities take the form of graphs, which are generalized into structures called linguistic patterns. On the other hand, UNL still preserves certain discourse cohesion mechanisms from natural languages, such as anaphora. To increase the effectiveness of the understanding of expressions, the method provides, as another significant contribution, an algorithm for resolving pronominal anaphora within the interlingua model, limited to third-person personal pronouns whose antecedent is a proper noun. The proposed method rests on a formal framework, built by adapting certain definitions from graph theory and incorporating new ones, in order to situate the notions of UNL expression and linguistic pattern and the pattern-matching operations that underlie the method's processes. Both the formal framework and all the processes defined by the method have been implemented in order to carry out the experimentation, which was applied to an article from UNESCO's EOLSS ("Encyclopedia of Life Support Systems") collection.
Abstract:
The goal of the project is to analyze, experiment with, and develop intelligent, interactive and multilingual Text Mining technologies as a key element of the next generation of search engines, systems with the capacity to find "the need behind the query". This new generation will provide specialized services and interfaces according to the search domain and type of information needed. Moreover, it will integrate textual search (websites) and multimedia search (images, audio, video), and it will be able to find and organize information rather than generating ranked lists of websites.
Abstract:
Information Extraction methods based on Distant Supervision use correct tuples to acquire mentions of those tuples, and thus train a traditional supervised information extraction system. In this article we analyze the sources of noise in the mentions and explore simple methods for filtering noisy mentions. The results show that combining tuple filtering by frequency, mutual information, and the removal of mentions far from the centroids of their respective labels significantly improves the results of two information extraction models.
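One of the filters named in this abstract, removing mentions that lie far from the centroid of their label, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `filter_by_centroid`, the `keep_ratio` parameter, the Euclidean distance, and the toy feature vectors are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def filter_by_centroid(mentions, keep_ratio=0.8):
    """Keep, for each label, the fraction of mentions closest to the label centroid.

    mentions: list of (label, feature_vector) pairs (hypothetical representation).
    keep_ratio: assumed hyperparameter; fraction of mentions kept per label.
    """
    by_label = defaultdict(list)
    for label, vector in mentions:
        by_label[label].append(vector)

    kept = []
    for label, vectors in by_label.items():
        center = centroid(vectors)
        # Rank mentions by Euclidean distance to their label centroid.
        ranked = sorted(vectors, key=lambda v: math.dist(v, center))
        n_keep = max(1, math.ceil(keep_ratio * len(ranked)))
        kept.extend((label, v) for v in ranked[:n_keep])
    return kept


# Toy data: three similar mentions and one outlier for the same label.
mentions = [
    ("born_in", [1.0, 0.0]),
    ("born_in", [1.1, 0.0]),
    ("born_in", [0.9, 0.1]),
    ("born_in", [0.0, 5.0]),
]
kept = filter_by_centroid(mentions, keep_ratio=0.75)
```

With this toy input the outlying mention is the one discarded; in a real setting the feature vectors would come from whatever mention representation the extraction model uses.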
Abstract:
In the world of Computer Science, several proposals have been developed for assessing the quality of digital objects, based on the capabilities and facilities offered by current technologies and the available resources. For years, researchers and specialists from both educational and technological areas have been committed to developing strategies that improve the quality of education. At present, in the field of teaching and learning, another important aspect is the need to improve the way knowledge is gained in education, for which the use of learning strategies is a major advance in the teaching-learning process in institutions of higher education. This paper presents QEES, a proposal for evaluating the quality of the learning objects employed in learning strategies to support students during their education, using information extraction techniques and ontologies.
Abstract:
Currently there is an overwhelming number of scientific publications in the Life Sciences, especially in Genetics and Biotechnology. This huge amount of information is structured in corporate Data Warehouses (DW) or in Biological Databases (e.g. UniProt, RCSB Protein Data Bank, CEREALAB or GenBank), whose main drawback is the cost of updating, which makes them easily obsolete. However, these Databases are the main tool for enterprises when they want to update their internal information, for example when a plant breeding enterprise needs to enrich its genetic information (internal structured Database) with recently discovered genes related to specific phenotypic traits (external unstructured data) in order to choose the desired parentals for breeding programs. In this paper, we propose to complement the internal information with external data from the Web using Question Answering (QA) techniques. We go a step further by providing a complete framework for integrating unstructured and structured information by combining traditional Databases and DW architectures with QA systems. The great advantage of our framework is that decision makers can instantly compare internal data with external data from competitors, allowing them to make quick strategic decisions based on richer data.
Abstract:
We present a tool based on drug-effect co-occurrences for detecting adverse reactions and indications in user comments from a Spanish-language medical forum. In addition, we describe the automatic construction of the first Spanish-language database of drug indications and adverse effects.
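As a rough illustration of what counting drug-effect co-occurrences in user comments can look like, here is a minimal sketch. It is not the tool described in the abstract: the function name `cooccurrences`, the whitespace tokenization, and the tiny English lexicons are assumptions, and the sketch makes no attempt to tell an adverse reaction from an indication.

```python
from collections import Counter
from itertools import product


def cooccurrences(comments, drugs, effects):
    """Count how often each (drug, effect) pair appears in the same comment.

    comments: iterable of comment strings.
    drugs, effects: sets of lowercase single-word terms (hypothetical lexicons).
    """
    counts = Counter()
    for text in comments:
        tokens = set(text.lower().split())
        found_drugs = tokens & drugs
        found_effects = tokens & effects
        # Each pair is counted at most once per comment.
        counts.update(product(found_drugs, found_effects))
    return counts


# Toy comments and lexicons, invented for the example.
comments = [
    "ibuprofen gave me nausea",
    "nausea again after ibuprofen",
    "paracetamol helped my headache",
]
drugs = {"ibuprofen", "paracetamol"}
effects = {"nausea", "headache"}
counts = cooccurrences(comments, drugs, effects)
```

A real system would need multi-word term matching, normalization of spelling variants, and a way to separate indications from adverse effects, none of which this sketch covers.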
Abstract:
The Leximancer system is a relatively new method for transforming lexical co-occurrence information from natural language into semantic patterns in an unsupervised manner. It employs two stages of co-occurrence information extraction (semantic and relational), using a different algorithm for each stage. The algorithms used are statistical, but they employ nonlinear dynamics and machine learning. This article is an attempt to validate the output of Leximancer, using a set of evaluation criteria taken from content analysis that are appropriate for knowledge discovery tasks.
Abstract:
Four experiments are reported that examine the ability of cricket batsmen of different skill levels to pick up advance information to anticipate the type and length of balls bowled by swing and spin bowlers. The information available upon which to make the predictive judgements was manipulated through a combination of temporal occlusion of the display and selective occlusion or presentation of putative anticipatory cues. In addition to a capability to pick up advance information from the same cues used by intermediate and low-skilled players, highly skilled players demonstrated the additional, unique capability to pick up advance information from some specific early cues (especially bowling hand and arm cues) to which the less skilled players were not attuned. The acquisition of expert perceptual-motor skill appears to involve not only refinement of information extraction but also progression to the use of earlier, kinematically relevant sources of information.
Abstract:
Government agencies responsible for riparian environments are assessing the combined utility of field survey and remote sensing for mapping and monitoring indicators of riparian zone health. The objective of this work was to determine if the structural attributes of savanna riparian zones in northern Australia can be detected from commercially available remotely sensed image data. Two QuickBird images and coincident field data covering sections of the Daly River and the South Alligator River - Barramundie Creek in the Northern Territory were used. Semi-variograms were calculated to determine the characteristic spatial scales of riparian zone features, both vegetative and landform. Interpretation of semi-variograms showed that structural dimensions of riparian environments could be detected and estimated from the QuickBird image data. The results also show that selecting the correct spatial resolution and spectral bands is essential to maximize the accuracy of mapping spatial characteristics of savanna riparian features. The distribution of foliage projective cover of riparian vegetation affected spectral reflectance variations in individual spectral bands differently. Pan-sharpened image data enabled small-scale information extraction (< 6 m) on riparian zone structural parameters. The semi-variogram analysis results provide the basis for an inversion approach using high spatial resolution satellite image data to map indicators of savanna riparian zone health.
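For readers unfamiliar with semi-variograms, the empirical estimator is usually written as γ(h) = (1 / 2N(h)) Σ (z_i − z_{i+h})², where N(h) is the number of sample pairs separated by lag h. The sketch below computes it along a one-dimensional transect of pixel values. It is only an illustration under that simplification: the function name `semivariogram` and the toy transect are assumptions, and the study's own two-dimensional image processing is not reproduced here.

```python
def semivariogram(values, max_lag):
    """Empirical semivariance for integer lags 1..max_lag along a 1-D transect.

    values: sequence of numeric samples at equal spacing (e.g. pixel values).
    Returns a dict mapping lag h to gamma(h).
    """
    result = {}
    for lag in range(1, max_lag + 1):
        pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
        if not pairs:
            break  # transect is shorter than this lag
        squared_diffs = sum((a - b) ** 2 for a, b in pairs)
        result[lag] = squared_diffs / (2 * len(pairs))
    return result


# Toy transect that alternates with a period of two samples.
transect = [0, 1, 0, 1, 0, 1]
gamma = semivariogram(transect, max_lag=2)
```

In this toy case the semivariance drops to zero at lag 2 because the pattern repeats every two samples; in image analysis, the lag at which the curve levels off is commonly read as the characteristic spatial scale of the features.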