20 resultados para Question-answering systems

em Universidad de Alicante


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The importance of the new textual genres such as blogs or forum entries is growing in parallel with the evolution of the Social Web. This paper presents two corpora of blog posts in English and in Spanish, annotated according to the EmotiBlog annotation scheme. Furthermore, we created 20 factual and opinionated questions for each language and also the Gold Standard for their answers in the corpus. The purpose of our work is to study the challenges involved in a mixed fact and opinion question answering setting by comparing the performance of two Question Answering (QA) systems as far as mixed opinion and factual setting is concerned. The first one is open domain, while the second one is opinion-oriented. We evaluate separately the two systems in both languages and propose possible solutions to improve QA systems that have to process mixed questions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The development of the Web 2.0 led to the birth of new textual genres such as blogs, reviews or forum entries. The increasing number of such texts and the highly diverse topics they discuss make blogs a rich source for analysis. This paper presents a comparative study on open domain and opinion QA systems. A collection of opinion and mixed fact-opinion questions in English is defined and two Question Answering systems are employed to retrieve the answers to these queries. The first one is generic, while the second is specific for emotions. We comparatively evaluate and analyze the systems’ results, concluding that opinion Question Answering requires the use of specific resources and methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a multi-layered Question Answering (Q.A.) architecture suitable for enhancing current Q.A. capabilities with the possibility of processing complex questions. That is, questions whose answer needs to be gathered from pieces of factual information scattered in different documents. Specifically, we have designed a layer oriented to process the different types of temporal questions. Complex temporal questions are first decomposed into simpler ones, according to the temporal relationships expressed in the original question. In the same way, the answers of each simple question are re-composed, fulfilling the temporal restrictions of the original complex question. Using this architecture, a Temporal Q.A. system has been developed. In this paper, we focus on explaining the first part of the process: the decomposition of the complex questions. Furthermore, it has been evaluated with the TERQAS question corpus of 112 temporal questions. For the task of question splitting our system has performed, in terms of precision and recall, 85% and 71%, respectively.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Currently there are an overwhelming number of scientific publications in Life Sciences, especially in Genetics and Biotechnology. This huge amount of information is structured in corporate Data Warehouses (DW) or in Biological Databases (e.g. UniProt, RCSB Protein Data Bank, CEREALAB or GenBank), whose main drawback is its cost of updating that makes it obsolete easily. However, these Databases are the main tool for enterprises when they want to update their internal information, for example when a plant breeder enterprise needs to enrich its genetic information (internal structured Database) with recently discovered genes related to specific phenotypic traits (external unstructured data) in order to choose the desired parentals for breeding programs. In this paper, we propose to complement the internal information with external data from the Web using Question Answering (QA) techniques. We go a step further by providing a complete framework for integrating unstructured and structured information by combining traditional Databases and DW architectures with QA systems. The great advantage of our framework is that decision makers can compare instantaneously internal data with external data from competitors, thereby allowing taking quick strategic decisions based on richer data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The treatment of factual data has been widely studied in different areas of Natural Language Processing (NLP). However, processing subjective information still poses important challenges. This paper presents research aimed at assessing techniques that have been suggested as appropriate in the context of subjective - Opinion Question Answering (OQA). We evaluate the performance of an OQA with these new components and propose methods to optimally tackle the issues encountered. We assess the impact of including additional resources and processes with the purpose of improving the system performance on two distinct blog datasets. The improvements obtained for the different combination of tools are statistically significant. We thus conclude that the proposed approach is adequate for the OQA task, offering a good strategy to deal with opinionated questions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, a proposal of a multi-modal dialogue system oriented to multilingual question-answering is presented. This system includes the following ways of access: voice, text, avatar, gestures and signs language. The proposal is oriented to the question-answering task as a user interaction mechanism. The proposal here presented is in the first stages of its development phase and the architecture is presented for the first time on the base of the experiences in question-answering and dialogues previously developed. The main objective of this research work is the development of a solid platform that will permit the modular integration of the proposed architecture.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we present a complete system for the treatment of both geographical and temporal dimensions in text and its application to information retrieval. This system has been evaluated in both the GeoTime task of the 8th and 9th NTCIR workshop in the years 2010 and 2011 respectively, making it possible to compare the system to contemporary approaches to the topic. In order to participate in this task we have added the temporal dimension to our GIR system. The system proposed here has a modular architecture in order to add or modify features. In the development of this system, we have followed a QA-based approach as well as multi-search engines to improve the system performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper a multilingual method for event ordering based on temporal expression resolution is presented. This method has been implemented through the TERSEO system which consists of three main units: temporal expression recognizing, resolution of the coreference introduced by these expressions, and event ordering. By means of this system, chronological information related to events can be extracted from documental databases. This information is automatically added to the documental database in order to allow its use by question answering systems in those cases referring to temporality. The system has been evaluated obtaining results of 91 % precision and 71 % recall. For this, a blind evaluation process has been developed guaranteing a reliable annotation process that was measured through the kappa factor.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Los sistemas de búsqueda de respuestas (BR) se pueden considerar como potenciales sucesores de los buscadores tradicionales de información en la Web. Para que sean precisos deben adaptarse a dominios concretos mediante el uso de recursos semánticos adecuados. La adaptación no es una tarea trivial, ya que deben integrarse e incorporarse a sistemas de BR existentes varios recursos heterogéneos relacionados con un dominio restringido. Se presenta la herramienta Maraqa, cuya novedad radica en el uso de técnicas de ingeniería del software, como el desarrollo dirigido por modelos, para automatizar dicho proceso de adaptación a dominios restringidos. Se ha evaluado Maraqa mediante una serie de experimentos (sobre el dominio agrícola) que demuestran su viabilidad, mejorando en un 29,5% la precisión del sistema adaptado.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

El campo de procesamiento de lenguaje natural (PLN), ha tenido un gran crecimiento en los últimos años; sus áreas de investigación incluyen: recuperación y extracción de información, minería de datos, traducción automática, sistemas de búsquedas de respuestas, generación de resúmenes automáticos, análisis de sentimientos, entre otras. En este artículo se presentan conceptos y algunas herramientas con el fin de contribuir al entendimiento del procesamiento de texto con técnicas de PLN, con el propósito de extraer información relevante que pueda ser usada en un gran rango de aplicaciones. Se pueden desarrollar clasificadores automáticos que permitan categorizar documentos y recomendar etiquetas; estos clasificadores deben ser independientes de la plataforma, fácilmente personalizables para poder ser integrados en diferentes proyectos y que sean capaces de aprender a partir de ejemplos. En el presente artículo se introducen estos algoritmos de clasificación, se analizan algunas herramientas de código abierto disponibles actualmente para llevar a cabo estas tareas y se comparan diversas implementaciones utilizando la métrica F en la evaluación de los clasificadores.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Decision support systems (DSS) support business or organizational decision-making activities, which require the access to information that is internally stored in databases or data warehouses, and externally in the Web accessed by Information Retrieval (IR) or Question Answering (QA) systems. Graphical interfaces to query these sources of information ease to constrain dynamically query formulation based on user selections, but they present a lack of flexibility in query formulation, since the expressivity power is reduced to the user interface design. Natural language interfaces (NLI) are expected as the optimal solution. However, especially for non-expert users, a real natural communication is the most difficult to realize effectively. In this paper, we propose an NLI that improves the interaction between the user and the DSS by means of referencing previous questions or their answers (i.e. anaphora such as the pronoun reference in “What traits are affected by them?”), or by eliding parts of the question (i.e. ellipsis such as “And to glume colour?” after the question “Tell me the QTLs related to awn colour in wheat”). Moreover, in order to overcome one of the main problems of NLIs about the difficulty to adapt an NLI to a new domain, our proposal is based on ontologies that are obtained semi-automatically from a framework that allows the integration of internal and external, structured and unstructured information. Therefore, our proposal can interface with databases, data warehouses, QA and IR systems. Because of the high NL ambiguity of the resolution process, our proposal is presented as an authoring tool that helps the user to query efficiently in natural language. Finally, our proposal is tested on a DSS case scenario about Biotechnology and Agriculture, whose knowledge base is the CEREALAB database as internal structured data, and the Web (e.g. PubMed) as external unstructured information.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper presents the evaluation of a QA system for the treatment of complex temporal questions. The system was implemented in a multilayered architecture where complex temporal questions are first decomposed into simple questions, according to the temporal relations expressed in the original question. These simple questions are then processed independently by our standard Question Answering engine and their respective answers are filtered to satisfy the temporal restrictions of each simple question. The answers to the simple decomposed questions are then combined, according to the temporal relations extracted from the original complex question, to give the final answer. This evaluation was performed as a pilot task in the Spanish QA Track of the Cross Language Evaluation Forum 2004.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The Answer Validation Exercise (AVE) is a pilot track within the Cross-Language Evaluation Forum (CLEF) 2006. The AVE competition provides an evaluation frame- work for answer validations in Question Answering (QA). In our participation in AVE, we propose a system that has been initially used for other task as Recognising Textual Entailment (RTE). The aim of our participation is to evaluate the improvement our system brings to QA. Moreover, due to the fact that these two task (AVE and RTE) have the same main idea, which is to find semantic implications between two fragments of text, our system has been able to be directly applied to the AVE competition. Our system is based on the representation of the texts by means of logic forms and the computation of semantic comparison between them. This comparison is carried out using two different approaches. The first one managed by a deeper study of the Word- Net relations, and the second uses the measure defined by Lin in order to compute the semantic similarity between the logic form predicates. Moreover, we have also designed a voting strategy between our system and the MLEnt system, also presented by the University of Alicante, with the aim of obtaining a joint execution of the two systems developed at the University of Alicante. Although the results obtained have not been very high, we consider that they are quite promising and this supports the fact that there is still a lot of work on researching in any kind of textual entailment.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The goal of the project is to analyze, experiment, and develop intelligent, interactive and multilingual Text Mining technologies, as a key element of the next generation of search engines, systems with the capacity to find "the need behind the query". This new generation will provide specialized services and interfaces according to the search domain and type of information needed. Moreover, it will integrate textual search (websites) and multimedia search (images, audio, video), it will be able to find and organize information, rather than generating ranked lists of websites.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Automatic Text Summarization has been shown to be useful for Natural Language Processing tasks such as Question Answering or Text Classification and other related fields of computer science such as Information Retrieval. Since Geographical Information Retrieval can be considered as an extension of the Information Retrieval field, the generation of summaries could be integrated into these systems by acting as an intermediate stage, with the purpose of reducing the document length. In this manner, the access time for information searching will be improved, while at the same time relevant documents will be also retrieved. Therefore, in this paper we propose the generation of two types of summaries (generic and geographical) applying several compression rates in order to evaluate their effectiveness in the Geographical Information Retrieval task. The evaluation has been carried out using GeoCLEF as evaluation framework and following an Information Retrieval perspective without considering the geo-reranking phase commonly used in these systems. Although single-document summarization has not performed well in general, the slight improvements obtained for some types of the proposed summaries, particularly for those based on geographical information, made us believe that the integration of Text Summarization with Geographical Information Retrieval may be beneficial, and consequently, the experimental set-up developed in this research work serves as a basis for further investigations in this field.