911 resultados para Multilingual question-answering


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, a proposal of a multi-modal dialogue system oriented to multilingual question-answering is presented. This system includes the following ways of access: voice, text, avatar, gestures and signs language. The proposal is oriented to the question-answering task as a user interaction mechanism. The proposal here presented is in the first stages of its development phase and the architecture is presented for the first time on the base of the experiences in question-answering and dialogues previously developed. The main objective of this research work is the development of a solid platform that will permit the modular integration of the proposed architecture.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A Teia Mundial (Web) foi prevista como uma rede de documentos de hipertexto interligados de forma a criar uma espaço de informação onde humanos e máquinas poderiam comunicar. No entanto, a informação contida na Web tradicional foi/é armazenada de forma não estruturada o que leva a que apenas os humanos a possam consumir convenientemente. Consequentemente, a procura de informações na Web sintáctica é uma tarefa principalmente executada pelos humanos e nesse sentido nem sempre é fácil de concretizar. Neste contexto, tornou-se essencial a evolução para uma Web mais estruturada e mais significativa onde é dado significado bem definido à informação de forma a permitir a cooperação entre humanos e máquinas. Esta Web é usualmente referida como Web Semântica. Além disso, a Web Semântica é totalmente alcançável apenas se os dados de diferentes fontes forem ligados criando assim um repositório de Dados Abertos Ligados (LOD). Com o aparecimento de uma nova Web de Dados (Abertos) Ligados (i.e. a Web Semântica), novas oportunidades e desafios surgiram. Pergunta Resposta (QA) sobre informação semântica é actualmente uma área de investigação activa que tenta tirar vantagens do uso das tecnologias ligadas à Web Semântica para melhorar a tarefa de responder a questões. O principal objectivo do projecto World Search passa por explorar a Web Semântica para criar mecanismos que suportem os utilizadores de domínios de aplicação específicos a responder a questões complexas com base em dados oriundos de diferentes repositórios. No entanto, a avaliação feita ao estado da arte permite concluir que as aplicações existentes não suportam os utilizadores na resposta a questões complexas. Nesse sentido, o trabalho desenvolvido neste documento foca-se em estudar/desenvolver metodologias/processos que permitam ajudar os utilizadores a encontrar respostas exactas/corretas para questões complexas que não podem ser respondidas fazendo uso dos sistemas tradicionais. Tal inclui: (i) Ultrapassar a dificuldade dos utilizadores visionarem o esquema subjacente aos repositórios de conhecimento; (ii) Fazer a ponte entre a linguagem natural expressa pelos utilizadores e a linguagem (formal) entendível pelos repositórios; (iii) Processar e retornar informações relevantes que respondem apropriadamente às questões dos utilizadores. Para esse efeito, são identificadas um conjunto de funcionalidades que são consideradas necessárias para suportar o utilizador na resposta a questões complexas. É também fornecida uma descrição formal dessas funcionalidades. A proposta é materializada num protótipo que implementa as funcionalidades previamente descritas. As experiências realizadas com o protótipo desenvolvido demonstram que os utilizadores efectivamente beneficiam das funcionalidades apresentadas: ▪ Pois estas permitem que os utilizadores naveguem eficientemente sobre os repositórios de informação; ▪ O fosso entre as conceptualizações dos diferentes intervenientes é minimizado; ▪ Os utilizadores conseguem responder a questões complexas que não conseguiam responder com os sistemas tradicionais. Em suma, este documento apresenta uma proposta que comprovadamente permite, de forma orientada pelo utilizador, responder a questões complexas em repositórios semiestruturados.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This is a Named Entity Based Question Answering System for Malayalam Language. Although a vast amount of information is available today in digital form, no effective information access mechanism exists to provide humans with convenient information access. Information Retrieval and Question Answering systems are the two mechanisms available now for information access. Information systems typically return a long list of documents in response to a user’s query which are to be skimmed by the user to determine whether they contain an answer. But a Question Answering System allows the user to state his/her information need as a natural language question and receives most appropriate answer in a word or a sentence or a paragraph. This system is based on Named Entity Tagging and Question Classification. Document tagging extracts useful information from the documents which will be used in finding the answer to the question. Question Classification extracts useful information from the question to determine the type of the question and the way in which the question is to be answered. Various Machine Learning methods are used to tag the documents. Rule-Based Approach is used for Question Classification. Malayalam belongs to the Dravidian family of languages and is one of the four major languages of this family. It is one of the 22 Scheduled Languages of India with official language status in the state of Kerala. It is spoken by 40 million people. Malayalam is a morphologically rich agglutinative language and relatively of free word order. Also Malayalam has a productive morphology that allows the creation of complex words which are often highly ambiguous. Document tagging tools such as Parts-of-Speech Tagger, Phrase Chunker, Named Entity Tagger, and Compound Word Splitter are developed as a part of this research work. No such tools were available for Malayalam language. Finite State Transducer, High Order Conditional Random Field, Artificial Immunity System Principles, and Support Vector Machines are the techniques used for the design of these document preprocessing tools. This research work describes how the Named Entity is used to represent the documents. Single sentence questions are used to test the system. Overall Precision and Recall obtained are 88.5% and 85.9% respectively. This work can be extended in several directions. The coverage of non-factoid questions can be increased and also it can be extended to include open domain applications. Reference Resolution and Word Sense Disambiguation techniques are suggested as the future enhancements

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The importance of the new textual genres such as blogs or forum entries is growing in parallel with the evolution of the Social Web. This paper presents two corpora of blog posts in English and in Spanish, annotated according to the EmotiBlog annotation scheme. Furthermore, we created 20 factual and opinionated questions for each language and also the Gold Standard for their answers in the corpus. The purpose of our work is to study the challenges involved in a mixed fact and opinion question answering setting by comparing the performance of two Question Answering (QA) systems as far as mixed opinion and factual setting is concerned. The first one is open domain, while the second one is opinion-oriented. We evaluate separately the two systems in both languages and propose possible solutions to improve QA systems that have to process mixed questions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The development of the Web 2.0 led to the birth of new textual genres such as blogs, reviews or forum entries. The increasing number of such texts and the highly diverse topics they discuss make blogs a rich source for analysis. This paper presents a comparative study on open domain and opinion QA systems. A collection of opinion and mixed fact-opinion questions in English is defined and two Question Answering systems are employed to retrieve the answers to these queries. The first one is generic, while the second is specific for emotions. We comparatively evaluate and analyze the systems’ results, concluding that opinion Question Answering requires the use of specific resources and methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a multi-layered Question Answering (Q.A.) architecture suitable for enhancing current Q.A. capabilities with the possibility of processing complex questions. That is, questions whose answer needs to be gathered from pieces of factual information scattered in different documents. Specifically, we have designed a layer oriented to process the different types of temporal questions. Complex temporal questions are first decomposed into simpler ones, according to the temporal relationships expressed in the original question. In the same way, the answers of each simple question are re-composed, fulfilling the temporal restrictions of the original complex question. Using this architecture, a Temporal Q.A. system has been developed. In this paper, we focus on explaining the first part of the process: the decomposition of the complex questions. Furthermore, it has been evaluated with the TERQAS question corpus of 112 temporal questions. For the task of question splitting our system has performed, in terms of precision and recall, 85% and 71%, respectively.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The treatment of factual data has been widely studied in different areas of Natural Language Processing (NLP). However, processing subjective information still poses important challenges. This paper presents research aimed at assessing techniques that have been suggested as appropriate in the context of subjective - Opinion Question Answering (OQA). We evaluate the performance of an OQA with these new components and propose methods to optimally tackle the issues encountered. We assess the impact of including additional resources and processes with the purpose of improving the system performance on two distinct blog datasets. The improvements obtained for the different combination of tools are statistically significant. We thus conclude that the proposed approach is adequate for the OQA task, offering a good strategy to deal with opinionated questions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we present a complete system for the treatment of both geographical and temporal dimensions in text and its application to information retrieval. This system has been evaluated in both the GeoTime task of the 8th and 9th NTCIR workshop in the years 2010 and 2011 respectively, making it possible to compare the system to contemporary approaches to the topic. In order to participate in this task we have added the temporal dimension to our GIR system. The system proposed here has a modular architecture in order to add or modify features. In the development of this system, we have followed a QA-based approach as well as multi-search engines to improve the system performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Currently there are an overwhelming number of scientific publications in Life Sciences, especially in Genetics and Biotechnology. This huge amount of information is structured in corporate Data Warehouses (DW) or in Biological Databases (e.g. UniProt, RCSB Protein Data Bank, CEREALAB or GenBank), whose main drawback is its cost of updating that makes it obsolete easily. However, these Databases are the main tool for enterprises when they want to update their internal information, for example when a plant breeder enterprise needs to enrich its genetic information (internal structured Database) with recently discovered genes related to specific phenotypic traits (external unstructured data) in order to choose the desired parentals for breeding programs. In this paper, we propose to complement the internal information with external data from the Web using Question Answering (QA) techniques. We go a step further by providing a complete framework for integrating unstructured and structured information by combining traditional Databases and DW architectures with QA systems. The great advantage of our framework is that decision makers can compare instantaneously internal data with external data from competitors, thereby allowing taking quick strategic decisions based on richer data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thesis (M.S.)--Illinois.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the recent rapid growth of the Semantic Web (SW), the processes of searching and querying content that is both massive in scale and heterogeneous have become increasingly challenging. User-friendly interfaces, which can support end users in querying and exploring this novel and diverse, structured information space, are needed to make the vision of the SW a reality. We present a survey on ontology-based Question Answering (QA), which has emerged in recent years to exploit the opportunities offered by structured semantic information on the Web. First, we provide a comprehensive perspective by analyzing the general background and history of the QA research field, from influential works from the artificial intelligence and database communities developed in the 70s and later decades, through open domain QA stimulated by the QA track in TREC since 1999, to the latest commercial semantic QA solutions, before tacking the current state of the art in open user-friendly interfaces for the SW. Second, we examine the potential of this technology to go beyond the current state of the art to support end-users in reusing and querying the SW content. We conclude our review with an outlook for this novel research area, focusing in particular on the R&D directions that need to be pursued to realize the goal of efficient and competent retrieval and integration of answers from large scale, heterogeneous, and continuously evolving semantic sources.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Linked Data semantic sources, in particular DBpedia, can be used to answer many user queries. PowerAqua is an open multi-ontology Question Answering (QA) system for the Semantic Web (SW). However, the emergence of Linked Data, characterized by its openness, heterogeneity and scale, introduces a new dimension to the Semantic Web scenario, in which exploiting the relevant information to extract answers for Natural Language (NL) user queries is a major challenge. In this paper we discuss the issues and lessons learned from our experience of integrating PowerAqua as a front-end for DBpedia and a subset of Linked Data sources. As such, we go one step beyond the state of the art on end-users interfaces for Linked Data by introducing mapping and fusion techniques needed to translate a user query by means of multiple sources. Our first informal experiments probe whether, in fact, it is feasible to obtain answers to user queries by composing information across semantic sources and Linked Data, even in its current form, where the strength of Linked Data is more a by-product of its size than its quality. We believe our experiences can be extrapolated to a variety of end-user applications that wish to scale, open up, exploit and re-use what possibly is the greatest wealth of data about everything in the history of Artificial Intelligence. © 2010 Springer-Verlag.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The semantic web vision is one in which rich, ontology-based semantic markup will become widely available. The availability of semantic markup on the web opens the way to novel, sophisticated forms of question answering. AquaLog is a portable question-answering system which takes queries expressed in natural language and an ontology as input, and returns answers drawn from one or more knowledge bases (KBs). We say that AquaLog is portable because the configuration time required to customize the system for a particular ontology is negligible. AquaLog presents an elegant solution in which different strategies are combined together in a novel way. It makes use of the GATE NLP platform, string metric algorithms, WordNet and a novel ontology-based relation similarity service to make sense of user queries with respect to the target KB. Moreover it also includes a learning component, which ensures that the performance of the system improves over the time, in response to the particular community jargon used by end users.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The semantic web (SW) vision is one in which rich, ontology-based semantic markup will become widely available. The availability of semantic markup on the web opens the way to novel, sophisticated forms of question answering. AquaLog is a portable question-answering system which takes queries expressed in natural language (NL) and an ontology as input, and returns answers drawn from one or more knowledge bases (KB). AquaLog presents an elegant solution in which different strategies are combined together in a novel way. AquaLog novel ontology-based relation similarity service makes sense of user queries.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Mobile messaging is an integral and vital part of the mobile industry and contributes significantly to worldwide total mobile service revenues. In today’s competitive world, differentiation is a significant factor in the success of the business communication. SMS (Short Message Service) provides a powerful vehicle for service differentiation. What is missing, however, is the availability of personalized SMS messages. In particular, the exploitation of user profile information allows a selection and content delivery that meets preferences and interests for the individual. Personalization of mobile messages is important in today’s service-oriented society, and has proven to be crucial for the acceptance of services provided by the mobile telecommunication networks. In this paper we focus on user profile description and the mechanism for delivering the relevant information to the mobile user in accordance with his/her profile.