956 resultados para Question-answering systems


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Automatic Text Summarization has been shown to be useful for Natural Language Processing tasks such as Question Answering or Text Classification and other related fields of computer science such as Information Retrieval. Since Geographical Information Retrieval can be considered as an extension of the Information Retrieval field, the generation of summaries could be integrated into these systems by acting as an intermediate stage, with the purpose of reducing the document length. In this manner, the access time for information searching will be improved, while at the same time relevant documents will be also retrieved. Therefore, in this paper we propose the generation of two types of summaries (generic and geographical) applying several compression rates in order to evaluate their effectiveness in the Geographical Information Retrieval task. The evaluation has been carried out using GeoCLEF as evaluation framework and following an Information Retrieval perspective without considering the geo-reranking phase commonly used in these systems. Although single-document summarization has not performed well in general, the slight improvements obtained for some types of the proposed summaries, particularly for those based on geographical information, made us believe that the integration of Text Summarization with Geographical Information Retrieval may be beneficial, and consequently, the experimental set-up developed in this research work serves as a basis for further investigations in this field.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Online communities are prime sources of information. The Web is rich with forums and Question Answering (Q&A) communities where people go to seek answers to all kinds of questions. Most systems employ manual answer-rating procedures to encourage people to provide quality answers and to help users locate the best answers in a given thread. However, in the datasets we collected from three online communities, we found that half their threads lacked best answer markings. This stresses the need for methods to assess the quality of available answers to: 1) provide automated ratings to fill in for, or support, manually assigned ones, and; 2) to assist users when browsing such answers by filtering in potential best answers. In this paper, we collected data from three online communities and converted it to RDF based on the SIOC ontology. We then explored an approach for predicting best answers using a combination of content, user, and thread features. We show how the influence of such features on predicting best answers differs across communities. Further we demonstrate how certain features unique to some of our community systems can boost predictability of best answers.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

PowerAqua is a Question Answering system, which takes as input a natural language query and is able to return answers drawn from relevant semantic resources found anywhere on the Semantic Web. In this paper we provide two novel contributions: First, we detail a new component of the system, the Triple Similarity Service, which is able to match queries effectively to triples found in different ontologies on the Semantic Web. Second, we provide a first evaluation of the system, which in addition to providing data about PowerAqua's competence, also gives us important insights into the issues related to using the Semantic Web as the target answer set in Question Answering. In particular, we show that, despite the problems related to the noisy and incomplete conceptualizations, which can be found on the Semantic Web, good results can already be obtained.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The Semantic Web (SW) offers an opportunity to develop novel, sophisticated forms of question answering (QA). Specifically, the availability of distributed semantic markup on a large scale opens the way to QA systems which can make use of such semantic information to provide precise, formally derived answers to questions. At the same time the distributed, heterogeneous, large-scale nature of the semantic information introduces significant challenges. In this paper we describe the design of a QA system, PowerAqua, designed to exploit semantic markup on the web to provide answers to questions posed in natural language. PowerAqua does not assume that the user has any prior information about the semantic resources. The system takes as input a natural language query, translates it into a set of logical queries, which are then answered by consulting and aggregating information derived from multiple heterogeneous semantic sources.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Online enquiry communities such as Question Answering (Q&A) websites allow people to seek answers to all kind of questions. With the growing popularity of such platforms, it is important for community managers to constantly monitor the performance of their communities. Although different metrics have been proposed for tracking the evolution of such communities, maturity, the process in which communities become more topic proficient over time, has been largely ignored despite its potential to help in identifying robust communities. In this paper, we interpret community maturity as the proportion of complex questions in a community at a given time. We use the Server Fault (SF) community, a Question Answering (Q&A) community of system administrators, as our case study and perform analysis on question complexity, the level of expertise required to answer a question. We show that question complexity depends on both the length of involvement and the level of contributions of the users who post questions within their community. We extract features relating to askers, answerers, questions and answers, and analyse which features are strongly correlated with question complexity. Although our findings highlight the difficulty of automatically identifying question complexity, we found that complexity is more influenced by both the topical focus and the length of community involvement of askers. Following the identification of question complexity, we define a measure of maturity and analyse the evolution of different topical communities. Our results show that different topical communities show different maturity patterns. Some communities show a high maturity at the beginning while others exhibit slow maturity rate. Copyright 2013 ACM.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Value of online Question Answering (QandA) communities is driven by the question-answering behaviour of its members. Finding the questions that members are willing to answer is therefore vital to the effcient operation of such communities. In this paper, we aim to identify the parameters that cor- relate with such behaviours. We train different models and construct effective predictions using various user, question and thread feature sets. We show that answering behaviour can be predicted with a high level of success.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

[EU]Hizkuntzaren prozesamenduan testu koherenteetan kausa taldeko erlazioak (KAUSA, ONDORIOA eta HELBURUA) automatikoki hautematea eta bereiztea erabilgarria da galdera-erantzun automatikoko sistemak eraikitzerako orduan. Horretarako Egitura Erretorikoaren Teoria (Rhetorical Structure Theory, aurrerantzean RST) eta bere erlazioak erabiliko ditugu, corpus bezala RST Treebank -a (Iruskieta et al., 2013) hartuta, zientziako laburpen-testuz osatutako corpusa, hain zuzen ere. Corpus hori XML formatuan deskargatu eta hortik XPATH tresnaren bidez informazio garrantzitsuena eskuratzen dugu. Lan honek 3 helburu nagusi ditu: lehendabizi, kausa taldeko erlazioak elkarren artean bereiztea, bigarrenez, kausa taldeko erlazio hauek beste erlazio guztiekin bereiztea, eta azkenik, EBALUAZIOA eta INTERPRETAZIOA erlazioak bereiztea sentimendu analisian aplikatu ahal izateko. Ataza horiek egiteko, RhetDB tresnarekin eskuratu diren patroi ensaguratsuenak erabili eta bi aplikazio garatu ditugu. Alde batetik, bilatu nahi ditugun patroiak adierazi eta erlazio-egitura duen edonolako testuetan bilaketak egiten dituen bilatzailea, eta bestetik, patroi esanguratsuenak emanda erlazioak etiketatzen dituen etiketatzailea. Bi aplikazio hauek gainera, ahalik eta modu parametrizagarrienean erabiltzeko garatu ditugu, kodea aldatu gabe edonork erabili ahal izateko antzeko atazak egiteko. Etiketatzaileak ebaluatu ondoren, identifikatzeko erlaziorik errazena HELBURUA erlazioa dela ikusi dugu eta KAUSA eta ONDORIOA bereizteko arazo gehiago dauzkagula ere ondorioztatu dugu. Modu berean, EBALUAZIOA eta INTERPRETAZIOA ere elkarren artean bereiz dezakegula ikusi dugu.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Stack Overflow is a highly successful Community Question Answering (CQA) service for software developers with more than three millions users and more than ten thousand posts per day. The large volume of questions makes it difficult for users to find questions that they are interested in answering. In this paper, we propose a number of approaches to predict who will answer a new question using the characteristics of the question (i.e. Topic) and users (i.e. Reputation), and the social network of Stack Overflow users (i.e. Interested in the same topic). Specifically, our approach aims to identify a group of users (candidates) who have the potential to answer a new question by using feature-based prediction approach and social network based prediction approach. We develop predictive models to predict whether an identified candidate answers a new question. This prediction helps motivate the knowledge exchanging in the community by routing relevant questions to potential answerers. The evaluation results demonstrate the effectiveness of our predictive models, achieving 44% precision, 59% recall, and 49% F-measure (average across all test sets). In addition, our candidate identification techniques can identify the answerers who actually answer questions up to 12.8% (average across all test sets).

Relevância:

80.00% 80.00%

Publicador:

Resumo:

At NTCIR-9, we participated in the cross-lingual link discovery (Crosslink) task. In this paper we describe our approaches to discovering Chinese, Japanese, and Korean (CJK) cross-lingual links for English documents in Wikipedia. Our experimental results show that a link mining approach that mines the existing link structure for anchor probabilities and relies on the “translation” using cross-lingual document name triangulation performs very well. The evaluation shows encouraging results for our system.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2011 evaluation campaign, which consisted of a five active tracks: Books and Social Search, Data Centric, Question Answering, Relevance Feedback, and Snippet Retrieval. INEX 2011 saw a range of new tasks and tracks, such as Social Book Search, Faceted Search, Snippet Retrieval, and Tweet Contextualization.