943 results for Question-answering systems
Abstract:
[EN] Measuring semantic similarity and relatedness between textual items (words, sentences, paragraphs or even documents) is a very important research area in Natural Language Processing (NLP). It has many practical applications in other NLP tasks, for instance Word Sense Disambiguation, Textual Entailment, Paraphrase Detection, Machine Translation and Summarization, and in related tasks such as Information Retrieval and Question Answering. In this master's thesis we study different approaches to computing the semantic similarity between textual items. In the framework of the European PATHS project, we also evaluate a knowledge-based method on a dataset of cultural item descriptions. Additionally, we describe the work carried out for the Semantic Textual Similarity (STS) shared task of SemEval-2012. This work involved supporting the creation of datasets for similarity tasks, as well as the organization of the task itself.
Abstract:
The STUDENT problem solving system, programmed in LISP, accepts as input a comfortable but restricted subset of English which can express a wide variety of algebra story problems. STUDENT finds the solution to a large class of these problems. STUDENT can utilize a store of global information not specific to any one problem, and may make assumptions about the interpretation of ambiguities in the wording of the problem being solved. If it uses such information or makes any assumptions, STUDENT communicates this fact to the user. The thesis includes a summary of other English language question-answering systems. All these systems, and STUDENT, are evaluated according to four standard criteria. The linguistic analysis in STUDENT is a first approximation to the analytic portion of a semantic theory of discourse outlined in the thesis. STUDENT finds the set of kernel sentences which are the base of the input discourse, and transforms this sequence of kernel sentences into a set of simultaneous equations which form the semantic base of the STUDENT system. STUDENT then tries to solve this set of equations for the values of the requested unknowns. If it is successful, it gives the answers in English. If not, STUDENT asks the user for more information, and indicates the nature of the desired information. The STUDENT system is a first step toward natural language communication with computers. Further work on the semantic theory proposed should result in much more sophisticated systems.
Abstract:
We compare the effect of different text segmentation strategies on speech-based passage retrieval of video. Passage retrieval has mainly been studied to improve document retrieval and to enable question answering. In these domains the best results were obtained using passages defined by the paragraph structure of the source documents or by using arbitrary overlapping passages. For the retrieval of relevant passages in a video using speech transcripts, no author-defined segmentation is available. We compare retrieval results from four different types of segments based on the speech channel of the video: fixed-length segments, a sliding window, semantically coherent segments and prosodic segments. We evaluated the methods on the corpus of the MediaEval 2011 Rich Speech Retrieval task. Our main conclusion is that the retrieval results depend strongly on the right choice of segment length. However, results using the segmentation into semantically coherent parts depend much less on the segment length. In particular, the quality of fixed-length and sliding-window segmentation drops quickly as the segment length increases, while the quality of the semantically coherent segments is much more stable. Thus, if coherent segments are defined, longer segments can be used and consequently fewer segments have to be considered at retrieval time.
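The two simplest strategies named in this abstract, fixed-length segments and a sliding window, can be illustrated with a minimal sketch. It assumes the speech transcript is already available as a list of word tokens; the function names and the `length` and `step` parameters are hypothetical and are not taken from the paper.

```python
from typing import List


def fixed_length_segments(tokens: List[str], length: int) -> List[List[str]]:
    """Split tokens into consecutive, non-overlapping segments of `length` tokens."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [tokens[i:i + length] for i in range(0, len(tokens), length)]


def sliding_window_segments(tokens: List[str], length: int, step: int) -> List[List[str]]:
    """Build windows of `length` tokens, advancing `step` tokens each time.

    Windows overlap when step < length. The last window may be shorter
    than `length` so that the end of the transcript is still covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    segments = []
    start = 0
    while start < len(tokens):
        segments.append(tokens[start:start + length])
        if start + length >= len(tokens):
            break  # this window already reaches the end of the transcript
        start += step
    return segments


if __name__ == "__main__":
    transcript = "a b c d e f g".split()
    print(fixed_length_segments(transcript, 3))
    print(sliding_window_segments(transcript, 4, 2))
```

Both functions expose the segment length as a parameter, which is the quantity the abstract identifies as decisive for retrieval quality. The semantically coherent and prosodic segmentations need additional models and are not sketched here.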
Abstract:
Scientific dissertation submitted to obtain the degree of Master in Informatics and Computer Engineering
Abstract:
This experimental study examined the effects of cooperative learning and a question-answering strategy called elaborative interrogation ("Why is this fact true?") on the learning of factual information about familiar animals. Retention gains were compared across four study conditions: elaborative-interrogation-plus-cooperative-learning, cooperative-learning, elaborative-interrogation, and reading-control. Sixth-grade students (n=68) were randomly assigned to the four conditions. All participants were given initial training and practice in cooperative learning procedures via three 45-minute sessions. After studying 36 facts about six animals, students' retention gains were measured via immediate free recall, immediate matched association, 30-day, and 60-day matched association tests. A priori comparisons were made to analyze the data. For immediate free recall and immediate matched association, significant differences were found between students in the three experimental conditions and those in the control condition. Elaborative-interrogation and elaborative-interrogation-plus-cooperative-learning also promoted long-term retention (measured via 30-day matched association) of the material relative to repetitive reading, with elaborative-interrogation promoting the most durable gains (measured via 60-day matched association). The relationship between the types of elaborative responses and the probability of subsequent retention was also examined. Even when students were unable to provide adequate answers to the why questions, learning was facilitated more than by repetitive reading. In general, generation of adequate elaborations was associated with a greater probability of recall than was provision of inadequate answers. The findings of the study demonstrate that cooperative learning and the use of elaborative interrogation, both individually and collaboratively, are effective classroom procedures for facilitating children's learning of new information.
Abstract:
This study examined the efficacy of providing four Grade 7 and 8 students with reading difficulties with explicit instruction in the use of reading comprehension strategies while using text-reader software. Specifically, the study explored participants' combined use of a text-reader and a question-answering comprehension strategy during a 6-week instructional program. Using a qualitative case study methodology, participants' experiences using text-reader software, with the presence of explicit instruction in evidence-based reading comprehension strategies, were examined. The study involved three phases: (a) the first phase consisted of individual interviews with the participants and their parents; (b) the second phase consisted of a nine-session course; and (c) the third phase consisted of individual exit interviews and a focus group discussion. After the data collection phases were completed, data were analyzed and coded for emerging themes, with quantitative measures of participants' reading performance used as descriptive data. The data suggested that assistive technology can serve as an instructional "hook", motivating students to engage actively in the reading processes, especially when accompanied by explicit strategy instruction. Participants' experiences also reflected development of strategy use and use of text-reader software, and the importance of social interactions in developing reading comprehension skills. The findings of this study support the view that the integration of instruction using evidence-based practices is an important and vital component in the inclusion of text-reader software as part of students' educational programming. Also, the findings from this study can be extended to develop in-class programming for students using text-reader software.
Abstract:
This work aims to support investors in Real Estate Investment Funds (Fundos de Investimento Imobiliário, FIIs) in choosing a portfolio of FIIs, with a view to achieving performance equal to or better than the sector's benchmark index (IFIX). This support consists, initially, of a methodology which holds that the concept of the Efficient Portfolio (risk/return) advocated by Markowitz can work together with the concept of Behavioral Finance, led by Daniel Kahneman, forming the basis for guiding the investor. We complement this methodological path with the recommendations suggested by Bazerman and Moore for a decision-making process that reduces the effects of heuristics and biases.
Abstract:
This paper introduces a semantic language developed for use in a semantic analyzer based on linguistic and world knowledge. Linguistic knowledge is provided by a Combinatorial Dictionary and several sets of rules. Extra-linguistic information is stored in an Ontology. The meaning of the text is represented by means of a series of RDF-type triples of the form predicate (subject, object). The semantic analyzer is one of the options of the multifunctional ETAP-3 linguistic processor. The analyzer can be used for Information Extraction and Question Answering. We describe the semantic representation of expressions that provide an assessment of the number of objects involved and/or give a quantitative evaluation of different types of attributes. We focus on the following aspects: 1) parametric and non-parametric attributes; 2) gradable and non-gradable attributes; 3) ontological representation of different classes of attributes; 4) absolute and relative quantitative assessment; 5) punctual and interval quantitative assessment; 6) intervals with precise and fuzzy boundaries.
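The triple form predicate (subject, object) mentioned in this abstract can be illustrated with a small data structure. This is only a sketch: the predicate and entity names (`hasHeight`, `Tower1`, `Approximate` and so on) are invented for illustration and are not the vocabulary of the ETAP-3 ontology, and the `match` helper is a hypothetical lookup, not part of the analyzer.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement in predicate(subject, object) form."""
    predicate: str
    subject: str
    obj: str

    def __str__(self) -> str:
        return f"{self.predicate}({self.subject}, {self.obj})"


def match(triples: List[Triple],
          predicate: Optional[str] = None,
          subject: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every field that is not None."""
    return [t for t in triples
            if (predicate is None or t.predicate == predicate)
            and (subject is None or t.subject == subject)
            and (obj is None or t.obj == obj)]


# Hypothetical representation of "The tower is about 300 metres high":
# a parametric attribute with an approximate (fuzzy) quantitative value.
facts = [
    Triple("hasHeight", "Tower1", "Height1"),
    Triple("hasValue", "Height1", "300"),
    Triple("hasUnit", "Height1", "Metre"),
    Triple("hasPrecision", "Height1", "Approximate"),
]

if __name__ == "__main__":
    for fact in match(facts, subject="Height1"):
        print(fact)
```

Looking up all triples about one subject, as in the usage lines, is the kind of pattern matching a question-answering component could build on.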
Abstract:
The present is marked by the availability of large volumes of heterogeneous data, whose management is extremely complex. While the treatment of factual data has been widely studied, the processing of subjective information still poses important challenges. This is especially true in tasks that combine Opinion Analysis with other challenges, such as the ones related to Question Answering. In this paper, we describe the different approaches we employed in the NTCIR 8 MOAT monolingual English (opinionatedness, relevance, answerness and polarity) and cross-lingual English-Chinese tasks, implemented in our OpAL system. The results obtained when using different settings of the system, as well as the error analysis performed after the competition, offered us some clear insights into the best combination of techniques for balancing precision and recall. Contrary to our initial intuitions, we have also seen that the inclusion of specialized Natural Language Processing tools dealing with Temporality or Anaphora Resolution lowers the system performance, while the use of topic detection techniques using faceted search with Wikipedia and Latent Semantic Analysis leads to satisfactory system performance in both the monolingual and the multilingual settings.
Abstract:
This paper presents a preliminary study in which Machine Learning experiments applied to Opinion Mining in blogs have been carried out. We created and annotated a blog corpus in Spanish using EmotiBlog. We evaluated the utility of the labelled features, first by carrying out experiments with combinations of them and second by using feature selection techniques. We also deal with several problems, such as the noisy character of the input texts, the small size of the training set, the granularity of the annotation scheme and the language under study, Spanish, which has fewer resources than English. We obtained promising results considering that this is a preliminary study.
Abstract:
In this paper we propose algorithms for combining and ranking answers from distributed heterogeneous data sources in the context of a multi-ontology Question Answering task. Our proposal includes a merging algorithm that aggregates, combines and filters ontology-based search results, and three different ranking algorithms that sort the final answers according to different criteria such as popularity, confidence and semantic interpretation of results. An experimental evaluation on a large-scale corpus indicates improvements in the quality of the search results with respect to a scenario where the merging and ranking algorithms were not applied. These collective methods for merging and ranking make it possible to answer questions that are distributed across ontologies, while at the same time they can filter irrelevant answers, fuse similar answers together, and elicit the most accurate answer(s) to a question.
Abstract:
While semantic search technologies have been proven to work well in specific domains, they still have to confront two main challenges to scale up to the Web in its entirety. In this work we address this issue with a novel semantic search system that a) provides the user with the capability to query Semantic Web information using natural language, by means of an ontology-based Question Answering (QA) system [14] and b) complements the specific answers retrieved during the QA process with a ranked list of documents from the Web [3]. Our results show that ontology-based semantic search capabilities can be used to complement and enhance keyword search technologies.
Abstract:
In this paper, we explore the idea of social role theory (SRT) and propose a novel regularized topic model which incorporates SRT into the generative process of social media content. We assume that a user can play multiple social roles, and each social role serves to fulfil different duties and is associated with a role-driven distribution over latent topics. In particular, we focus on social roles corresponding to the most common social activities on social networks. Our model is instantiated on microblogs (i.e., Twitter) and community question-answering (cQA) sites (i.e., Yahoo! Answers), where social roles on Twitter include "originators" and "propagators", and roles on cQA are "askers" and "answerers". Both explicit and implicit interactions between users are taken into account and modeled as regularization factors. To evaluate the performance of our proposed method, we have conducted extensive experiments on two Twitter datasets and two cQA datasets. Furthermore, we also consider multi-role modeling for scientific papers, where an author's research expertise area is considered as a social role. We also present a novel application that detects users' research interests through topical keyword labeling based on the results of our multi-role model. The evaluation results show the feasibility and effectiveness of our model.
Abstract:
SMS (Short Message Service) is now a hugely popular and very powerful business communication technology for mobile phones. In order to respond correctly to a free-form factual question given a large collection of texts, one needs to understand the question at a level that allows determining some of the constraints the question imposes on a possible answer. These constraints may include a semantic classification of the sought-after answer and may even suggest using different strategies when looking for and verifying a candidate answer. In this paper we focus on various attempts to overcome the major contradiction between the technical limitations of the SMS standard and the large amount of information found for a possible answer.
Abstract:
Mobile advertising is a rapidly growing sector providing brands and marketing agencies the opportunity to connect with consumers beyond traditional and digital media and instead communicate directly on their mobile phones. Mobile advertising will be intrinsically linked with mobile search, which has moved from the internet to mobile devices and is identified as an area of potential growth. The results of mobile searches show that, as a general rule, such search results exceed 160 characters, so a dialog is required to deliver the relevant portion of a response to the mobile user. In this paper we focus initially on mobile search and mobile advert creation, and later on the mechanism of interaction between the user's request, the search results, advertising and the dialog.
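The 160-character constraint mentioned in this abstract can be made concrete with a minimal sketch that splits a long search result into message-sized parts at word boundaries. The function name `split_for_sms` is hypothetical, and the sketch only illustrates the size limit: it does not model the dialog mechanism the paper describes, nor the details of SMS character encodings.

```python
import textwrap
from typing import List

SMS_LIMIT = 160  # character limit cited in the abstract


def split_for_sms(text: str, limit: int = SMS_LIMIT) -> List[str]:
    """Split text into parts of at most `limit` characters.

    Breaks fall on whitespace where possible; a single word longer
    than `limit` is broken across parts.
    """
    return textwrap.wrap(text, width=limit)


if __name__ == "__main__":
    result = " ".join(["word"] * 100)  # stand-in for a long search result
    for number, part in enumerate(split_for_sms(result), start=1):
        print(number, len(part))
```

A dialog-based system would presumably send only the most relevant part first and offer the rest on request, rather than delivering every part.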