3 resultados para answer errors

em Helda - Digital Repository of University of Helsinki


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this thesis we present and evaluate two pattern matching based methods for answer extraction in textual question answering systems. A textual question answering system is a system that seeks answers to natural language questions from unstructured text. Textual question answering systems are an important research problem because as the amount of natural language text in digital format grows all the time, the need for novel methods for pinpointing important knowledge from the vast textual databases becomes more and more urgent. We concentrate on developing methods for the automatic creation of answer extraction patterns. A new type of extraction pattern is developed also. The pattern matching based approach chosen is interesting because of its language and application independence. The answer extraction methods are developed in the framework of our own question answering system. Publicly available datasets in English are used as training and evaluation data for the methods. The techniques developed are based on the well known methods of sequence alignment and hierarchical clustering. The similarity metric used is based on edit distance. The main conclusions of the research are that answer extraction patterns consisting of the most important words of the question and of the following information extracted from the answer context: plain words, part-of-speech tags, punctuation marks and capitalization patterns, can be used in the answer extraction module of a question answering system. This type of patterns and the two new methods for generating answer extraction patterns provide average results when compared to those produced by other systems using the same dataset. However, most answer extraction methods in the question answering systems tested with the same dataset are both hand crafted and based on a system-specific and fine-grained question classification. The the new methods developed in this thesis require no manual creation of answer extraction patterns. As a source of knowledge, they require a dataset of sample questions and answers, as well as a set of text documents that contain answers to most of the questions. The question classification used in the training data is a standard one and provided already in the publicly available data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis studies empirically whether measurement errors in aggregate production statistics affect sentiment and future output. Initial announcements of aggregate production are subject to measurement error, because many of the data required to compile the statistics are produced with a lag. This measurement error can be gauged as the difference between the latest revised statistic and its initial announcement. Assuming aggregate production statistics help forecast future aggregate production, these measurement errors are expected to affect macroeconomic forecasts. Assuming agents’ macroeconomic forecasts affect their production choices, these measurement errors should affect future output through sentiment. This thesis is primarily empirical, so the theoretical basis, strategic complementarity, is discussed quite briefly. However, it is a model in which higher aggregate production increases each agent’s incentive to produce. In this circumstance a statistical announcement which suggests aggregate production is high would increase each agent’s incentive to produce, thus resulting in higher aggregate production. In this way the existence of strategic complementarity provides the theoretical basis for output fluctuations caused by measurement mistakes in aggregate production statistics. Previous empirical studies suggest that measurement errors in gross national product affect future aggregate production in the United States. Additionally it has been demonstrated that measurement errors in the Index of Leading Indicators affect forecasts by professional economists as well as future industrial production in the United States. This thesis aims to verify the applicability of these findings to other countries, as well as study the link between measurement errors in gross domestic product and sentiment. This thesis explores the relationship between measurement errors in gross domestic production and sentiment and future output. Professional forecasts and consumer sentiment in the United States and Finland, as well as producer sentiment in Finland, are used as the measures of sentiment. Using statistical techniques it is found that measurement errors in gross domestic product affect forecasts and producer sentiment. The effect on consumer sentiment is ambiguous. The relationship between measurement errors and future output is explored using data from Finland, United States, United Kingdom, New Zealand and Sweden. It is found that measurement errors have affected aggregate production or investment in Finland, United States, United Kingdom and Sweden. Specifically, it was found that overly optimistic statistics announcements are associated with higher output and vice versa.