3 results for Gramática Textual-Interativa
in Helda - Digital Repository of University of Helsinki
Abstract:
In this thesis we present and evaluate two pattern-matching-based methods for answer extraction in textual question answering systems. A textual question answering system is a system that seeks answers to natural language questions from unstructured text. Textual question answering is an important research problem because, as the amount of natural language text in digital format keeps growing, the need for novel methods for pinpointing important knowledge in vast textual databases becomes more and more urgent. We concentrate on developing methods for the automatic creation of answer extraction patterns. A new type of extraction pattern is also developed. The pattern-matching-based approach chosen is interesting because of its language and application independence. The answer extraction methods are developed in the framework of our own question answering system. Publicly available datasets in English are used as training and evaluation data for the methods. The techniques developed are based on the well-known methods of sequence alignment and hierarchical clustering. The similarity metric used is based on edit distance. The main conclusion of the research is that answer extraction patterns consisting of the most important words of the question and of the following information extracted from the answer context (plain words, part-of-speech tags, punctuation marks and capitalization patterns) can be used in the answer extraction module of a question answering system. This type of pattern and the two new methods for generating answer extraction patterns provide average results when compared to those produced by other systems using the same dataset. However, most answer extraction methods in the question answering systems tested with the same dataset are both hand-crafted and based on a system-specific and fine-grained question classification. The new methods developed in this thesis require no manual creation of answer extraction patterns. 
As a source of knowledge, they require a dataset of sample questions and answers, as well as a set of text documents that contain answers to most of the questions. The question classification used in the training data is a standard one and is already provided in the publicly available data.
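The abstract names edit distance as the basis of the similarity metric and hierarchical clustering as one of the underlying techniques. The following Python sketch illustrates those two ideas in general terms only. It is not the thesis's implementation: the token-level unit-cost distance, the length normalisation, the single-linkage merge rule, the `threshold` parameter, and the `<ANSWER>` and `<Q>` placeholder tokens are all assumptions made for illustration.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Token-level Levenshtein distance with unit costs (assumed cost model)."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Map the distance to [0, 1]; this normalisation is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def cluster(contexts: List[List[str]], threshold: float) -> List[List[List[str]]]:
    """Agglomerative (single-linkage) clustering of answer contexts.

    Repeatedly merges the two most similar clusters and stops once the
    best remaining pair falls below the hypothetical `threshold`.
    """
    clusters = [[c] for c in contexts]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = max(similarity(x, y)
                          for x in clusters[i] for y in clusters[j])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        sim, i, j = best
        if sim < threshold:
            break
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


if __name__ == "__main__":
    # Hypothetical answer contexts; <ANSWER> and <Q> are placeholder tokens.
    c1 = ["<ANSWER>", ",", "the", "capital", "of", "<Q>"]
    c2 = ["<ANSWER>", ",", "capital", "of", "<Q>"]
    c3 = ["<Q>", "was", "born", "in", "<ANSWER>"]
    for group in cluster([c1, c2, c3], threshold=0.5):
        print(group)
```

In this toy setting, contexts that differ by a single token end up in the same cluster, and each cluster could then be aligned and generalised into one extraction pattern. How the thesis actually performs that step is not stated in the abstract.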
Abstract:
This dissertation is a descriptive grammar of Ternate Chabacano, a Spanish-lexifier Creole spoken by 3,000 people in the town of Ternate, Philippines. The dissertation offers an analysis of the phonological, morphological, and syntactic system of the language. It includes an overview of the historical background, the current situation of the speech community and a collection of annotated texts. Ternate Chabacano shares many characteristics with its main adstrate language, Tagalog, as well as with the dialectal varieties of Spanish. At present, English also exerts an influence, though mainly on the lexicon. The description offered is based on fieldwork conducted in Ternate. Spoken language collected through thematic interviews is the main type of material analysed. Information regarding the informants and text types is included in the examples. Ternate Chabacano has a five-vowel system and 17 consonant phonemes. The morphology of the language is largely isolating. Clitics are used extensively for expressing adverbial relations. The verbal system is based on preverbal markers that express the categories of tense, modality and aspect, among which aspect is the main dimension. Complex predicates and verbal chains are used to further distinguish aspect and modality, as well as changes of voice and valency. Intransitive verbs express motion, states, and reflexive actions, even though the majority of verbs can occur in both intransitive and transitive clauses. Ternate Chabacano is a nominative-accusative language, but the typological configuration of the Philippine languages influences the marking of its constituents. A case in point is the nominal determination system. The basic constituent order in a clause is VSO. Equative and attributive clauses are formed by juxtaposition, while locative clauses feature a copula. Indefinite terms are expressed through existential constructions. 
The negation of existential clauses differs from standard negation, but both are intensified in the same way. In spoken discourse, tag questions are common. Pragmatic elements and social formulas largely reflect the corresponding Tagalog expressions. Coordination and subordination typically occur without overt markers, but a variety of markers exist for expressing different relations, especially those made explicit by adverbial clauses. Verbal chains form a continuum from serial verbs to complementation and ultimately to coordination.