2 results for Deverbal nouns
in University of Queensland eSpace - Australia
Abstract:
In this paper we explore the use of text-mining methods for identifying the author of a text. We apply the support vector machine (SVM) to this problem because it can cope with half a million inputs, requires no feature selection, and can process the frequency vector of all words of a text. We performed a number of experiments with texts from a German newspaper. With nearly perfect reliability the SVM was able to reject other authors, and it detected the target author in 60–80% of the cases. In a second experiment, we ignored nouns, verbs and adjectives and replaced them with grammatical tags and bigrams. This resulted in slightly reduced performance. Author detection with SVMs on full word forms was remarkably robust even if the author wrote about different topics.
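As a rough illustration of the "frequency vector of all words" mentioned in the abstract above, the following minimal sketch builds relative word-frequency vectors over a shared vocabulary. Everything in it is an assumption made for illustration: the function names `build_vocabulary` and `word_frequency_vector`, the lowercasing and regex tokenisation, and the two toy German sentences do not come from the paper. The SVM training step itself is not shown; vectors of this kind would be passed to an SVM implementation of one's choice.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenisation: lowercase, split on non-word characters.
    return re.findall(r"\w+", text.lower())


def build_vocabulary(texts):
    # Hypothetical helper: the sorted set of all word forms seen in any text.
    vocab = set()
    for text in texts:
        vocab.update(tokenize(text))
    return sorted(vocab)


def word_frequency_vector(text, vocabulary):
    # Relative frequency of each vocabulary word in the given text.
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty texts
    return [counts[word] / total for word in vocabulary]


# Toy data, invented for this sketch.
texts = [
    "Der Hund sieht die Katze",
    "Die Katze sieht den Hund nicht",
]
vocab = build_vocabulary(texts)
vectors = [word_frequency_vector(t, vocab) for t in texts]
```

No feature selection is applied here: every word form in the corpus gets its own dimension, which mirrors the abstract's point that the classifier is handed the full vector.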
Abstract:
Explanations of the difficulty of relative-clause sentences implicate complexity, but the measurement of complexity remains controversial. Four experiments investigated how far relational complexity (RC) theory, which has been found valid for cognitive development and human reasoning, accounts for the difficulty of 16 types of English object- and subject-extracted relative-clause constructions. RC corresponds to the number of nouns assigned to thematic roles in the same decision. Complexity estimates based on RC and those based on maximal integration cost (MIC) were strongly correlated and accounted for similar variance in sentence difficulty (subjective ratings, comprehension accuracy, reading times). Consistent with RC theory, sentences that required more than 4 role assignments in the same decision were extremely difficult for many participants. Performance on nonlinguistic relational tasks predicted comprehension of object-extracted sentences, before and after controlling for subject-extractions. Working memory tasks predicted comprehension of object-extractions before controlling for subject-extractions. The studies extend the RC approach to a linguistic domain.