10 results for Lexical variation
in Aston University Research Archive
Abstract:
Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations, but cannot capture exceptions correctly; neither of them provides any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical validity requirement, and overgeneration, minimised by rule reformulation and restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established. The resistance of prefixations to segmentation has been addressed by identifying linking vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data, collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than the wordnet component of the model because the lexicon provides the optimal clustering of word senses. Both links and analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. Failure of all experiments to outperform disambiguation by frequency reflects on WordNet sense distinctions.
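For illustration, the core idea of a suffix rule expressed as a character substitution applied under a lexical validity check, rather than as plain segmentation, can be sketched as follows; the toy lexicon, rule set, and relation label below are invented for this sketch and are not taken from the thesis.

```python
# Minimal sketch: a suffix rule as a character substitution with a lexical
# validity check, instead of naive segmentation.  Toy data only.

LEXICON = {"happy", "happiness", "lazy", "laziness", "busy", "business"}

# Each rule: (suffix of the derived form, suffix of the base form, relation label)
RULES = [
    ("iness", "y", "ATTRIBUTE"),   # happiness -> happy, laziness -> lazy
]

def derive_links(word, lexicon=LEXICON, rules=RULES):
    """Yield (base, derived, relation) links licensed by substitution rules,
    subject to a lexical validity requirement on the reconstructed base."""
    for derived_suffix, base_suffix, relation in rules:
        if word.endswith(derived_suffix):
            base = word[: -len(derived_suffix)] + base_suffix
            if base in lexicon:            # lexical validity requirement
                yield (base, word, relation)

print(list(derive_links("happiness")))  # [('happy', 'happiness', 'ATTRIBUTE')]
print(list(derive_links("business")))   # [('busy', 'business', 'ATTRIBUTE')]
```

Naive segmentation would leave the non-word stem "happi", whereas the substitution recovers "happy"; the semantically opaque link from "business" to "busy" illustrates the kind of case that an iteratively built stoplist, or a lexicographic judgement, might exclude.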
Abstract:
In a group of adult dyslexics, word reading and, especially, word spelling are predicted more by what we have called lexical learning (tapped by a paired-associate task with pictures and written nonwords) than by phonological skills. Nonword reading and spelling, by contrast, are not associated with this task but are predicted by phonological tasks. Consistent with this, surface and phonological dyslexics show opposite profiles on lexical learning and phonological tasks. The phonological dyslexics are more impaired on the phonological tasks, while the surface dyslexics are equally or more impaired on the lexical learning tasks. Finally, orthographic lexical learning explains more variation in spelling than in reading, and subtyping based on spelling returns more interpretable results than that based on reading. These results suggest that the quality of lexical representations is crucial to adult literacy skills. This is best measured by spelling and best predicted by a task of lexical learning. We hypothesize that lexical learning taps a uniquely human capacity to form new representations by recombining the units of a restricted set.
Abstract:
We investigated the ability to learn new words in a group of 22 adults with developmental dyslexia/dysgraphia and the relationship between their learning and spelling problems. We identified a deficit that affected the ability to learn both spoken and written new words (lexical learning deficit). There were no comparable problems in learning other kinds of representations (lexical/semantic and visual) and the deficit could not be explained in terms of more traditional phonological deficits associated with dyslexia (phonological awareness, phonological STM). Written new word learning accounted for further variance in the severity of the dysgraphia after phonological abilities had been partialled out. We suggest that lexical learning may be an independent ability needed to create lexical/formal representations from a series of independent units. Theoretical and clinical implications are discussed. © 2005 Psychology Press Ltd.
Abstract:
This paper is a progress report on a research path I first outlined in my contribution to “Words in Context: A Tribute to John Sinclair on his Retirement” (Heffer and Sauntson, 2000). Therefore, I first summarize that paper here, in order to provide the relevant background. The second half of the current paper consists of some further manual analyses, exploring various parameters and procedures that might assist in the design of an automated computational process for the identification of lexical sets. The automation itself is beyond the scope of the current paper.
Abstract:
A description, based on English political texts, of the difference between lexical meaning and meaning in context.
Phonological–lexical activation: a lexical component or an output buffer? Evidence from aphasic errors
Abstract:
Single word production requires that phoneme activation is maintained while articulatory conversion is taking place. Word serial recall, connected speech and non-word production (repetition and spelling) are all assumed to involve a phonological output buffer. A crucial question is whether the same memory resources are also involved in single word production. We investigate this question by assessing length and positional effects in the single word repetition and reading of six aphasic patients. We expect a damaged buffer to result in error rates per phoneme which increase with word length and in position effects. Although our patients had trouble with phoneme activation (they made mainly errors of phoneme selection), they did not show the effects expected from a buffer impairment. These results show that phoneme activation cannot be automatically equated with a buffer. We hypothesize that the phonemes of existing words are kept active through permanent links to the word node. Thus, the sustained activation needed for their articulation will come from the lexicon and will have different characteristics from the activation needed for the short-term retention of an unbound set of units. We conclude that there is no need and no evidence for a phonological buffer in single word production.
Abstract:
This paper presents a statistical comparison of regional phonetic and lexical variation in American English. Both the phonetic and lexical datasets were first subjected to separate multivariate spatial analyses in order to identify the most common dimensions of spatial clustering in these two datasets. The dimensions of phonetic and lexical variation extracted by these two analyses were then correlated with each other, after being interpolated over a shared set of reference locations, in order to measure the similarity of regional phonetic and lexical variation in American English. This analysis shows that regional phonetic and lexical variation are remarkably similar in Modern American English.
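As a rough sketch of the pipeline this abstract describes, the steps might be approximated as below; the random data, the use of PCA as the multivariate step, and the grid of reference points are illustrative assumptions rather than the paper's exact procedure.

```python
# Hedged sketch: extract a spatial dimension from each dataset, interpolate both
# onto shared reference locations, then correlate them.  Toy data throughout.
import numpy as np
from sklearn.decomposition import PCA
from scipy.interpolate import griddata
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Toy stand-ins: measurement locations (lon, lat) and feature matrices.
phon_sites = rng.uniform(-10, 10, size=(200, 2))
lex_sites  = rng.uniform(-10, 10, size=(150, 2))
phon_feats = rng.normal(size=(200, 30))   # e.g. phonetic measures per site
lex_feats  = rng.normal(size=(150, 50))   # e.g. lexical variant frequencies per site

# Step 1: separate multivariate analyses -> leading spatial dimension of each dataset.
phon_dim = PCA(n_components=1).fit_transform(phon_feats).ravel()
lex_dim  = PCA(n_components=1).fit_transform(lex_feats).ravel()

# Step 2: interpolate both dimensions over a shared set of reference locations.
ref = np.column_stack([g.ravel() for g in np.meshgrid(
    np.linspace(-9, 9, 25), np.linspace(-9, 9, 25))])
phon_on_ref = griddata(phon_sites, phon_dim, ref, method="linear")
lex_on_ref  = griddata(lex_sites, lex_dim, ref, method="linear")

# Step 3: correlate the interpolated dimensions where both are defined.
mask = ~np.isnan(phon_on_ref) & ~np.isnan(lex_on_ref)
r, p = pearsonr(phon_on_ref[mask], lex_on_ref[mask])
print(f"correlation between phonetic and lexical dimensions: r={r:.2f} (p={p:.3g})")
```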
Abstract:
We present three jargonaphasic patients who made phonological errors in naming, repetition and reading. We analyse target/response overlap using statistical models to answer three questions: 1) Is there a single phonological source for errors or two sources, one for target-related errors and a separate source for abstruse errors? 2) Can correct responses be predicted by the same distribution used to predict errors or do they show a completion boost (CB)? 3) Is non-lexical and lexical information summed during reading and repetition? The answers were clear. 1) Abstruse errors did not require a separate distribution created by failure to access word forms. Abstruse and target-related errors were the endpoints of a single overlap distribution. 2) Correct responses required a special factor, e.g., a CB or lexical/phonological feedback, to preserve their integrity. 3) Reading and repetition required separate lexical and non-lexical contributions that were combined at output.
Abstract:
This paper introduces a quantitative method for identifying newly emerging word forms in large time-stamped corpora of natural language and then describes an analysis of lexical emergence in American social media using this method, based on a multi-billion-word corpus of Tweets collected between October 2013 and November 2014. In total, 29 emerging word forms, which represent various semantic classes, grammatical parts of speech, and word formation processes, were identified through this analysis. These 29 forms are then examined from various perspectives in order to begin to better understand the process of lexical emergence.
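A minimal sketch of how such a method might flag emerging forms is given below; the monthly relative-frequency thresholds and the input data format are assumptions made for illustration, not the paper's actual criteria.

```python
# Sketch: flag word forms that are (near) absent early in a time-stamped corpus
# but frequent at the end of the period.  Thresholds and data format are invented.
from collections import Counter, defaultdict

def monthly_relative_freqs(docs):
    """docs: iterable of (month_label, list_of_tokens) pairs.
    Returns {month: {word: relative frequency}}."""
    counts, totals = defaultdict(Counter), Counter()
    for month, tokens in docs:
        counts[month].update(tokens)
        totals[month] += len(tokens)
    return {m: {w: c / totals[m] for w, c in counts[m].items()} for m in counts}

def emerging_forms(freqs, early_months, late_months,
                   early_max=1e-7, late_min=1e-5):
    """Return forms rare (or unattested) in the early months but common late on."""
    def mean(values):
        return sum(values) / len(values)
    vocab = set().union(*(freqs[m].keys() for m in late_months))
    out = []
    for w in vocab:
        early = mean([freqs[m].get(w, 0.0) for m in early_months])
        late = mean([freqs[m].get(w, 0.0) for m in late_months])
        if early <= early_max and late >= late_min:
            out.append(w)
    return sorted(out)
```

A usage pattern would be to stream one (month, tokens) pair per tweet into monthly_relative_freqs and then compare, say, the first and last three months of the collection period with emerging_forms.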