2 results for Portuguese language Grammaticalization
at Universidad de Alicante
Abstract:
The large amount of text produced every day on the Web has made it one of the main sources of linguistic corpora, which are then analyzed with Natural Language Processing techniques. On a global scale, languages such as Portuguese (official in 9 countries) appear on the Web in several varieties, with lexical, morphological and syntactic differences, among others. In addition, a unified spelling system for Portuguese has recently been approved, and its implementation has already started in some countries. However, this process will take several years, so different varieties and spelling systems coexist. Since PoS-taggers for Portuguese are built for a particular variety, this work analyzes different combinations of training corpora and lexica, with the aim of building a model that annotates several varieties and spelling systems of the language with high precision. Moreover, this paper presents different dictionaries of the new orthography (Spelling Agreement), as well as a new freely available test corpus containing different varieties and textual typologies.
Abstract:
The reprise evidential conditional (REC) is uncommon in present-day Catalan: it is restricted to journalistic language and to some very formal genres (such as academic or legal language), and it does not occur in spontaneous discourse. On the one hand, it has been described as one of the relatively new modal values of the conditional. On the other hand, the normative tradition has tended to reject it as a Gallicism or to describe it as an unsuitable neologism. Data extracted from text corpora show, surprisingly, that the REC is attested in Catalan from the beginning of the fourteenth century to the contemporary age, with semantic and pragmatic nuances and different evidence of grammaticalization. Owing to the current interest in evidentiality, the REC has been widely studied in French, Italian and Portuguese, with a focus mainly on its contemporary uses and less on the diachronic process that could explain the origin of this value. In line with this research, which we began by studying the epistemic and evidential future in Catalan, our aim is to describe: a) the pragmatic context that could have been the starting point of the REC in the thirteenth century, before we find indisputable attestations of this use; b) the path of semantic change followed by the conditional from a ‘future in the past’ tense to the acquisition of epistemic and evidential values; and c) the role played by invited inferences, subjectification and intersubjectification in this change.