23 results for corpora allata
Abstract:
Topic 5. Annotation of literary corpora. XML. The TEI standard.
Abstract:
Quick guide to corpus analysis (with AntConc).
Abstract:
This paper proposes a new feature representation method based on the construction of a Confidence Matrix (CM). This representation consists of posterior probability values provided by several weak classifiers, each one trained and applied on a different set of features from the original sample. The CM spares the final classifier from having to discover the underlying groups of features itself. In this work the CM is applied to isolated character image recognition, for which several sets of features can be extracted from each sample. Experimentation has shown that using the CM yields a significant improvement in accuracy in most cases, while accuracy in the remaining cases is unchanged. The results were obtained by experimenting with four well-known corpora, using evolved meta-classifiers with the k-Nearest Neighbor rule as a weak classifier, and by applying statistical significance tests.
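The construction described in this abstract can be illustrated with a minimal sketch. It assumes one k-Nearest Neighbor weak classifier per feature set, with the posterior for each class estimated as the fraction of the k nearest neighbors carrying that label. The function names, the toy data, and the flattening step are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def knn_posteriors(train_x, train_y, query, classes, k=3):
    """Estimate P(class | query) as the share of the k nearest neighbors per class."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return [votes[c] / k for c in classes]


def confidence_matrix(train_views, train_y, query_views, classes, k=3):
    """Build one row per feature set (weak classifier) and one column per class."""
    return [
        knn_posteriors(view_x, train_y, view_q, classes, k)
        for view_x, view_q in zip(train_views, query_views)
    ]


# Hypothetical toy data: two feature sets extracted from the same four samples.
classes = ["a", "b"]
labels = ["a", "a", "b", "b"]
train_views = [
    [(0.0,), (0.1,), (1.0,), (1.1,)],                   # feature set 0
    [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.2)],   # feature set 1
]
query_views = [(0.05,), (0.1, 0.0)]

cm = confidence_matrix(train_views, labels, query_views, classes, k=3)

# Assumption: the final classifier receives the matrix flattened into one vector.
cm_features = [p for row in cm for p in row]
```

Each row of `cm` sums to 1, so the final classifier sees per-class confidences rather than the raw features of each group.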
Abstract:
This article presents a sample of the results of a detailed analysis of idioms, collocations and other phraseological elements, and of significant word order, with a view to characterizing the stock of literary language of Joan Roís de Corella. The analysis follows an interdisciplinary methodology grounded in corpus linguistics and linguistic diachrony, supported by information and communication technologies (digital humanities). These are applied to the lexical and stylistic contribution of a key author such as Roís de Corella in order to gauge how far his literary language is in tune with that of his contemporaries and, at the same time, how specific it is: to what degree it coincides with that of other great cultural classics of the Crown of Aragon, and on what Roís de Corella bases the key to his stylistic mastery.
Abstract:
The study of neology is inseparable from the study of language change and, therefore, from diachrony. Here we set out to describe the process of semantic change undergone by the verb esmar, the inherited form of Latin *adaestimare, parallel to the learned form estimar. This research draws on textual corpora and other manually excerpted materials. On these materials we have tested the analysis of subjectification and of inferences proposed by the Invited Inferencing Theory of Semantic Change (IITSC; TIICS in the Catalan original).
Abstract:
The reprise evidential conditional (REC) is nowadays not very common in Catalan: it is restricted to journalistic language and to some very formal genres (such as academic or legal language), and it is absent from spontaneous discourse. On the one hand, it has been described among the rather new modality values of the conditional. On the other, the normative tradition has tended to reject it as a gallicism, or to describe it as an unsuitable neologism. Thanks to extraction from text corpora, we surprisingly find this REC in Catalan from the beginning of the fourteenth century to the contemporary age, with semantic and pragmatic nuances and varying evidence of grammaticalization. Owing to the current interest in evidentiality, the REC has been widely studied in French, Italian and Portuguese, with a focus mainly on its contemporary uses and less on the diachronic process that could explain the origin of this value. In line with this research, which we began by studying the epistemic and evidential future in Catalan, our aim is to describe: a) the pragmatic context that could have been the starting point of the REC in the thirteenth century, before we find indisputable attestations of this use; b) the path of semantic change followed by the conditional from a ‘future in the past’ tense to the acquisition of epistemic and evidential values; and c) the role played by invited inferences, subjectification and intersubjectification in this change.
Abstract:
It is almost 20 years since a series of conferences known as CULT (Corpus Use and Learning to Translate) started. The first and second took place in Bertinoro, Italy, back in 1997 and 2000, respectively. The third was held in 2004 in Barcelona, and the fourth in 2015 in Alicante. Each was organized by a few enthusiastic lecturers and scholars who also happened to be corpus lovers. Guy Aston, Silvia Bernardini, Dominic Stewart and Federico Zanettin, from the Università di Bologna; Allison Beeby, Patricia Rodríguez-Inés and Pilar Sánchez-Gijón, from the Universitat Autònoma de Barcelona; and Daniel Gallego-Hernández, from the Universidad de Alicante, organized CULT conferences in the belief that spreading the word about the usefulness of corpora for teaching and professional translation purposes would have positive results.
Abstract:
Statistical machine translation (SMT) is an approach to Machine Translation (MT) that uses statistical models whose parameter estimation is based on the analysis of existing human translations (contained in bilingual corpora). From a translation student’s standpoint, this dissertation aims to explain how a phrase-based SMT system works, to determine the role of the statistical models it uses in the translation process, and to assess the quality of the translations produced when the system is trained with good-quality in-domain corpora. To that end, a phrase-based SMT system based on Moses has been trained and subsequently used for the English-to-Spanish translation of two texts related in topic to the training data. Finally, the quality of the output texts produced by the system has been assessed through a quantitative evaluation carried out with three different automatic evaluation measures and a qualitative evaluation based on the Multidimensional Quality Metrics (MQM).
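To make the idea of estimating parameters from existing human translations concrete, the sketch below computes phrase translation probabilities by relative frequency, p(target | source) = count(source, target) / count(source). This is a common estimate in phrase-based SMT, but the abstract does not describe the dissertation's actual models or data. The function name and the toy phrase pairs are hypothetical, and a real system combines several such models with a language model.

```python
from collections import Counter, defaultdict


def phrase_translation_probs(phrase_pairs):
    """Relative-frequency estimate of p(target | source) from extracted phrase pairs."""
    pair_counts = Counter(phrase_pairs)
    source_counts = Counter(src for src, _ in phrase_pairs)
    table = defaultdict(dict)
    for (src, tgt), n in pair_counts.items():
        table[src][tgt] = n / source_counts[src]
    return dict(table)


# Hypothetical English-Spanish phrase pairs, as if extracted from a bilingual corpus.
pairs = [
    ("the house", "la casa"),
    ("the house", "la casa"),
    ("the house", "el hogar"),
    ("green", "verde"),
]

table = phrase_translation_probs(pairs)
best = max(table["the house"], key=table["the house"].get)
```

Here `best` is the most frequent Spanish rendering of "the house" in the toy data. With in-domain training corpora, estimates of this kind reflect the terminology of the domain, which is one reason corpus quality matters for the output.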