English and the Internet, both separately and together, are now widely acknowledged as powerful forces that influence the lexico-grammatical characteristics of other languages worldwide. Indeed, many authors, such as Crystal (2004), have pointed out the emergence of so-called Netspeak, that is, the language used on the Net or World Wide Web; in Crystal’s own words (2004: 19), ‘a type of language displaying features that are unique to the Internet […] arising out of its character as a medium which is electronic, global and interactive’. This ‘language’, however, may be understood in two ways: either as an adaptation of English proper to Internet requirements and purposes, or as a new, rapidly changing language resulting from the rapid adaptation to Internet requirements of almost all world languages, for which English is a trendsetter. If the second, and probably more plausible, interpretation is adopted, ‘Netspeak’ has three salient features: (a) the rapid spread of all its new linguistic developments thanks to the Internet itself, which may lead to the generalization and widespread acceptance of new words, coinages, or meanings hundreds of times faster than was the case with print media; (b) as noted above, the visible influence of English, the most prevalent language on the Internet; and, consequently, (c) a tendency to reduce the ‘distance’ between English and other languages, as well as the unfamiliarity of speakers of other languages with English, since the ‘Netspeak’ version of those languages adopts grammatical, syntactic and lexical features of English. Linguistic differences may even disappear when code-switching and/or borrowing occurs, as whole fragments of English appear in other-language contexts.
As a consequence of this new situation, an ideal context emerges for interlanguage or multilingual word formation to thrive: puns, blends, compounds and lexical creativity in general find on the Web the ideal place to gain rapid acceptance worldwide, whether through fashion, coincidence, or the sheer merit of the new linguistic proposals.


The great amount of text produced every day on the Web has turned it into one of the main sources for obtaining linguistic corpora, which are then analyzed with Natural Language Processing techniques. On a global scale, languages such as Portuguese (official in 9 countries) appear on the Web in several varieties, with lexical, morphological and syntactic differences, among others. In addition, a unified spelling system for Portuguese has recently been approved, and its implementation has already started in some countries. However, the process will take several years, so different varieties and spelling systems coexist. Since PoS-taggers for Portuguese are built for a particular variety, this work analyzes different combinations of training corpora and lexica aimed at building a model that annotates several varieties and spelling systems of the language with high precision. Moreover, this paper presents different dictionaries of the new orthography (Spelling Agreement) as well as a new freely available test corpus, containing different varieties and text types.
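
The idea of combining lexica from several varieties and spelling systems can be illustrated with a minimal sketch. It is not the authors' method: the function name `merge_lexica`, the toy entries and the tag labels are all assumptions made for illustration. It merges word-form-to-tag dictionaries so that a tagger lexicon covers both older and newer spellings.

```python
from typing import Dict, Set

Lexicon = Dict[str, Set[str]]


def merge_lexica(*lexica: Lexicon) -> Lexicon:
    """Union several form -> PoS-tag lexica into one (hypothetical helper)."""
    merged: Lexicon = {}
    for lexicon in lexica:
        for form, tags in lexicon.items():
            # Lowercase the form so that case variants share one entry.
            merged.setdefault(form.lower(), set()).update(tags)
    return merged


# Toy entries chosen for illustration only; a real lexicon would be far larger.
older_spelling = {"acção": {"NOUN"}, "óptimo": {"ADJ"}}
newer_spelling = {"ação": {"NOUN"}, "ótimo": {"ADJ"}}

combined = merge_lexica(older_spelling, newer_spelling)
print(sorted(combined))
```

A lexicon merged this way accepts either spelling at lookup time, which is one simple way to read the phrase "lexica combinations"; the abstract itself does not specify how the combinations were built.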


This paper addresses the automatic recognition and classification of temporal expressions and events in human language. Effectiveness in these tasks is crucial if the broader task of temporal information processing is to be performed successfully. We analyze whether applying semantic knowledge to these tasks improves the performance of current approaches. To this end, we present and evaluate a data-driven approach implemented as part of a system, TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches, which are based principally on morphosyntax. The results obtained for English show that semantic knowledge aids temporal expression and event recognition, achieving error reductions of 59% and 21% respectively, while its contribution to classification is limited. From the analysis of the results it may be concluded that applying semantic knowledge leads to more general models and aids the recognition of temporal entities that are ambiguous at shallower levels of language analysis. We also found that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish, and the results show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.
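
The abstract does not define how its error reduction is computed. One common reading is relative error reduction, where error is taken as one minus a score such as F1. The sketch below works under that assumption; the function name and the example scores are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_score: float, improved_score: float) -> float:
    """Fraction of the baseline error removed by the improved system.

    Scores are assumed to lie in [0, 1] (e.g. F1); error is 1 - score.
    """
    baseline_error = 1.0 - baseline_score
    improved_error = 1.0 - improved_score
    if baseline_error <= 0:
        raise ValueError("the baseline has no error left to reduce")
    return (baseline_error - improved_error) / baseline_error


# Hypothetical scores for illustration: moving from 0.80 to 0.90
# removes half of the remaining error.
print(round(relative_error_reduction(0.80, 0.90), 2))
```

Under this reading, a large relative reduction can correspond to a modest absolute gain when the baseline is already strong, which is worth keeping in mind when comparing the two reported figures.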


One of the main challenges to be addressed in text summarization is the detection of redundant information. This paper presents a detailed analysis of three methods for achieving this goal. The proposed methods rely on different levels of language analysis: lexical, syntactic and semantic. They are also assessed for their ability to detect relevance in texts. The results show that semantic-based methods are able to detect up to 90% of redundancy, compared to only 19% for lexical-based ones. This is also reflected in the quality of the generated summaries: better summaries are obtained when syntactic- or semantic-based approaches are used to remove redundancy.
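
To make the lexical level concrete, the sketch below filters redundant sentences by word overlap (Jaccard similarity). It is an illustrative baseline only, not the method evaluated in the paper; the function names, the threshold of 0.6 and the example sentences are assumptions.

```python
import re
from typing import List, Set


def tokens(sentence: str) -> Set[str]:
    """Lowercased word tokens of a sentence."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def drop_redundant(sentences: List[str], threshold: float = 0.6) -> List[str]:
    """Keep a sentence only if it overlaps little with every sentence kept so far."""
    kept: List[str] = []
    for sentence in sentences:
        if all(jaccard(tokens(sentence), tokens(k)) < threshold for k in kept):
            kept.append(sentence)
    return kept


example = [
    "The company reported record profits this year.",
    "Record profits were reported by the company this year.",  # reworded duplicate
    "Earnings hit an all-time high.",  # paraphrase with no shared words
    "Its shares rose sharply.",
]
print(drop_redundant(example))
```

The reworded duplicate is removed because it shares most of its words with the first sentence, whereas the paraphrase survives because it shares none. This is the kind of case where purely lexical matching falls short and deeper analysis becomes necessary.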


Because of their polyphonic character, many contemporary novels pose lexical problems for the translator by mixing standard vocabulary, slang and technical terms. The question that then arises is whether dictionaries can be useful to the practitioner. We will see that, for theoretical and practical reasons, the help they provide is limited: a truly useful dictionary would have to change its conceptual presuppositions, and thus become a cultural dictionary and adopt an electronic form.


The article presents a study that analyzes, from a lexicometric and factorial perspective, the most relevant linguistic and paralinguistic aspects of the synchronous digital writing of Spanish adolescents in one of the most widely used instant messaging applications today (WhatsApp©). Writing on mobile digital devices (smartphones and tablets) is one of the most frequent activities in our society and constitutes an essential component of communicative competence in the Information Society. Digital communication is part of our lives, and the analysis of ubiquitous digital communicative use with devices and applications has broad social, linguistic and pedagogical repercussions. The study is based on a sample of 417 WhatsApp conversations of secondary school students, aged between 13 and 16, in four Spanish provinces. The research methodology was quantitative, first addressing the lexicometric analysis of the digital linguistic corpus with reference to the most relevant linguistic and paralinguistic elements, and then analyzing the correlations between different independent variables that explain linguistic and usage patterns in digital writing. The results show that digital writing in this type of application has a series of specific orthotypographic and audiovisual characteristics conditioned by usage variables, the screen size of the device, the hours of conversation and the relationship established between the interlocutors.


This article presents a sample of the results of a detailed analysis of idioms, collocations and other phraseological and word-order elements that are significant for characterizing the wealth of literary language of Joan Roís de Corella. The analysis uses an interdisciplinary methodology grounded in corpus linguistics and diachronic linguistics, with the support of information and communication technologies (digital humanities), applied to the lexical and stylistic contribution of a key author such as Roís de Corella in order to gauge both the degree of attunement and the specificity of his literary language: to what degree his literary language coincides with that of other great cultural classics of the Crown of Aragon, and on what, at the same time, Roís de Corella bases the key to his stylistic mastery.


As has been the case with other European languages, Spanish has welcomed the arrival of English words, in spite of all purist efforts to the contrary. Moreover, it has not only adopted and adapted true Anglicisms but has also created other forms based on English patterns, a mechanism that is particularly visible in Spanish fashion jargon. In this paper we focus on -ing forms in the Spanish language of fashion, which may be genuine Anglicisms (formal or semantic) or false Anglicisms (analogical creations, that is, English-looking lexical elements), as found in the Spanish editions of fashion magazines such as Vogue, Elle, InStyle, Grazia, Glamour, and Cosmopolitan. The main aim of this study is to analyse and classify qualitatively the -ing Anglicisms and false Anglicisms in this jargon in order to establish whether the impact of English on Spanish fashion jargon is strong enough to displace native words and expressions.


The goal of this article is to summarize the problems and solutions found in translating seven Health-Related Quality of Life (HRQOL) questionnaires from English into Spanish, all of which used a common international protocol based on back-translation techniques. The methodology is based on the linguistic validation model, which covers both linguistic and sociopragmatic equivalence. Five of the seven questionnaires obtained good results; the other two did not. On linguistic questions, problems outnumbered good solutions at the lexical-semantic level. On sociocultural questions, solutions outnumbered problems. The translated Spanish questionnaires still present deficiencies to be corrected, so both linguistic and sociocultural questions need to be studied more carefully to avoid differences between the translated versions and the source questionnaires.


Geographical proximity to, and socioeconomic dependence on, the United States brought about a deep-rooted anglicization of the Cuban Spanish lexis and of Cuban social strata, especially during the Neocolonial period (1902–1959). This study is based on a review of a renowned newspaper of that time, Diario de la Marina, and the compilation of a corresponding corpus of English-induced loanwords. Diario de la Marina particularly targeted the upper social class, and only crónicas sociales (society-page columns) and print advertising were reviewed because their fully descriptive texts encoded the ideology and consumerism of the ruling class. The findings show that the sociolect in question contained a high number of lexical and cultural anglicisms, and that sociolinguistic anglicization was openly embraced by the upper socioeconomic stratum as a distinguishing sign of sophistication and social stratification. Likewise, a number of the anglicisms collected, particularly those related to social events, are no longer used in contemporary Cuban Spanish, which suggests a major semantic shift in this sociolect after 1959.


The Latin of the Roman inscriptions in Portuguese territory has not, to date, been the subject of a dedicated study. The only work in which the subject has been addressed analyzes the Latin of the entire Iberian Peninsula and, having been published a little over a hundred years ago, is now outdated. The Roman inscriptions of Portuguese territory are published in different works. The Corpus Inscriptionum Latinarum remains a fundamental reference, but throughout the twentieth century new studies were published that update readings or make new epigraphs known. Thus, in order to characterize the Latin of the Roman inscriptions in Portuguese territory, it is necessary to build a corpus that will inevitably have to include epigraphs from various publications. The analysis of the Latin of the inscriptions covers phonetic, morphological, syntactic and lexical aspects. Only aspects relevant to the study of the epigraphic text are selected. The treatment of each aspect is divided into a theoretical component, which takes stock of the conclusions of the scholarly literature, and a practical component, which reports the data from the inscriptions of Portuguese territory. The Latin of the inscriptions of Portuguese territory can be characterized as conservative, with respect for linguistic correctness predominating. The presence of archaisms in nominal and verbal endings, some of them in the 2nd century, contributes to this conservative character. On the other hand, like the Latin of other regions, notably Pompeii, it also shows some innovative features, such as the monophthongization of the diphthong ae or fluctuation in the spelling of vowels. Beyond these aspects, slight internal differences are also perceptible, since some phenomena are documented only in certain regions.


The cantigas of Galician-Portuguese lyric poetry are the work of a diverse group of authors and constitute a rich literary and cultural heritage of the Middle Ages, produced between the 12th and 14th centuries. Over time, interest in them has led to the study of aspects of the transmission of the texts, of the biography of the troubadours and of influences received from territories beyond the Peninsula, and has also led to the production of several critical editions. The aim of this thesis is the critical edition of the cantigas of one of the troubadours of Galician-Portuguese lyric, Airas Engeitado. This author was last edited in 1932 by José Joaquim Nunes, together with the cantigas de amor that Carolina Michaëlis considered excluded from the Cancioneiro da Ajuda. To date, and to the best of our knowledge, that edition has not been revised by any editor. Nunes’s edition, about which Nunes himself expressed doubts, presents Engeitado’s texts in a considerably corrupted state, so a new critical edition is undertaken here, with more demanding editorial criteria than those of Nunes in that edition and with different transcription norms. The thesis also contextualizes and explains an author’s lyric with characteristics that may be considered singular within Galician-Portuguese lyric. The four cantigas de amor that I consider to be by Airas Engeitado have come down to us through the Cancioneiro da Biblioteca Nacional (B) and the Cancioneiro da Biblioteca Vaticana (V), and were mentioned in the Tavola Colocciana, the index of B. In addition to the critical establishment of Airas Engeitado’s cantigas, the thesis offers various remarks on palaeographic questions, notes addressing the divergences between my readings of the witnesses and those of the previous editor, and notes on lexical and syntactic peculiarities or on the versification schemes of the cantigas.
A brief chapter summarizes the little that is known about the biography of Airas Engeitado and situates the edited cantigas within the manuscript tradition. A question of great importance is the double attribution of the cantiga A gran direito lazerei, which I also set out and discuss in the chapter on the manuscript tradition. It is on this discussion that I base my decision to include the cantiga in the present edition, even though it has, to date, been unanimously attributed to Afonso Eanes do Coton.


Home literacy environment explains between 12% and 18.5% of the variance in children’s language skills. Although most authors agree that children whose parents encourage them to read tend to develop better reading skills earlier, some authors consider that the impact of the family environment on reading skills is overestimated. Other variables of the parent–child relationship, such as parenting styles, may well be relevant in this field. Nevertheless, no previous studies on the effect of parenting styles on literacy were found. This study aims to analyze the role of parenting styles in children’s reading processes. The hypothesis is that children’s perceptions of parenting styles contribute significantly to explaining the statistical variance of their reading processes. A total of 110 children (67 boys and 43 girls), aged between 7 and 11 years (M = 9.22, SD = 1.14), from Portuguese schools completed a socio-demographic questionnaire. Reading processes were assessed with the Portuguese adaptation (Figueira et al. in press) of the Bateria de Avaliação dos Processos Leitores-Revista (PROLEC-R). Parenting styles were assessed with the Egna Minnen av Barndoms Uppfostran-parents (EMBU-P) and the EMBU-C (children’s version). According to multiple hierarchical linear regressions, individual factors contribute to explaining all the reading tests of PROLEC-R, while family factors contribute to explaining most of these tests. Regarding parenting styles, the results show explanatory power for grammatical structures, sentence comprehension and listening. Parenting styles play an important role in explaining higher-level reading processes (syntactic and semantic) but not lexical processes, which are the focus of the main theories of dyslexia.