979 results for lexical semantics
Abstract:
The large amount of text produced every day on the Web has made it one of the main sources of linguistic corpora, which are then analyzed with Natural Language Processing techniques. On a global scale, languages such as Portuguese (official in 9 countries) appear on the Web in several varieties, with lexical, morphological, and syntactic differences, among others. In addition, a unified spelling system for Portuguese has recently been approved, and its implementation has already started in some countries. However, the process will take several years, so different varieties and spelling systems coexist. Since PoS-taggers for Portuguese are built for a particular variety, this work analyzes different combinations of training corpora and lexica aimed at building a model with high-precision annotation across several varieties and spelling systems of the language. Moreover, this paper presents different dictionaries of the new orthography (Spelling Agreement) as well as a new freely available testing corpus containing different varieties and textual typologies.
Abstract:
One of the main challenges in text summarization is the detection of redundant information. This paper presents a detailed analysis of three methods for achieving this goal. The proposed methods rely on different levels of language analysis: lexical, syntactic, and semantic. They are also analyzed for detecting relevance in texts. The results show that semantic-based methods are able to detect up to 90% of redundancy, compared to only 19% for lexical-based ones. This is also reflected in the quality of the generated summaries: better summaries are obtained when syntactic- or semantic-based approaches are used to remove redundancy.
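To make the lexical level of analysis concrete, the sketch below filters redundant sentences by word overlap. It is only an illustration: the abstract does not describe the paper's actual methods, and the Jaccard measure, the `drop_redundant` helper, and the 0.5 threshold are all assumptions chosen for this example.

```python
import re


def tokens(sentence):
    """Lowercase a sentence and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two sentences (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def drop_redundant(sentences, threshold=0.5):
    """Keep a sentence only if it is not too similar to one already kept.

    The threshold is a hypothetical value chosen for illustration.
    """
    kept = []
    for sentence in sentences:
        if all(jaccard(sentence, k) < threshold for k in kept):
            kept.append(sentence)
    return kept


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "The cat sat on a mat.",        # near-duplicate wording: filtered out
        "A feline rested on the rug.",  # paraphrase: few shared words, kept
        "Stocks fell sharply today.",
    ]
    for sentence in drop_redundant(docs):
        print(sentence)
```

A purely lexical filter of this kind removes near-verbatim repetition but keeps paraphrases, since they share few surface words. That limitation is in line with the lower redundancy detection the abstract reports for lexical-based methods.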
Abstract:
Because of their polyphonic character, many contemporary novels pose lexical problems for the translator by mixing standard vocabulary, slang, and technical terms. The question that then arises is whether dictionaries can be useful to the practitioner. We will see that, for theoretical and practical reasons, the help they provide is limited; a truly useful dictionary would have to change its conceptual presuppositions, and thus become a cultural dictionary and adopt an electronic form.
Abstract:
The article presents a study that analyzes, from a lexicometric and factorial perspective, the most relevant linguistic and paralinguistic aspects of the synchronous digital writing of Spanish adolescents in one of the most widely used instant messaging programs today (WhatsApp©). Writing on mobile digital devices (smartphones and tablets) is one of the most common activities in our society and constitutes an essential component of communicative competence in the Information Society. Digital communication is part of our lives, and the analysis of digital and ubiquitous communicative use with devices and programs has broad social, linguistic, and pedagogical repercussions. The research was contextualized in a sample of 417 WhatsApp conversations among secondary school students aged 13 to 16 in four Spanish provinces. The research methodology was quantitative, in order to carry out the lexicometric analysis of the digital linguistic corpus with reference to the most relevant linguistic and paralinguistic elements and, subsequently, to analyze the correlations between different independent variables that explain linguistic and usage patterns in digital writing. The results show that digital writing in this type of program has a series of specific orthotypographic and audiovisual characteristics conditioned by usage variables, the screen size of the device, the hours of conversation, and the relationship established between the interlocutors.
Abstract:
This article presents a sample of the results of a detailed analysis of idioms, collocations, and other phraseological and word-order elements that are significant for characterizing the body of literary language of Joan Roís de Corella. The analysis uses an interdisciplinary methodology based on corpus linguistics and linguistic diachrony, with the support of information and communication technologies (digital humanities). These are applied to the analysis of the lexical and stylistic contribution of a key author such as Roís de Corella in order to gauge the degree of affinity and, at the same time, of specificity of his literary language: to what degree his literary language coincides with that of other great cultural classics of the Crown of Aragon, and on what, at the same time, Roís de Corella bases the key to his stylistic mastery.
Abstract:
As has been the case with other European languages, Spanish has welcomed the arrival of English words, in spite of all purist efforts to the contrary. Moreover, it has not only adopted and adapted true Anglicisms but has also created other forms based on English patterns, mechanisms that are particularly visible in Spanish fashion jargon. In this paper we focus on -ing forms in the Spanish language of fashion, which may be genuine Anglicisms (formal or semantic ones) or false Anglicisms (analogical creations, that is, English-looking lexical elements), found in Spanish editions of fashion magazines such as Vogue, Elle, InStyle, Grazia, Glamour, and Cosmopolitan. The main aim of this study is to qualitatively analyse and classify -ing Anglicisms and false Anglicisms in this jargon in order to establish whether the impact of English on Spanish fashion jargon is so strong as to replace native words and expressions.
Abstract:
The goal of this article is to summarize the problems and solutions found in translating seven Health-Related Quality of Life (HRQOL) questionnaires from English into Spanish, all of which used a common international protocol based on back-translation techniques. The methodology is based on the linguistic validation model, which includes both linguistic and sociopragmatic equivalence. Five of the seven questionnaires obtained good results; the other two did not. With regard to linguistic questions, there were more problems than good solutions at the lexical-semantic level. With respect to sociocultural questions, there were more solutions than problems. The translated Spanish questionnaires still present deficiencies to be corrected, so both linguistic and sociocultural questions have to be studied more carefully in order to avoid differences between the translated versions and the source questionnaires.
Abstract:
Geographical proximity to and socioeconomic dependence on the United States brought about a deep-rooted anglicization of the Cuban Spanish lexis and social strata, especially throughout the Neocolonial period (1902–1959). This study is based on a review of a renowned newspaper of that time, Diario de la Marina, and the corresponding compilation of a corpus of English-induced loanwords. Diario de la Marina particularly targeted the upper social class, and only crónicas sociales (society page columns) and print advertising were reviewed because of their fully descriptive texts, which encoded the ruling-class ideology and consumerism. The findings show that there was a high number of lexical and cultural anglicisms in the sociolect in question, and that sociolinguistic anglicization was openly embraced by the upper socioeconomic stratum as a differentiating sign of sophistication and social stratification. Likewise, a number of the anglicisms collected, particularly those related to social events, are unused in contemporary Cuban Spanish, which suggests a major semantic shift in this sociolect after 1959.
Abstract:
Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language syntax (islands), while the rest of the syntax (water) is defined imprecisely. Usually water is defined as the negation of islands. Albeit simple, such a definition of water is naive and impedes composition of islands. When developing an island grammar, sooner or later a language engineer has to create water tailored to each individual island. Such an approach is fragile, because water can change with any change of a grammar. It is time-consuming, because water is defined manually by an engineer and not automatically. Finally, an island surrounded by water cannot be reused, because water has to be defined for every grammar individually. In this paper we propose a new technique of island parsing: bounded seas. Bounded seas are composable, robust, reusable, and easy to use because island-specific water is created automatically. Our work focuses on applications of island parsing to data extraction from source code. We have integrated bounded seas into a parser combinator framework as a demonstration of their composability and reusability.
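The naive definition of water as "anything that is not an island" can be sketched in a few lines. This is an illustrative assumption, not the bounded-seas technique the paper proposes: the `ISLAND` pattern (a loose Java-like method header) and the `extract_islands` helper are hypothetical names invented for this example.

```python
import re
from typing import List

# Hypothetical island: a loose pattern for a Java-like method header.
# Everything the pattern does not match is treated as water.
ISLAND = re.compile(r"\b(?:public|private|protected)\s+\w+\s+(\w+)\s*\(")


def extract_islands(source: str) -> List[str]:
    """Return method names found in source, skipping all other text.

    Water is defined naively as the negation of the island: if no island
    starts at the current position, one character is consumed and the
    scan retries at the next position.
    """
    names = []
    pos = 0
    while pos < len(source):
        match = ISLAND.match(source, pos)
        if match:
            names.append(match.group(1))  # island: record the method name
            pos = match.end()
        else:
            pos += 1                      # water: skip one character
    return names


if __name__ == "__main__":
    code = "class A { int x; public void run() { } %% private int size() { return 0; } }"
    print(extract_islands(code))
```

Here the water is implicitly tied to this one island pattern. Nesting this island inside another grammar would require redefining what may be skipped, which is the composition problem the abstract describes.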
Abstract:
Reading strategies vary across languages according to orthographic depth (the complexity of grapheme-to-phoneme conversion rules), notably at the level of eye movement patterns. We recently demonstrated that a group of early bilinguals, who learned both languages equally before the age of seven, presented a first fixation location (FFL) closer to the beginning of words when reading in German as compared with French. Since German is known to be orthographically more transparent than French, this suggested that different strategies were being engaged depending on the orthographic depth of the language used. Opaque languages induce a global reading strategy, and transparent languages force a local/serial strategy. Thus, pseudo-words were processed using a local strategy in both languages, suggesting that the link between word forms and their lexical representation may also play a role in selecting a specific strategy. In order to test whether corresponding effects appear in late bilinguals with low proficiency in their second language (L2), we present a new study in which we recorded eye movements while two groups of late German-French and French-German bilinguals read aloud isolated French and German words and pseudo-words. Since a transparent reading strategy is local and serial, with a high number of fixations per stimulus, and the level of the bilingual participants' L2 is low, the impact of language opacity should be observed in L1. We therefore predicted a global reading strategy if the bilinguals' L1 was French (FFL close to the middle of the stimulus, with fewer fixations per stimulus) and a local and serial reading strategy if it was German. Thus, the L2 of each group, as well as pseudo-words, should also require a local and serial reading strategy.
Our results confirmed these hypotheses, suggesting that global word processing is only achieved by bilinguals with an opaque L1 when reading in an opaque language; low proficiency in the L2 gives way to a local and serial reading strategy. These findings stress the fact that reading behavior is influenced not only by the linguistic mode but also by top-down factors, such as readers' proficiency.
Abstract:
"All the tablets here published form part of the Nippur collections now in the University Museum of the University of Pennsylvania."--Pref.
Abstract:
"Limited edition for experimental use by teachers and study group leaders."
Abstract:
"Contract no. E(04-3)-34, PA 214."
Abstract:
"Supported in part by the Advanced Research Projects Agency ... under Contract no. US AF 30(602) 4144."
Abstract:
"UILU-ENG 77 1766."