47 results for Lexicons


Relevance:

10.00%

Publisher:

Abstract:

For humans and robots to communicate using natural language, the robots must develop concepts and associated terms that correspond to the human use of words. Time and space are foundational concepts in human language. To develop a set of words that correspond to human notions of time and space, it is necessary to take into account how they are used in natural human conversation, where terms and phrases such as 'soon', 'in a while', or 'near' are common. We present language-learning robots called Lingodroids that can learn and use simple terms for time and space. In previous work, the Lingodroids were able to learn terms for space. In this work we extend their abilities by adding temporal variables that allow them to learn terms for time. The robots build their own maps of the world and interact socially to form a shared lexicon for location and duration terms. The robots successfully use the shared lexicons to communicate places and times to meet again.

Relevance:

10.00%

Publisher:

Abstract:

The aim was to analyse the growth and compositional development of the receptive and expressive lexicons between the ages of 0;9 and 2;0 in full-term (FT) and very-low-birth-weight (VLBW) children who are acquiring Finnish. The associations between the expressive lexicon and grammar at 1;6 and 2;0 in the FT children were also studied. In addition, the language skills of the VLBW children at 2;0 were analysed, as well as the predictive value of the early lexicon for later language performance. Four groups took part in the studies: the longitudinal (N = 35) and cross-sectional (N = 146) samples of FT children, and the longitudinal (N = 32) and cross-sectional (N = 66) samples of VLBW children. The data were gathered by applying a structured parental rating method (the Finnish version of the Communicative Development Inventory), by analysing the children's spontaneous speech, and by administering a formal test (Reynell Developmental Language Scales). The FT children acquired their receptive lexicons earlier, at a faster rate and with larger individual variation than their expressive lexicons. The acquisition rate of the expressive lexicon increased from slow to faster in most children (91%). Highly parallel developmental paths for lexical semantic categories were detected in the receptive and expressive lexicons of the Finnish children when they were analysed in relation to the growth of lexicon size, as described in the literature for children acquiring other languages. The emergence of grammar was closely associated with expressive lexical growth. The VLBW children acquired their receptive lexicons at a slower rate and had weaker language skills at 2;0 than the full-term children. The compositional development of both lexicons happened at a slower rate in the VLBW children than in the FT controls. 
However, when the compositional development was analysed in relation to the growth of lexicon size, it proceeded qualitatively in nearly the same manner in the VLBW children as in the FT children. Early receptive and expressive lexicon sizes were significantly associated with later language skills in both groups. The effect of the background variables (gender, length of the mother's basic education, birth weight) on language development differed between the FT and the VLBW children. The results provide new information on early language acquisition by Finnish FT and VLBW children. The results support the view that the early acquisition of the semantic lexical categories is related to lexicon growth. The current findings also suggest that early grammatical acquisition is closely related to the growth of expressive vocabulary size. The language development of VLBW children should be followed in clinical work.

Relevance:

10.00%

Publisher:

Abstract:

There are numerous formats for writing spellcheckers for open-source systems, and there are many language descriptions written in these formats. Similarly, for word hyphenation by computer there are TeX rules for many languages. In this paper we demonstrate a method for converting these spell-checking lexicons and hyphenation rule sets into finite-state automata, and present a new finite-state-based system for writer's tools used in current open-source software such as Firefox, OpenOffice.org and enchant via the spell-checking library voikko.
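As a rough illustration of the idea of compiling a spell-checking lexicon into a finite-state automaton, the sketch below builds a deterministic acceptor (a trie) from a plain word list. It is not the conversion method of the paper and not the voikko API: the class `LexiconAutomaton` and the function `build` are hypothetical names, and real spell-checker formats also carry affix rules and weights that this sketch ignores.

```python
from typing import Dict, Iterable, List, Set


class LexiconAutomaton:
    """Deterministic finite-state acceptor for a word list (hypothetical sketch)."""

    def __init__(self) -> None:
        # transitions[state][symbol] -> next state; state 0 is the start state.
        self.transitions: List[Dict[str, int]] = [{}]
        # States in which a complete word of the lexicon ends.
        self.final: Set[int] = set()

    def add(self, word: str) -> None:
        """Add one word, creating new states only where no path exists yet."""
        state = 0
        for symbol in word:
            nxt = self.transitions[state].get(symbol)
            if nxt is None:
                self.transitions.append({})
                nxt = len(self.transitions) - 1
                self.transitions[state][symbol] = nxt
            state = nxt
        self.final.add(state)

    def accepts(self, word: str) -> bool:
        """Return True if the word is spelled exactly as a lexicon entry."""
        state = 0
        for symbol in word:
            nxt = self.transitions[state].get(symbol)
            if nxt is None:
                return False
            state = nxt
        return state in self.final


def build(words: Iterable[str]) -> LexiconAutomaton:
    """Compile an iterable of words into an acceptor."""
    automaton = LexiconAutomaton()
    for word in words:
        automaton.add(word)
    return automaton


if __name__ == "__main__":
    checker = build(["cat", "car", "dog"])
    for candidate in ["cat", "ca", "cart", "dog"]:
        print(candidate, checker.accepts(candidate))
```

A trie shares prefixes only. Production finite-state toolkits typically also minimize the automaton so that common suffixes are shared, which matters for morphologically rich languages.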

Relevance:

10.00%

Publisher:

Abstract:

In this article we describe the methodology developed for the semiautomatic annotation of EPEC-RolSem, a Basque corpus labeled at predicate level following the PropBank-VerbNet model. The methodology presented is the product of a detailed theoretical study of the semantic nature of verbs in Basque and of their similarities to and differences from verbs in other languages. As part of the proposed methodology, we are creating a Basque lexicon on the PropBank-VerbNet model that we have named the Basque Verb Index (BVI). Our work thus dovetails with the general trend toward building lexicons from tagged corpora that is clear in work conducted for other languages. EPEC-RolSem and BVI are two important resources for the computational semantic processing of Basque; as far as the authors are aware, they are also the first resources of their kind developed for Basque. In addition, each entry in BVI is linked to the corresponding verb entry in well-known resources like PropBank, VerbNet, WordNet, Levin's Classification and FrameNet. We have also implemented several automatic processes to aid in creating and annotating the BVI, including processes designed to facilitate the task of manual annotation.

Relevance:

10.00%

Publisher:

Abstract:

Research in emotion analysis of text suggests that emotion-lexicon-based features are superior to corpus-based n-gram features. However, the static nature of general-purpose emotion lexicons makes them less suited to social media analysis, where the need to adapt to changes in vocabulary usage and context is crucial. In this paper we propose a set of methods to extract a word-emotion lexicon automatically from an emotion-labelled corpus of tweets. Our results confirm that the features derived from these lexicons outperform the standard bag-of-words features when applied to an emotion classification task. Furthermore, a comparative analysis with both manually crafted lexicons and a state-of-the-art lexicon generated using Point-Wise Mutual Information shows that the lexicons generated by the proposed methods lead to significantly better classification performance.
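The abstract does not spell out the proposed extraction methods, so the sketch below illustrates only the Point-Wise Mutual Information baseline it mentions, under simple assumptions: each tweet is already tokenized and carries exactly one emotion label, counts are taken per token occurrence, and no smoothing is applied. The function name `pmi_lexicon` and the toy corpus are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def pmi_lexicon(
    corpus: Iterable[Tuple[List[str], str]]
) -> Dict[str, Dict[str, float]]:
    """Score each (word, emotion) pair seen in the corpus with PMI.

    PMI(w, e) = log2( p(w, e) / (p(w) * p(e)) ), estimated from token counts.
    Pairs that never co-occur are simply left out of the lexicon.
    """
    joint_counts: Counter = Counter()
    word_counts: Counter = Counter()
    emotion_counts: Counter = Counter()
    total = 0

    for tokens, emotion in corpus:
        for token in tokens:
            joint_counts[(token, emotion)] += 1
            word_counts[token] += 1
            emotion_counts[emotion] += 1
            total += 1

    lexicon: Dict[str, Dict[str, float]] = {}
    for (word, emotion), joint in joint_counts.items():
        # joint/total divided by (word/total * emotion/total), simplified.
        ratio = joint * total / (word_counts[word] * emotion_counts[emotion])
        lexicon.setdefault(word, {})[emotion] = math.log2(ratio)
    return lexicon


if __name__ == "__main__":
    toy_corpus = [
        (["happy", "day"], "joy"),
        (["sad", "day"], "sadness"),
    ]
    for word, scores in pmi_lexicon(toy_corpus).items():
        print(word, scores)
```

A word that occurs equally often under every label ("day" in the toy corpus) scores 0, while a word tied to one label scores positive for that label. The resulting scores can then be used as classification features.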

Relevance:

10.00%

Publisher:

Abstract:

We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.

Relevance:

10.00%

Publisher:

Abstract:

The present research explores the degree of morphological structure of compound words in the native and nonnative lexicons, and provides additional data on the access to these representations. Native and nonnative speakers (L1 Spanish) of English were tested using a lexical decision task with masked priming of the compound’s constituents in isolation, including two orthographic conditions to control for a potential orthographic locus of effects. Both groups displayed reliable priming effects, unmediated by semantics, for the morphological but not the orthographic conditions as compared to an unrelated baseline. Results contribute further evidence of morphological structure in the lexicon of native speakers, and suggest that lexical representation and access in a second language are qualitatively comparable at relatively advanced levels of proficiency.

Relevance:

10.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

10.00%

Publisher:

Abstract:

The Princeton WordNet (WN.Pr) lexical database has motivated efficient compilations of bulky relational lexicons since its inception in the 1980s. The EuroWordNet project, the first multilingual initiative built upon WN.Pr, opened up ways of building individual wordnets and interrelating them by means of the so-called Inter-Lingual-Index, an unstructured list of the WN.Pr synsets. Another important initiative, relying on a slightly different method of building multilingual wordnets, is the MultiWordNet project, whose key strategy is to build language-specific wordnets while keeping as many as possible of the semantic relations available in WN.Pr. This paper stresses that an additional advantage of using the WN.Pr lexical database as a resource for building wordnets for other languages is the possibility of implementing an automatic procedure to map the WN.Pr conceptual relations, such as hyponymy, co-hyponymy, troponymy, meronymy, cause, and entailment, onto the lexical database of the wordnet under construction. This is viable because these are language-independent relations that hold between lexicalized concepts, not between lexical units. Accordingly, combining methods from both initiatives, this paper presents the ongoing implementation of the WN.Br lexical database and the aforementioned automation procedure, illustrated with a sample of the automatic encoding of the hyponymy and co-hyponymy relations.
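The mapping idea can be sketched as follows, assuming a hand-built alignment table from source synsets to target synsets. This is only an illustration of projecting language-independent relations through an alignment, not the WN.Br implementation: the functions `project_hyponymy` and `co_hyponyms` and all synset identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def project_hyponymy(
    source_hypernyms: Dict[str, List[str]],
    alignment: Dict[str, str],
) -> Set[Tuple[str, str]]:
    """Copy (hyponym, hypernym) links to the target wordnet.

    A link is projected only when both of its ends are aligned to a target
    synset; unaligned synsets are skipped rather than guessed.
    """
    projected: Set[Tuple[str, str]] = set()
    for hyponym, hypernyms in source_hypernyms.items():
        if hyponym not in alignment:
            continue
        for hypernym in hypernyms:
            if hypernym in alignment:
                projected.add((alignment[hyponym], alignment[hypernym]))
    return projected


def co_hyponyms(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Derive co-hyponymy: synsets that share at least one hypernym."""
    children_by_parent: Dict[str, Set[str]] = {}
    for child, parent in pairs:
        children_by_parent.setdefault(parent, set()).add(child)

    result: Dict[str, Set[str]] = {}
    for siblings in children_by_parent.values():
        for synset in siblings:
            result.setdefault(synset, set()).update(siblings - {synset})
    return result


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    source = {
        "en:dog": ["en:canine"],
        "en:wolf": ["en:canine"],
        "en:cat": ["en:feline"],
    }
    align = {
        "en:dog": "br:cachorro",
        "en:wolf": "br:lobo",
        "en:canine": "br:canideo",
    }
    links = project_hyponymy(source, align)
    print(sorted(links))
    print(co_hyponyms(links))
```

Skipping unaligned synsets keeps the projection conservative: the target wordnet inherits only relations whose two ends are already known to correspond to lexicalized concepts in the target language.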

Relevance:

10.00%

Publisher:

Abstract:

The production of a dictionary always generates controversy about the decisions the lexicographer must take. These decisions rest, a priori, on earlier theoretical and methodological choices. What do we mean by a dictionary? Why make certain choices rather than others? How can the different approaches to describing the lexicon be reconciled, if reconciliation is possible at all? The objective here is to contribute theoretical and methodological reflections to Juruna (Yudjá) lexicography, as well as to the field of lexicographical studies. This text addresses critical points in the process of producing dictionaries and, if we may call it that, the history of the act of making a dictionary, so that we can discuss the choices to be made for verb entries in the Juruna-Portuguese bilingual dictionary, which is planned as the result of a long-term project involving several collaborators, including Indigenous members of the community. The work contains sections that raise historical and linguistic discussions about the compilation of a dictionary and about how this act relates to its applicability for those who use this instrument, which consolidates and mobilizes views of the lexicon. The focus is the discussion of verb entries in the Juruna dictionary (stemming), taking lexicographical history as a contributor to certain choices in present-day dictionary production, whether for mother tongues, foreign languages, specialized lexicons, semantic groups, or systematizations for languages that are beginning to publish their first dictionaries.

Relevance:

10.00%

Publisher:

Abstract:

This article discusses dialogism in Mikhail Bakhtin and the foundations of the linguistic sign in Umberto Eco, with the intention of using these themes and authors to support teaching-learning methodologies for foreign languages (English and Spanish) in the public schools of São Paulo state. The conceptual approach of the two authors allows us to infer that learning a foreign language is effected by the appropriation of utterances and cultural knowledge, a pedagogical concept that confronts the traditional method used in São Paulo schools, which is based mainly on teaching grammar and lexicons. The paper derives from the theoretical research used to support a dissertation that proposes and preliminarily evaluates the integration of traditional theaters in foreign language in public schools with digital environments (in online courses), as well as the educational effects of using audiovisual material in the classroom and in online learning.

Relevance:

10.00%

Publisher:

Abstract:

Il vocabolario del potere fra intento etico-morale e tutela sociale. I lemmi dei Capitolari Carolingi nel Regnum Italicum (774-813) is the result of research on the lexicon of the Carolingian legislation promulgated for the Kingdom of Italy from the Frankish conquest until the death of Charlemagne. The analysis examined all the lemmas, mainly nouns and adjectives, relating to the ethical and moral sphere and to the conception of personal freedom. The work drew on more specific analyses of the legal and institutional concepts that normative sources such as those examined inevitably bring to the fore. Starting from a complete catalogue of the lemmas, the research concentrated on those that best allowed an assessment of the interactions between the intellectual court of the first Carolingians (made up, as is well known, of churchmen) and the characteristics of those men's thought, which was at once social and institutional. The work analysed a specific lexicon to investigate how the traditional conception of the societas Christiana was expressed in legislation through distinctive lemmas and formulaic expressions: the choice of these by the Rex and his circle would have pointed the community toward peaceful coexistence and at the same time defined the sovereign's "ordering and pacifying intent". The analysis covered a short but highly significant period, a moment of major political rupture, in order to determine, by exploiting the overlap and at times the clash between the different chancery usages of first the Lombard and then the Carolingian kingdom, whether or not the sovereigns' use of a specific lexicon was deliberate. 
This becomes the central problem of the thesis: does this lexicon, through its continuity of use, impose political models, or is it instead a conscious and instrumental use of a particular lexical apparatus that seeks to impose new models of coexistence on society?

Relevance:

10.00%

Publisher:

Abstract:

Garlic powder (Allium sativum L.) is an alternative for preserving garlic's sensory properties over time and extending its shelf life as a processed food. At present there is no clear definition of the sensory properties that characterize garlic, nor of the most suitable techniques for analysing them. The objectives of this work were to study different carriers and determine the most appropriate one for the sensory analysis of garlic powder, and to generate and define descriptors for the odour and flavour of different cultivars dehydrated by two methods: oven drying at 50°C and freeze-drying at -50°C under vacuum. The aim is to contribute to the characterization of this product by providing a specific vocabulary with definitions, as well as a dedicated sensory methodology. Eight assessors, selected and trained in accordance with international standards and experienced in sensory analysis, tested different carriers and, once the most suitable one had been determined, developed the descriptive language for the dried and freeze-dried garlic, selecting by consensus the descriptors that best characterized the cultivars; each term was then defined. Thirty-one simple descriptors were generated. Although some of the descriptors coincided with those published in the ASTM DS 66 guide (1996) for fresh garlic, this research contributed a large number of new terms for describing the odour and flavour of dried and freeze-dried garlic, which contribute to a better sensory characterization of this product.

Relevance:

10.00%

Publisher:

Abstract:

Many attempts have been made to provide multilinguality to the Semantic Web by means of annotation properties in Natural Language (NL), such as RDFS or SKOS labels, and other lexicon-ontology models, such as lemon, but there are still many issues to be solved if we want to have a truly accessible Multilingual Semantic Web (MSW). Among the most relevant problems are the reusability of monolingual resources (ontologies, lexicons, etc.), the accessibility of multilingual resources hindered by many formats, the reliability of ontological sources, disambiguation problems, and the multilingual presentation of all this information in NL to the end user. Unless this NL presentation is achieved, MSW will be restricted to the limits of IT experts, but even so, with great dissatisfaction and disenchantment.

Relevance:

10.00%

Publisher:

Abstract:

In this paper we describe the specification of a model for the semantically interoperable representation of language resources for sentiment analysis. The model integrates "lemon", an RDF-based model for the specification of ontology-lexica (Buitelaar et al. 2009), which is used increasingly for the representation of language resources as Linked Data, with Marl, an RDF-based model for the representation of sentiment annotations (Westerski et al., 2011; Sánchez-Rada et al., 2013).