996 results for Corpora (Linguistics)


Abstract:

This doctoral thesis studies the diachronic semantic evolution of the verbs entrar ('to enter') and salir ('to leave') in medieval Spanish by applying the theoretical tools of cognitive linguistics. Using a corpus of attestations drawn from the Corpus del nuevo diccionario histórico del español of the Real Academia Española, spanning the 13th to the 15th century, it analyzes the semantic values of both verbs in order to determine how their use developed, which meanings gave rise to new ones, and how the meanings of a single verb relate to one another. The analysis starts from prototype theory as applied to lexical categorization, and from the theory of metaphor and metonymy. It also examines the syntactic structure of each example, the lexical selection of the verbal arguments, and the discursive tradition to which each attestation belongs. This makes it possible not only to describe the semantic evolution of both verbs but also to complete the account of the causes of their diachronic development. Finally, the thesis establishes the diachronic semantic relations between the two verbs as members of the same lexical group, the verbs of motion, and confirms that they are linked by more than a primary antonymy relation.

Abstract:

The basic aim of this TFC (final degree project) is to study in depth the diachronic change, from Latin to Catalan, of the first-conjugation present indicative endings -am/-au into -em/-eu in most varieties of Catalan. The study has two distinct parts. The first accounts for the etymological evolution of these present indicative forms up to the medieval Catalan period. The second, which is the central part, studies the process by which these arhizotonic forms of the first-conjugation present indicative have converged with the present subjunctive and the imperative in most of the Catalan-speaking territory.

Abstract:

The question of the text, neglected by most theories of argumentation, is addressed head-on here. The study is illustrated with a case from a journalistic genre of argumentation: an op-ed written by an astrophysicist, which raises a number of interesting problems. Methodologically, the distinction of three levels of analysis follows the main options of text linguistics: the periodic, micro-textual level of argumentation; the meso-textual level of "argumentative cells"; and the macro-textual level of the text plan. This last level opens onto the complex genericity of a text that belongs both to a genre of journalistic opinion and to the genre of polemical controversy among scientists, here concerning the premature media coverage of a scientific discovery.

Abstract:

A study of negation in Old Spanish from a synchronic point of view and within the framework of generative grammar.

Abstract:

The determiner phrase in medieval Old Spanish.

Abstract:

To cluster textual sequence types (discourse types/modes) in French texts, the K-means algorithm with high-dimensional embeddings and a fuzzy clustering algorithm were applied to clauses whose POS (part-of-speech) n-gram profiles had previously been extracted. Uni-, bi- and trigrams were used on four 19th-century French short stories by Maupassant. For the high-dimensional embeddings, power transformations of the chi-squared distances between clauses were explored. Preliminary results show that high-dimensional embeddings improve the quality of clustering, in contrast to the use of bi- and trigrams, whose performance is disappointing, possibly because of feature-space sparsity.
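As a rough illustration of the ingredients named in this abstract, the sketch below builds POS n-gram profiles for a few clauses and computes a chi-squared distance between them, optionally raised to a power. The function names, the `alpha` parameter, the exact form of the power transformation, and the toy tag sequences are all assumptions made for illustration; the abstract does not specify the authors' implementation.

```python
from collections import Counter


def ngram_profile(tags, n):
    """Count the POS n-grams (as tuples) occurring in one clause."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def chi2_distance(p, q, col_totals, alpha=1.0):
    """Chi-squared distance between two count profiles, raised to the power alpha.

    Profiles are converted to relative frequencies; each squared difference is
    weighted by the inverse of that feature's share of the whole table.
    alpha is a hypothetical knob standing in for a power transformation.
    """
    total_p, total_q = sum(p.values()), sum(q.values())
    grand_total = sum(col_totals.values())
    squared = 0.0
    for feature, column_count in col_totals.items():
        diff = p[feature] / total_p - q[feature] / total_q
        squared += diff * diff / (column_count / grand_total)
    return (squared ** 0.5) ** alpha


# Toy clauses (invented tag sequences, not data from the study).
clauses = [
    ["DET", "NOUN", "VERB"],
    ["DET", "NOUN", "VERB", "ADV"],
    ["PRON", "VERB", "ADJ"],
]
profiles = [ngram_profile(clause, 1) for clause in clauses]
col_totals = sum(profiles, Counter())  # feature totals over all clauses
```

A matrix of such pairwise distances could then be passed to any clustering routine. With bigrams or trigrams most counts are zero for short clauses, which is one way to picture the sparsity the abstract mentions.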

Abstract:

This paper deals with the form and use of reformulation markers in research papers written in English, Spanish and Catalan. Considering the form and frequency of the markers, English papers tend to prefer simple fixed markers and include fewer reformulators than Spanish and Catalan papers. By contrast, formal Catalan and Spanish papers include more markers, some of which are complex and allow for some structural variability. As for use, reformulation markers establish dynamic relationships between portions of discourse, which can be identified in our corpus as expansion, reduction, and permutation. The analysis of the corpus shows that English authors usually reformulate to add more information to the concept (expansion), whereas Catalan and Spanish authors reduce the contents or the implicatures of the previous formulation more frequently than English authors do.

Abstract:

In this paper we present a description of the role of definitional verbal patterns for the extraction of semantic relations. Several studies show that semantic relations can be extracted from analytic definitions contained in machine-readable dictionaries (MRDs). In addition, definitions found in specialised texts are a good starting point to search for different types of definitions where other semantic relations occur. The extraction of definitional knowledge from specialised corpora represents another interesting approach for the extraction of semantic relations. Here, we present a descriptive analysis of definitional verbal patterns in Spanish and the first steps towards the development of a system for the automatic extraction of definitional knowledge.
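To make the idea of a definitional verbal pattern concrete, here is a minimal sketch that matches Spanish sentences of the form "<term> se define como <definition>". The pattern, the three verbs it accepts, and the function name are hypothetical choices for illustration only; the abstract does not list the patterns the authors actually describe.

```python
import re

# Hypothetical pattern: optional article, a term, a pronominal definitional
# verb followed by "como", and the definition up to an optional final period.
DEFINITIONAL_PATTERN = re.compile(
    r"^(?:(?:El|La|Los|Las|Un|Una)\s+)?(?P<term>.+?)\s+"
    r"se\s+(?P<verb>define|entiende|conoce)\s+como\s+"
    r"(?P<definition>.+?)\.?$",
    re.IGNORECASE,
)


def extract_definition(sentence):
    """Return (term, verb, definition) if the sentence fits the pattern, else None."""
    match = DEFINITIONAL_PATTERN.match(sentence.strip())
    if match is None:
        return None
    return (
        match.group("term"),
        match.group("verb").lower(),
        match.group("definition"),
    )


# Invented example sentences.
hit = extract_definition("El fonema se define como la unidad mínima distintiva.")
miss = extract_definition("El autor publicó el libro en 1990.")
```

A surface pattern like this over-generates and under-generates in real text, so a working system would need filtering beyond what is shown here.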

Abstract:

This paper argues for the need to build applications from studies of discourse, so that applied linguistics becomes a useful tool for society. It briefly explains what reformulation connectors are and presents the electronic prototype (which can be consulted online) of ALCOR, a tool designed to make lexicographers' work easier when entering connectors in dictionaries, with emphasis on the theoretical basis that underpins it and on the application decisions that were taken.

Abstract:

Acquiring lexical information is a complex problem, typically approached by relying on a number of contexts to contribute information for classification. One of the first issues to address in this domain is the determination of such contexts. The work presented here proposes the use of automatically obtained FORMAL role descriptors as features used to draw nouns from the same lexical semantic class together in an unsupervised clustering task. We have dealt with three lexical semantic classes (HUMAN, LOCATION and EVENT) in English. The results obtained show that it is possible to discriminate between elements from different lexical semantic classes using only FORMAL role information, hence validating our initial hypothesis. Also, iterating our method accurately accounts for fine-grained distinctions within lexical classes, namely distinctions involving ambiguous expressions. Moreover, a filtering and bootstrapping strategy employed in extracting FORMAL role descriptors proved to minimize effects of sparse data and noise in our task.

Abstract:

Automatic creation of polarity lexicons is a crucial issue to be solved in order to reduce time and effort in the first steps of Sentiment Analysis. In this paper we present a methodology based on linguistic cues that allows us to automatically discover, extract and label subjective adjectives that should be collected in a domain-based polarity lexicon. For this purpose, we designed a bootstrapping algorithm that, from a small set of seed polar adjectives, is capable of iteratively identifying, extracting and annotating positive and negative adjectives. Additionally, the method automatically creates lists of highly subjective elements that change their prior polarity even within the same domain. The proposed algorithm reached a precision of 97.5% for positive adjectives and 71.4% for negative ones in the semantic orientation identification task.
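The general shape of seed-based bootstrapping can be sketched as follows. The linguistic cue used here is an assumption, not taken from the abstract: adjectives coordinated with "and" are taken to share polarity, and those joined by "but" to have opposite polarity. The function name and the toy data are likewise hypothetical.

```python
def bootstrap_polarity(seeds, pairs, max_iterations=10):
    """Propagate polarity labels from seed adjectives over coordinated pairs.

    seeds: dict mapping adjective -> +1 (positive) or -1 (negative)
    pairs: list of (adjective_a, conjunction, adjective_b) tuples, where the
           conjunction is "and" (same polarity) or "but" (opposite polarity)
    """
    lexicon = dict(seeds)
    for _ in range(max_iterations):
        added = {}
        for a, conjunction, b in pairs:
            sign = 1 if conjunction == "and" else -1
            # Try to label either member of the pair from the other one.
            for known, new in ((a, b), (b, a)):
                if known in lexicon and new not in lexicon and new not in added:
                    added[new] = sign * lexicon[known]
        if not added:  # nothing new was learned, so stop iterating
            break
        lexicon.update(added)
    return lexicon


# Invented seed set and coordination pairs.
seeds = {"good": 1, "bad": -1}
pairs = [
    ("good", "and", "reliable"),
    ("reliable", "but", "slow"),
    ("slow", "and", "noisy"),
    ("bad", "and", "fragile"),
]
lexicon = bootstrap_polarity(seeds, pairs)
```

Each pass labels only adjectives directly linked to an already labeled one, so labels spread outward from the seeds one step per iteration. This sketch keeps the first label found and does not detect the polarity-shifting adjectives the abstract mentions.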

Abstract:

Lexical Resources are a critical component for Natural Language Processing applications. However, the high cost of comparing and merging different resources has been a bottleneck to obtaining richer resources with a broad range of potential uses for a significant number of languages. With the objective of reducing cost by eliminating human intervention, we present a new method for automating the merging of resources, with special emphasis on what we call the mapping step. This mapping step, which converts the resources into a common format that later allows them to be merged, is usually performed with huge manual effort and thus makes the whole process very costly. We therefore propose a method to perform this mapping fully automatically. To test our method, we have addressed the merging of two verb subcategorization frame lexica for Spanish. The results achieved, which almost replicate human work, demonstrate the feasibility of the approach.

Abstract:

In this work we present the results of experimental work on the development of lexical class-based lexica by automatic means. Our purpose is to assess the use of linguistic lexical-class based information as a feature selection methodology for the use of classifiers in quick lexical development. The results show that the approach can help reduce the human effort required in the development of language resources significantly.

Abstract:

Lexical Resources are a critical component for Natural Language Processing applications. However, the high cost of comparing and merging different resources has been a bottleneck to obtaining richer resources and a broader range of potential uses for a significant number of languages. With the objective of reducing cost by eliminating human intervention, we present a new method for the automatic merging of resources. This method includes both the automatic mapping of the resources involved to a common format and their merging once they are in this format. This paper presents how we have addressed the merging of two verb subcategorization frame lexica for Spanish, but our method will be extended to cover other types of Lexical Resources. The results achieved, which almost replicate human work, demonstrate the feasibility of the approach.
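The two steps described here, mapping to a common format and then merging, can be pictured with a small sketch. The mapping heuristic below (align each label of one lexicon with the label of the other whose verb set overlaps most, by Jaccard similarity) is an assumption swapped in for illustration and is not the method of the paper; the frame labels and verb entries are invented.

```python
from collections import defaultdict


def verbs_by_label(lexicon):
    """Invert a verb -> frame-labels lexicon into frame-label -> verbs."""
    index = defaultdict(set)
    for verb, frames in lexicon.items():
        for frame in frames:
            index[frame].add(verb)
    return index


def infer_mapping(lex_a, lex_b):
    """Map each label of lex_b to the lex_a label with the most similar verb set."""
    by_a, by_b = verbs_by_label(lex_a), verbs_by_label(lex_b)
    mapping = {}
    for label_b, verbs_b in by_b.items():
        def jaccard(label_a):
            verbs_a = by_a[label_a]
            return len(verbs_a & verbs_b) / len(verbs_a | verbs_b)
        best = max(by_a, key=jaccard)
        if jaccard(best) > 0:  # leave labels with no shared verbs unmapped
            mapping[label_b] = best
    return mapping


def merge(lex_a, lex_b, mapping):
    """Rewrite lex_b labels into lex_a's scheme and take the union per verb."""
    merged = {verb: set(frames) for verb, frames in lex_a.items()}
    for verb, frames in lex_b.items():
        target = merged.setdefault(verb, set())
        for frame in frames:
            target.add(mapping.get(frame, frame))
    return merged


# Invented toy lexica with different labeling schemes.
lex_a = {"comer": {"trans"}, "dormir": {"intrans"}, "dar": {"ditrans"}}
lex_b = {"comer": {"NP_V_NP"}, "dormir": {"NP_V"},
         "beber": {"NP_V_NP"}, "correr": {"NP_V"}}
mapping = infer_mapping(lex_a, lex_b)
merged = merge(lex_a, lex_b, mapping)
```

Verbs present in both lexica serve as the evidence for aligning labels, and verbs present in only one lexicon then inherit the aligned label in the merged resource.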