5 results for Identités genrées

in Universidad de Alicante


Relevance:

20.00%

Abstract:

Doctoral thesis with European mention in natural language processing, carried out at the Universidad de Alicante by Ester Boldrini under the supervision of Dr. Patricio Martínez-Barco. The thesis defence took place at the Universidad de Alicante on 23 January 2012 before a committee formed by Dr. Manuel Palomar (Universidad de Alicante), Dr. Paloma Moreda (UA), Dr. Mariona Taulé (Universidad de Barcelona), Dr. Horacio Saggion (Universitat Pompeu Fabra) and Dr. Mike Thelwall (University of Wolverhampton). Grade: Sobresaliente Cum Laude, awarded unanimously.

Relevance:

20.00%

Abstract:

This paper presents the first version of EmotiBlog, an annotation scheme for emotions in non-traditional textual genres such as blogs or forums. We collected a corpus of blog posts in three languages (English, Spanish and Italian) covering three topics of interest. We then annotated the collection and evaluated it by measuring inter-annotator agreement and running a ten-fold cross-validation, obtaining promising results. The main aim of this research is to provide a finer-grained annotation scheme and annotated data, both of which are essential for evaluating the quality of the created resources.

Relevance:

20.00%

Abstract:

This paper presents a preliminary study of Machine Learning experiments applied to Opinion Mining in blogs. We created a blog corpus in Spanish and annotated it using EmotiBlog. We evaluated the utility of the labelled features, first by running experiments with combinations of them and then by applying feature selection techniques. We also deal with several problems, such as the noisy character of the input texts, the small size of the training set, the granularity of the annotation scheme and the language under study, Spanish, which has fewer resources than English. We obtained promising results considering that it is a preliminary study.

Relevance:

20.00%

Abstract:

In this paper, we present a Text Summarisation tool, compendium, capable of generating the most common types of summaries. Regarding the input, single- and multi-document summaries can be produced; regarding the output, the summaries can be extractive or abstractive-oriented; and finally, concerning their purpose, the summaries can be generic, query-focused, or sentiment-based. The proposed architecture for compendium is divided into several stages, with a distinction between core and additional stages. The former constitute the backbone of the tool and are common to the generation of any type of summary, whereas the latter are used to enhance the capabilities of the tool. The main contributions of compendium with respect to state-of-the-art summarisation systems are that (i) it specifically deals with the problem of redundancy, by means of textual entailment; (ii) it combines statistical and cognitive-based techniques for determining relevant content; and (iii) it proposes an abstractive-oriented approach to the challenge of abstractive summarisation. The evaluation, performed in different domains and textual genres comprising traditional texts as well as texts extracted from the Web 2.0, shows that compendium is very competitive and appropriate for use as a tool for generating summaries.

Relevance:

20.00%

Abstract:

The aim of this work is to analyse two categories of texts contained in the pilot corpus COMENEGO (Corpus Multilingüe de Economía y Negocios), namely the organisational and legal categories. We begin by briefly presenting the corpus and the motivations that lead us to analyse its contents. Next, we select a series of text types or textual genres from these two categories in order to carry out a more in-depth analysis of each category. We then present the results obtained, which show a certain heterogeneity, particularly in the organisational category of the corpus. The approach followed and the results obtained can help not only to reclassify the texts of the corpus but also to design the platform that will provide access to the texts on the internet.