914 resultados para Sentiment de compétence


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes our participation at SemEval- 2014 sentiment analysis task, in both contextual and message polarity classification. Our idea was to com- pare two different techniques for sentiment analysis. First, a machine learning classifier specifically built for the task using the provided training corpus. On the other hand, a lexicon-based approach using natural language processing techniques, developed for a ge- neric sentiment analysis task with no adaptation to the provided training corpus. Results, though far from the best runs, prove that the generic model is more robust as it achieves a more balanced evaluation for message polarity along the different test sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents an approach to create what we have called a Unified Sentiment Lexicon (USL). This approach aims at aligning, unifying, and expanding the set of sentiment lexicons which are available on the web in order to increase their robustness of coverage. One problem related to the task of the automatic unification of different scores of sentiment lexicons is that there are multiple lexical entries for which the classification of positive, negative, or neutral {P, Z, N} depends on the unit of measurement used in the annotation methodology of the source sentiment lexicon. Our USL approach computes the unified strength of polarity of each lexical entry based on the Pearson correlation coefficient which measures how correlated lexical entries are with a value between 1 and -1, where 1 indicates that the lexical entries are perfectly correlated, 0 indicates no correlation, and -1 means they are perfectly inversely correlated and so is the UnifiedMetrics procedure for CPU and GPU, respectively. Another problem is the high processing time required for computing all the lexical entries in the unification task. Thus, the USL approach computes a subset of lexical entries in each of the 1344 GPU cores and uses parallel processing in order to unify 155802 lexical entries. The results of the analysis conducted using the USL approach show that the USL has 95.430 lexical entries, out of which there are 35.201 considered to be positive, 22.029 negative, and 38.200 neutral. Finally, the runtime was 10 minutes for 95.430 lexical entries; this allows a reduction of the time computing for the UnifiedMetrics by 3 times.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This approach aims at aligning, unifying and expanding the set of sentiment lexicons which are available on the web in order to increase their robustness of coverage. A sentiment lexicon is a critical and essential resource for tagging subjective corpora on the web or elsewhere. In many situations, the multilingual property of the sentiment lexicon is important because the writer is using two languages alternately in the same text, message or post. Our USL approach computes the unified strength of polarity of each lexical entry based on the Pearson correlation coefficient which measures how correlated lexical entries are with a value between 1 and -1, where 1 indicates that the lexical entries are perfectly correlated, 0 indicates no correlation, and -1 means they are perfectly inversely correlated and the UnifiedMetrics procedure for CPU and GPU, respectively.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Sentiment analysis has recently gained popularity in the financial domain thanks to its capability to predict the stock market based on the wisdom of the crowds. Nevertheless, current sentiment indicators are still silos that cannot be combined to get better insight about the mood of different communities. In this article we propose a Linked Data approach for modelling sentiment and emotions about financial entities. We aim at integrating sentiment information from different communities or providers, and complements existing initiatives such as FIBO. The ap- proach has been validated in the semantic annotation of tweets of several stocks in the Spanish stock market, including its sentiment information.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a methodology for legacy language resource adaptation that generates domain-specific sentiment lexicons organized around domain entities described with lexical information and sentiment words described in the context of these entities. We explain the steps of the methodology and we give a working example of our initial results. The resulting lexicons are modelled as Linked Data resources by use of established formats for Linguistic Linked Data (lemon, NIF) and for linked sentiment expressions (Marl), thereby contributing and linking to existing Language Resources in the Linguistic Linked Open Data cloud.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Sentiment and Emotion Analysis strongly depend on quality language resources, especially sentiment dictionaries. These resources are usually scattered, heterogeneous and limited to specific domains of appli- cation by simple algorithms. The EUROSENTIMENT project addresses these issues by 1) developing a common language resource representation model for sentiment analysis, and APIs for sentiment analysis services based on established Linked Data formats (lemon, Marl, NIF and ONYX) 2) by creating a Language Resource Pool (a.k.a. LRP) that makes avail- able to the community existing scattered language resources and services for sentiment analysis in an interoperable way. In this paper we describe the available language resources and services in the LRP and some sam- ple applications that can be developed on top of the EUROSENTIMENT LRP.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a dataset componsed of domain-specific sentiment lexicons in six languages for two domains. We used existing collections of reviews from Trip Advisor, Amazon, the Stanford Network Analysis Project and the OpinRank Review Dataset. We use an RDF model based on the lemon and Marl formats to represent the lexicons. We describe the methodology that we applied to generate the domain-specific lexicons and we provide access information to our datasets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis is the result of a project whose objective has been to develop and deploy a dashboard for sentiment analysis of football in Twitter based on web components and D3.js. To do so, a visualisation server has been developed in order to present the data obtained from Twitter and analysed with Senpy. This visualisation server has been developed with Polymer web components and D3.js. Data mining has been done with a pipeline between Twitter, Senpy and ElasticSearch. Luigi have been used in this process because helps building complex pipelines of batch jobs, so it has analysed all tweets and stored them in ElasticSearch. To continue, D3.js has been used to create interactive widgets that make data easily accessible, this widgets will allow the user to interact with them and �filter the most interesting data for him. Polymer web components have been used to make this dashboard according to Google's material design and be able to show dynamic data in widgets. As a result, this project will allow an extensive analysis of the social network, pointing out the influence of players and teams and the emotions and sentiments that emerge in a lapse of time.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Preliminary research demonstrated the EmotiBlog annotated corpus relevance as a Machine Learning resource to detect subjective data. In this paper we compare EmotiBlog with the JRC Quotes corpus in order to check the robustness of its annotation. We concentrate on its coarse-grained labels and carry out a deep Machine Learning experimentation also with the inclusion of lexical resources. The results obtained show a similarity with the ones obtained with the JRC Quotes corpus demonstrating the EmotiBlog validity as a resource for the SA task.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

EmotiBlog is a corpus labelled with the homonymous annotation schema designed for detecting subjectivity in the new textual genres. Preliminary research demonstrated its relevance as a Machine Learning resource to detect opinionated data. In this paper we compare EmotiBlog with the JRC corpus in order to check the EmotiBlog robustness of annotation. For this research we concentrate on its coarse-grained labels. We carry out a deep ML experimentation also with the inclusion of lexical resources. The results obtained show a similarity with the ones obtained with the JRC demonstrating the EmotiBlog validity as a resource for the SA task.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Comunicación presentada en las IV Jornadas TIMM, Torres (Jaén), 7-8 abril 2011.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the chemical textile domain experts have to analyse chemical components and substances that might be harmful for their usage in clothing and textiles. Part of this analysis is performed searching opinions and reports people have expressed concerning these products in the Social Web. However, this type of information on the Internet is not as frequent for this domain as for others, so its detection and classification is difficult and time-consuming. Consequently, problems associated to the use of chemical substances in textiles may not be detected early enough, and could lead to health problems, such as allergies or burns. In this paper, we propose a framework able to detect, retrieve, and classify subjective sentences related to the chemical textile domain, that could be integrated into a wider health surveillance system. We also describe the creation of several datasets with opinions from this domain, the experiments performed using machine learning techniques and different lexical resources such as WordNet, and the evaluation focusing on the sentiment classification, and complaint detection (i.e., negativity). Despite the challenges involved in this domain, our approach obtains promising results with an F-score of 65% for polarity classification and 82% for complaint detection.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recent years have witnessed a surge of interest in computational methods for affect, ranging from opinion mining, to subjectivity detection, to sentiment and emotion analysis. This article presents a brief overview of the latest trends in the field and describes the manner in which the articles contained in the special issue contribute to the advancement of the area. Finally, we comment on the current challenges and envisaged developments of the subjectivity and sentiment analysis fields, as well as their application to other Natural Language Processing tasks and related domains.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Ce mémoire a pour objet la mise à l’essai d’une séquence d’apprentissage intégrant des chansons comme sources primaires pour développer la compétence 2– interpréter la réalité sociale à l’aide de la méthode historique. Le ministère de l’Éducation du Québec et les écrits scientifiques (Côté, 2008; LENOIR et SAUVÉ, 2010; Turner-Bisset, 2001) s’attendent à ce que l’élève terminant ses études secondaires raisonne à partir de faits tirés des sources qui lui sont accessibles, notamment des sources primaires. Or, on constate trois lacunes dans la pratique enseignante : le petit nombre de sources travaillées, l’inégalité de leurs interprétations et la faiblesse de leurs critiques (Byrom, 2005; Pickles, 2010; Watson, 1998). Aussi, peu de cas utilisent la chanson comme source primaire. La séquence d’apprentissages sur la Deuxième Guerre mondiale que l’enseignante française Sylvaine Moreau (2012) a rendue disponible sur Internet a donc servi de point de départ à cette mise à l’essai afin de comprendre ce qu’il en est. Comme il y a un aller-retour régulier prévu entre l’adaptation du matériel pédagogique au contexte scolaire québécois et les observations en classe c’est la recherche-développement qui semble l’approche la plus efficace (Artigue, 1989; Harvey et Loiselle, 2009). Quatre enseignants montréalais ont accepté une entrevue avec l’auteure de cette recherche. Ils ont adapté le matériel au contexte scolaire, ils ont été observés en classe et les réponses écrites des élèves ont été analysées grâce, notamment, au programme N’Vivo. En explorant les données qualitatives recueillies, on constate le petit nombre de sources travaillées puisque les réponses ne reprennent que ce qui a été vu en classe, priorisant même certains types de sources. La faiblesse des critiques est criante puisque des étapes jugées « inutiles » par certains élèves sont laissées incomplètes. Finalement, l’auteure remarque l’inégalité des interprétations liée à une barrière de niveau de langue. Les métaphores et le vocabulaire de certaines chansons semblent un défi.