990 results for polarity, sentiment analysis chat NLP word2vec wordembedding RNNLM liblinear
Abstract:
The present is marked by the availability of large volumes of heterogeneous data, whose management is extremely complex. While the treatment of factual data has been widely studied, the processing of subjective information still poses important challenges. This is especially true in tasks that combine Opinion Analysis with other challenges, such as those related to Question Answering. In this paper, we describe the different approaches we employed in the NTCIR 8 MOAT monolingual English (opinionatedness, relevance, answerness and polarity) and cross-lingual English-Chinese tasks, implemented in our OpAL system. The results obtained with different settings of the system, together with the error analysis performed after the competition, gave us clear insights into the combination of techniques that best balances precision and recall. Contrary to our initial intuitions, we also found that including specialized Natural Language Processing tools for Temporality or Anaphora Resolution lowers system performance, while topic detection techniques using faceted search with Wikipedia and Latent Semantic Analysis lead to satisfactory performance in both the monolingual and the multilingual setting.
Abstract:
The field of natural language processing (NLP) has grown considerably in recent years; its research areas include information retrieval and extraction, data mining, machine translation, question answering systems, automatic summarization, and sentiment analysis, among others. This article presents concepts and some tools intended to contribute to the understanding of text processing with NLP techniques, with the aim of extracting relevant information that can be used in a wide range of applications. Automatic classifiers can be developed to categorize documents and recommend tags; these classifiers should be platform-independent, easily customizable so that they can be integrated into different projects, and able to learn from examples. This article introduces these classification algorithms, analyzes some of the open-source tools currently available for carrying out these tasks, and compares several implementations using the F-measure to evaluate the classifiers.
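As an illustration of the evaluation metric mentioned in this abstract, the following is a minimal sketch of the F-measure for a single class. The labels and predictions are hypothetical and are not taken from the article; the function name `f_measure` is an assumption introduced only for this example.

```python
def f_measure(true_labels, predicted_labels, positive, beta=1.0):
    """Return (precision, recall, F-beta) with `positive` as the target class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall.
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical gold labels and classifier output for six documents.
gold = ["sports", "politics", "sports", "sports", "politics", "sports"]
pred = ["sports", "sports", "sports", "politics", "politics", "sports"]
precision, recall, f1 = f_measure(gold, pred, positive="sports")
print(precision, recall, f1)
```

With `beta=1.0` the score is the balanced F1; a larger `beta` weights recall more heavily than precision.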
imaxin|software: NLP applied to improving the multilingual communication of companies and institutions
Abstract:
imaxin|software is a company founded in 1997 by four computer engineering graduates with the aim of developing educational multimedia video games and multilingual natural language processing. Seventeen years later, we have developed reference multilingual resources, tools and applications for several languages: Portuguese (Galicia, Portugal, Brazil, etc.), Spanish (Spain, Argentina, Mexico, etc.), English, Catalan and French. In this article we describe the main milestones in bringing these NLP technologies to the industrial and institutional sectors.
Abstract:
In democratic countries, knowing citizens' voting intentions and their assessments of the main political parties and leaders is of great interest to the parties themselves, to the media, and to the general public. Costly face-to-face surveys have traditionally been used for this purpose. The rise of social networks, mainly Twitter, makes it possible to consider them as an inexpensive alternative to surveys. In this work, we review the most relevant scientific literature in this area, with special emphasis on the Spanish case.
Abstract:
In this work we present a semantic framework suitable for use as a support tool for recommender systems. Our purpose is to use the semantic information provided by a set of integrated resources to enrich texts by conducting different NLP tasks: WSD, domain classification, semantic similarity and sentiment analysis. Once the texts are semantically enriched, we are able to recommend similar content or even to rate texts along different dimensions. First, we describe the main characteristics of the integrated semantic resources, together with an exhaustive evaluation. Next, we demonstrate the usefulness of our resource in different NLP tasks and campaigns. Moreover, we present a combination of different NLP approaches that provides enough knowledge to serve as a support tool for recommender systems. Finally, we present a case study with information related to movies and TV series to demonstrate that our framework works properly.
Abstract:
Sentiment analysis has long focused on binary classification of text as either positive or negative. There has been little work on mapping sentiments or emotions into multiple dimensions. This paper studies a Bayesian modeling approach to multi-class sentiment classification and multi-dimensional sentiment distribution prediction. It proposes effective mechanisms to incorporate supervised information, such as labeled feature constraints and document-level sentiment distributions derived from the training data, into model learning. We have evaluated our approach on datasets collected from the confession section of the Experience Project website, where people share their life experiences and personal stories. Our results show that using the latent representation of the training documents derived from our approach as features to build a maximum entropy classifier outperforms other approaches on multi-class sentiment classification. In the more difficult task of multi-dimensional sentiment distribution prediction, our approach gives superior performance compared to a few competitive baselines. © 2012 ACM.
Abstract:
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework called the joint sentiment-topic (JST) model, based on latent Dirichlet allocation (LDA), which detects sentiment and topic simultaneously from text. A reparameterized version of the JST model called Reverse-JST, obtained by reversing the sequence of sentiment and topic generation in the modeling process, is also studied. Although JST is equivalent to Reverse-JST without a hierarchical prior, extensive experiments show that when sentiment priors are added, JST performs consistently better than Reverse-JST. In addition, unlike supervised approaches to sentiment classification, which often fail to produce satisfactory performance when shifting to other domains, the weakly supervised nature of JST makes it highly portable to other domains. This is verified by experimental results on data sets from five different domains, where the JST model even outperforms existing semi-supervised approaches on some of the data sets despite using no labeled documents. Moreover, the topics and topic sentiment detected by JST are indeed coherent and informative. We hypothesize that the JST model can readily meet the demand of large-scale sentiment analysis from the web in an open-ended fashion.
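The difference in generation order between JST and Reverse-JST described in this abstract can be sketched as a toy generative sampler. This is only an illustration under stated assumptions: the vocabulary, the symmetric hyperparameter values, the number of sentiment labels and topics, and all function names are hypothetical, and the sketch covers neither inference nor the sentiment priors discussed in the abstract.

```python
import random


def dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def draw(probs, rng):
    """Draw an index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(n_words, n_sent, n_topics, vocab, phi, rng,
             reverse=False, alpha=1.0, gamma=1.0):
    """Generate (sentiment, topic, word) triples for one document."""
    if not reverse:
        # JST: document-level sentiment mix, then one topic mix per sentiment.
        pi = dirichlet([gamma] * n_sent, rng)
        theta = [dirichlet([alpha] * n_topics, rng) for _ in range(n_sent)]
    else:
        # Reverse-JST: document-level topic mix, then one sentiment mix per topic.
        theta = dirichlet([alpha] * n_topics, rng)
        pi = [dirichlet([gamma] * n_sent, rng) for _ in range(n_topics)]
    tokens = []
    for _ in range(n_words):
        if not reverse:
            l = draw(pi, rng)          # sentiment label first
            z = draw(theta[l], rng)    # topic conditioned on sentiment
        else:
            z = draw(theta, rng)       # topic first
            l = draw(pi[z], rng)       # sentiment conditioned on topic
        w = vocab[draw(phi[l][z], rng)]  # word depends on both labels
        tokens.append((l, z, w))
    return tokens


# Hypothetical setup: 2 sentiment labels, 3 topics, a six-word vocabulary.
VOCAB = ["good", "bad", "plot", "battery", "screen", "acting"]
rng = random.Random(0)
phi = [[dirichlet([1.0] * len(VOCAB), rng) for _ in range(3)] for _ in range(2)]
jst_doc = generate(8, 2, 3, VOCAB, phi, rng)
reverse_doc = generate(8, 2, 3, VOCAB, phi, rng, reverse=True)
print(jst_doc)
print(reverse_doc)
```

In both variants each word is drawn from a distribution indexed by a sentiment label and a topic; only the order in which the two labels are sampled changes.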
Abstract:
With the development of social media tools such as Facebook and Twitter, mainstream media organizations including newspapers and TV media have played an active role in engaging with their audience and strengthening their influence on the recently emerged platforms. In this paper, we analyze the behavior of mainstream media on Twitter and study how they exert their influence to shape public opinion during the UK's 2010 General Election. We first propose an empirical measure to quantify mainstream media bias based on sentiment analysis and show that it correlates better with the actual political bias in the UK media than purely quantitative measures based on media coverage of the various political parties. We then compare the information diffusion patterns from different categories of sources. We found that while mainstream media is good at seeding prominent information cascades, its role in shaping public opinion is being challenged by journalists, since their tweets are more likely to be retweeted, spread faster and have a longer lifespan than tweets from mainstream media. Moreover, the political bias of the journalists is a good indicator of the actual election results. Copyright 2013 ACM.
Abstract:
Purpose: The push to widen participation in public consultation suggests social media as an additional mechanism through which to engage the public. Bioenergy companies need to build their capacity to communicate in these new media and to monitor the attitudes of the public and opposition organisations towards energy development projects. Design/methodology/approach: This short paper outlines the planning issues bioenergy developments face and the main methods of communication used in the public consultation process in the UK. The potential role of social media in communication with stakeholders is identified. The capacity of sentiment analysis to mine opinions from social media is summarised, and illustrated using a sample of tweets containing the term ‘bioenergy’. Findings: Social media have the potential to improve information flows between stakeholders and developers. Sentiment analysis is a viable methodology, which bioenergy companies should be using to measure public opinion in the consultation process. Preliminary analysis shows promising results.
Research limitations/implications: Analysis is preliminary and based on a small dataset. It is intended only to illustrate the potential of sentiment analysis and not to draw general conclusions about the bioenergy sector. Originality/value: Opinion mining, though established in marketing and political analysis, is not yet systematically applied as a planning consultation tool. This is a missed opportunity.
Abstract:
Opinion mining and sentiment analysis are important research areas of Natural Language Processing (NLP) and have become viable alternatives for automatically extracting the affective information found in texts. Our aim is to build an NLP model to analyze gamers’ sentiments and opinions expressed in a corpus of 9750 game reviews. A Principal Component Analysis using sentiment analysis features explained 51.2% of the variance of the reviews and provided an integrated view of the major sentiment and topic related dimensions expressed in game reviews. A Discriminant Function Analysis based on the emerging components classified game reviews into positive, neutral and negative ratings with 55% accuracy.
Abstract:
In this paper we introduce the online version of our ReaderBench framework, which includes multi-lingual comprehension-centered web services designed to address a wide range of individual and collaborative learning scenarios, as follows. First, students can be engaged in reading course material and then eliciting their understanding of it; the reading strategies component provides an in-depth perspective on comprehension processes. Second, students can write an essay or a summary; the automated essay grading component gives them access to more than 200 textual complexity indices covering lexical, syntactic, semantic and discourse structure measurements. Third, students can start discussing in a chat or a forum; the Computer Supported Collaborative Learning (CSCL) component provides in-depth conversation analysis by evaluating each member’s involvement in the CSCL environments. Finally, the sentiment analysis component, together with the semantic models and topic mining components, enables a clearer perspective on learners’ points of view and underlying interests.
Abstract:
Market research is often conducted through conventional methods such as surveys, focus groups and interviews. However, these methods can be costly and time-consuming. This study develops a new method, based on a combination of standard techniques such as sentiment analysis and normalisation, to conduct market research in a manner that is free and quick. The method can be used in many application areas, but this study focuses mainly on the veganism market, identifying vegan food preferences in the form of a profile. Several food words are identified, along with their distribution between positive and negative sentiments in the profile. Surprisingly, non-vegan foods such as cheese, cake, milk, pizza and chicken dominate the profile, indicating that there is a significant market for vegan-suitable alternatives to such foods. Meanwhile, vegan-suitable foods such as coconut, potato, blueberries, kale and tofu also make strong appearances in the profile. Validation is performed by applying the method to Volkswagen vehicle data to identify positive and negative sentiment across five car models. Some results were found to be consistent with sales figures and expert reviews, while others were inconsistent. The reliability of the method is therefore questionable, so the results should be used with caution.
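A profile of food words with their distribution between positive and negative sentiment, as described in this abstract, could look roughly like the sketch below. The study's actual pipeline is not specified here, so everything in the example is an assumption: the tiny word lists, the lexicon-based polarity rule, the sample posts and the function names are hypothetical stand-ins.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicons; a real study would use far larger resources.
POSITIVE = {"love", "great", "delicious"}
NEGATIVE = {"hate", "awful", "miss"}
FOOD_WORDS = {"cheese", "tofu", "kale", "pizza"}


def polarity(tokens):
    """Label a token list by counting positive and negative lexicon hits."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def build_profile(texts):
    """Count food mentions per sentiment, then normalise counts to shares."""
    counts = defaultdict(Counter)
    for text in texts:
        tokens = text.lower().split()
        label = polarity(tokens)
        if label == "neutral":
            continue  # neutral posts do not contribute to the profile
        for food in FOOD_WORDS.intersection(tokens):
            counts[food][label] += 1
    profile = {}
    for food, c in counts.items():
        total = c["positive"] + c["negative"]
        profile[food] = {
            "mentions": total,
            "positive": c["positive"] / total,
            "negative": c["negative"] / total,
        }
    return profile


# Hypothetical posts, invented for illustration only.
posts = [
    "i love vegan cheese",
    "i miss real cheese",
    "this tofu is delicious",
    "kale again awful",
]
profile = build_profile(posts)
for food, row in sorted(profile.items()):
    print(food, row)
```

Normalising to shares makes foods with different mention counts comparable, which is one plausible reading of the normalisation step the abstract mentions.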
Abstract:
Intersubjectivity is an important concept in psychology and sociology. It refers to sharing conceptualizations through social interactions in a community and using such shared conceptualizations as a resource to interpret things that happen in everyday life. In this work, we use intersubjectivity as the basis for modeling shared stance and subjectivity for sentiment analysis. We construct an intersubjectivity network which links review writers, the terms they used, and the polarities of those terms. Based on this network model, we propose a method to learn writer embeddings, which are subsequently incorporated into a convolutional neural network for sentiment analysis. Evaluations on the IMDB, Yelp 2013 and Yelp 2014 datasets show that the proposed approach achieves state-of-the-art performance.
Abstract:
Generating personalized movie recommendations for users is a problem that most commonly relies on user-movie ratings. These ratings are generally used either to understand user preferences or to recommend movies that users with similar rating patterns have rated highly. However, movie recommenders are often subject to the Cold-Start problem: new movies have not been rated by anyone, so they will not be recommended to anyone; likewise, the preferences of new users who have not rated any movie cannot be learned. In parallel, Social-Media platforms such as Twitter collect great amounts of user feedback on movies, as these are very popular nowadays. This thesis proposes to explore feedback shared on Twitter to predict the popularity of new movies and to show how it can be used to tackle the Cold-Start problem. It also proposes, at a finer grain, to explore the reputation of directors and actors on IMDb to tackle the Cold-Start problem. To assess these aspects, a Reputation-enhanced Recommendation Algorithm is implemented and evaluated on a crawled IMDb dataset with previous user ratings of old movies, together with Twitter data crawled from January 2014 to March 2014, to recommend 60 movies affected by the Cold-Start problem. Twitter proved to be a strong reputation predictor, and the Reputation-enhanced Recommendation Algorithm improved over several baseline methods. Additionally, the algorithm also proved useful when recommending movies in an extreme Cold-Start scenario, where both new movies and new users are affected by the Cold-Start problem.
Abstract:
This thesis does not set out to focus on the dynamic relationship between Twitter and stock prices, but instead tries to understand whether using relevant information extracted from tweets can improve investors’ stock-picking ability and generate alpha in portfolio choice relative to a benchmark. Despite the short period analyzed, it gives promising results: the sentiment analysis performed by Social Market Analytics Inc., applied to an equity portfolio, is able to generate positive abnormal returns that are statistically significant both in and out of sample.
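One simple way to check for abnormal returns relative to a benchmark, as discussed in this abstract, is to test whether the mean excess return differs from zero. The sketch below assumes that benchmark-adjusted definition; the thesis may use a different model (for example a factor regression), and the return series and function name are hypothetical.

```python
import math
import statistics


def abnormal_return_test(portfolio_returns, benchmark_returns):
    """Return (mean excess return, t-statistic) for paired period returns."""
    excess = [p - b for p, b in zip(portfolio_returns, benchmark_returns)]
    mean_excess = statistics.mean(excess)
    # Standard error of the mean, using the sample standard deviation.
    std_error = statistics.stdev(excess) / math.sqrt(len(excess))
    return mean_excess, mean_excess / std_error


# Hypothetical weekly returns, invented for illustration only.
portfolio = [0.012, 0.004, -0.003, 0.009, 0.006]
benchmark = [0.010, 0.001, -0.006, 0.008, 0.003]
mean_excess, t_stat = abnormal_return_test(portfolio, benchmark)
print(mean_excess, t_stat)
```

The t-statistic would then be compared against a critical value for the chosen significance level and sample size; with only a handful of periods, as in this toy series, such a test has little power.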