848 results for opinion mining
Abstract:
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework called joint sentiment-topic (JST) model based on latent Dirichlet allocation (LDA), which detects sentiment and topic simultaneously from text. A reparameterized version of the JST model called Reverse-JST, obtained by reversing the sequence of sentiment and topic generation in the modeling process, is also studied. Although JST is equivalent to Reverse-JST without a hierarchical prior, extensive experiments show that when sentiment priors are added, JST performs consistently better than Reverse-JST. Besides, unlike supervised approaches to sentiment classification which often fail to produce satisfactory performance when shifting to other domains, the weakly supervised nature of JST makes it highly portable to other domains. This is verified by the experimental results on data sets from five different domains where the JST model even outperforms existing semi-supervised approaches in some of the data sets despite using no labeled documents. Moreover, the topics and topic sentiment detected by JST are indeed coherent and informative. We hypothesize that the JST model can readily meet the demand of large-scale sentiment analysis from the web in an open-ended fashion.
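The difference between JST and Reverse-JST described above is the order in which a sentiment label and a topic are drawn for each word. The following is a minimal sketch of that generative order only, not the authors' implementation: the abstract does not list the distributions, so the Dirichlet/categorical structure is assumed from the LDA basis, and the function names, hyperparameter values, and sizes are hypothetical.

```python
import random


def dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def categorical(probs, rng):
    """Draw an index according to the given probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(n_words, n_sentiments, n_topics, vocab_size,
                      reverse=False, seed=0):
    """Generate (sentiment, topic, word) triples for one synthetic document.

    reverse=False: sentiment first, then topic (JST-style order).
    reverse=True:  topic first, then sentiment (Reverse-JST-style order).
    """
    rng = random.Random(seed)
    # Hypothetical symmetric hyperparameters, chosen only for illustration.
    alpha, beta, gamma = 0.5, 0.5, 1.0

    # One word distribution per (sentiment, topic) pair.
    phi = {(l, z): dirichlet([beta] * vocab_size, rng)
           for l in range(n_sentiments) for z in range(n_topics)}

    if not reverse:
        # Document-level sentiment proportions, then topics per sentiment.
        pi = dirichlet([gamma] * n_sentiments, rng)
        theta = [dirichlet([alpha] * n_topics, rng) for _ in range(n_sentiments)]
    else:
        # Document-level topic proportions, then sentiments per topic.
        theta = dirichlet([alpha] * n_topics, rng)
        pi = [dirichlet([gamma] * n_sentiments, rng) for _ in range(n_topics)]

    words = []
    for _ in range(n_words):
        if not reverse:
            l = categorical(pi, rng)
            z = categorical(theta[l], rng)
        else:
            z = categorical(theta, rng)
            l = categorical(pi[z], rng)
        w = categorical(phi[(l, z)], rng)
        words.append((l, z, w))
    return words
```

Calling `generate_document(20, 3, 4, 50)` and `generate_document(20, 3, 4, 50, reverse=True)` yields documents drawn under the two orderings. Inference, which is the hard part of either model, is not shown.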
Abstract:
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA), called joint sentiment/topic model (JST), which detects sentiment and topic simultaneously from text. Unlike other machine learning approaches to sentiment classification, which often require labeled corpora for classifier training, the proposed JST model is fully unsupervised. The model has been evaluated on the movie review dataset to classify review sentiment polarity, and minimal prior information has also been explored to further improve sentiment classification accuracy. Preliminary experiments have shown promising results achieved by JST.
Abstract:
Microposts are small fragments of social media content that have been published using a lightweight paradigm (e.g. Tweets, Facebook likes, foursquare check-ins). Microposts have been used for a variety of applications (e.g., sentiment analysis, opinion mining, trend analysis), by gleaning useful information, often using third-party concept extraction tools. There has been very large uptake of such tools in the last few years, along with the creation and adoption of new methods for concept extraction. However, the evaluation of such efforts has been largely consigned to document corpora (e.g. news articles), questioning the suitability of concept extraction tools and methods for Micropost data. This report describes the Making Sense of Microposts Workshop (#MSM2013) Concept Extraction Challenge, hosted in conjunction with the 2013 World Wide Web conference (WWW'13). The Challenge dataset comprised a manually annotated training corpus of Microposts and an unlabelled test corpus. Participants were set the task of engineering a concept extraction system for a defined set of concepts. Out of a total of 22 complete submissions 13 were accepted for presentation at the workshop; the submissions covered methods ranging from sequence mining algorithms for attribute extraction to part-of-speech tagging for Micropost cleaning and rule-based and discriminative models for token classification. In this report we describe the evaluation process and explain the performance of different approaches in different contexts.
Abstract:
Purpose: The push to widen participation in public consultation suggests social media as an additional mechanism through which to engage the public. Bioenergy companies need to build their capacity to communicate in these new media and to monitor the attitudes of the public and opposition organisations towards energy development projects. Design/methodology/approach: This short paper outlines the planning issues bioenergy developments face and the main methods of communication used in the public consultation process in the UK. The potential role of social media in communication with stakeholders is identified. The capacity of sentiment analysis to mine opinions from social media is summarised, and illustrated using a sample of tweets containing the term ‘bioenergy’. Findings: Social media have the potential to improve information flows between stakeholders and developers. Sentiment analysis is a viable methodology, which bioenergy companies should be using to measure public opinion in the consultation process. Preliminary analysis shows promising results.
Research limitations/implications: Analysis is preliminary and based on a small dataset. It is intended only to illustrate the potential of sentiment analysis and not to draw general conclusions about the bioenergy sector. Originality/value: Opinion mining, though established in marketing and political analysis, is not yet systematically applied as a planning consultation tool. This is a missed opportunity.
Abstract:
Postprint
Abstract:
Opinion mining and sentiment analysis are important research areas of Natural Language Processing (NLP), and their tools have become viable alternatives for automatically extracting the affective information found in texts. Our aim is to build an NLP model to analyze gamers’ sentiments and opinions expressed in a corpus of 9750 game reviews. A Principal Component Analysis using sentiment analysis features explained 51.2% of the variance of the reviews and provides an integrated view of the major sentiment- and topic-related dimensions expressed in game reviews. A Discriminant Function Analysis based on the emerging components classified game reviews into positive, neutral and negative ratings with 55% accuracy.
Abstract:
The large number of opinions generated by online users has brought the former “word of mouth” into the virtual world. Besides being numerous, the useful reviews are mixed with a large number of fraudulent, incomplete or duplicate reviews. How, then, can we find the features that influence the number of votes an opinion receives, and so find useful reviews? The literature on opinion mining offers several studies and techniques for analyzing properties found in the text of reviews. This paper applies a methodology for evaluating the usefulness of opinions, with the aim of identifying which characteristics most influence the number of votes: basic utility (e.g., ratings of the product and/or service, date of publication), textual (e.g., length of words, paragraphs) and semantic (e.g., the meaning of the words in the text). The evaluation was performed on a database extracted from TripAdvisor with opinions about hotels written in Portuguese. Results show that users pay more attention to recent opinions with higher scores for the value and location of the hotel and with the lowest scores for sleep quality, service and cleanliness. Texts with positive opinions, short words, and few adjectives and adverbs have a higher chance of receiving more votes.
Abstract:
The growing availability and popularity of opinion-rich online resources, such as review sites and personal blogs, has made it convenient to find out about the opinions and experiences of ordinary people. At the same time, this flood of data has made it difficult to reach a conclusion. In this thesis, I develop a novel recommendation system, Recomendr, that can help users digest all the reviews about an entity and compare candidate entities based on ad-hoc dimensions specified by keywords. It takes keyword-specified ad-hoc dimensions/features as input from the user and, based on those features, compares the selected range of entities using reviews from the related User Generated Content (UGC), e.g. online reviews. It then rates the textual stream of data using a scoring function and returns a decision based on the aggregate opinion to the user. Evaluation of Recomendr using a data set in the laptop domain shows that it can effectively recommend the best laptop according to user-specified dimensions such as price. Recomendr is a general system that can potentially work for any entities for which online reviews or opinionated text are available.
Abstract:
Opinion mining, or sentiment analysis, is a type of text analysis that aims to support decision-making by extracting and analyzing opinions, identifying positive, negative and neutral opinions, and measuring their impact on the perception of a topic. This work proposes a dictionary-based sentiment analysis model that uses the semantics and the semantic patterns of the text to be classified to determine its polarity on the social network Twitter. The input data set for the system consists of public data obtained from Twitter about telecommunications companies operating in the Spanish market.
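A dictionary-based polarity classifier of the kind described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the model from the paper: the lexicon entries, their weights, and the single negation rule (a stand-in for one "semantic pattern") are all hypothetical.

```python
import re

# Hypothetical miniature sentiment dictionary: word -> polarity weight.
LEXICON = {"good": 1, "great": 2, "fast": 1, "bad": -1, "slow": -1, "terrible": -2}
# Hypothetical negation words; each flips only the immediately following token.
NEGATORS = {"not", "no", "never"}


def polarity(text):
    """Classify text as 'positive', 'negative' or 'neutral' from lexicon scores."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        value = LEXICON.get(tok, 0)
        score += -value if negate else value
        negate = False  # negation covers one token only
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("The network is great and fast"))
print(polarity("The service is not good"))
```

A real system for tweets would need a much larger lexicon in the target language and richer patterns (intensifiers, emoticons, hashtags). The sketch shows only the scoring skeleton.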
Abstract:
This article explores two matrix methods to induce the “shades of meaning” (SoM) of a word. A matrix representation of a word is computed from a corpus of traces based on the given word. Non-negative Matrix Factorisation (NMF) and Singular Value Decomposition (SVD) compute a set of vectors, each corresponding to a potential shade of meaning. The two methods were evaluated based on loss of conditional entropy with respect to two sets of manually tagged data. One set reflects concepts generally appearing in text, and the second set comprises words used for investigations into word sense disambiguation. Results show that NMF consistently outperforms SVD for inducing both SoM of general concepts and word senses. The problem of inducing the shades of meaning of a word is more subtle than that of word sense induction and hence relevant to thematic analysis of opinion, where nuances of opinion can arise.
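The abstract does not define its "loss of conditional entropy" measure precisely. One plausible reading, assumed here, is the reduction from the entropy of the manual tags, H(T), to their entropy once the induced shade is known, H(T | C). The sketch below computes that quantity. The function names and toy data are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(tags, clusters):
    """H(tags | clusters): entropy of the manual tags within each induced cluster,
    weighted by cluster size."""
    n = len(tags)
    groups = {}
    for tag, cluster in zip(tags, clusters):
        groups.setdefault(cluster, []).append(tag)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def entropy_reduction(tags, clusters):
    """Assumed evaluation score: H(tags) - H(tags | clusters); larger is better."""
    return entropy(tags) - conditional_entropy(tags, clusters)


# Toy data: four traces with manual tags and two candidate clusterings.
tags = ["finance", "finance", "river", "river"]
print(entropy_reduction(tags, [0, 0, 1, 1]))  # clusters match the tags
print(entropy_reduction(tags, [0, 1, 0, 1]))  # clusters carry no information
```

Under this reading, a factorisation whose induced vectors group traces consistently with the manual tags scores higher, which gives a way to compare NMF and SVD on the same tagged data.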
Abstract:
This paper uses innovative content analysis techniques to map how the death of Oscar Pistorius' girlfriend, Reeva Steenkamp, was framed in Twitter conversations. Around 1.5 million posts from a two-week timeframe are analyzed with a combination of syntactic and semantic methods. This analysis is grounded in the frame analysis perspective and differs from sentiment analysis. Instead of looking for explicit evaluations, such as “he is guilty” or “he is innocent”, we showcase through the results how opinions can be identified by complex articulations of more implicit symbolic devices, such as examples and metaphors that are repeatedly mentioned. Different frames are adopted by users as more information about the case is revealed: from a more episodic one, highly used in the very beginning, to more systemic approaches, highlighting the association of the event with urban violence, gun control issues, and violence against women. A detailed timeline of the discussions is provided.
Abstract:
This paper examines the debate surrounding a recent decision made by the Ghanaian government to permit gold exploration - and potentially, mining - in 'protected' forest reserves. In 2001, four mining companies were awarded mineral exploration concessions in forested regions of the country, and have since put forward applications to mine for gold. Notwithstanding the sharp divide in opinion on the issue, the continued uncertainty surrounding the implications of the proposed activities makes further research on the ground imperative in the short term. Work aiming to elicit indigenous perspectives on the projects, as well as research that facilitates dialogue between and/or among stakeholder parties, should be prioritized.
Abstract:
This article critically explores the nature and purpose of relationships and inter-dependencies between stakeholders in the context of a parastatal chromite mining company in the Betsiboka Region of Northern Madagascar. An examination of the institutional arrangements at the interface between the mining company and local communities identified power hierarchies and dependencies in the context of a dominant paternalistic environment. The interactions, inter alia, limited social cohesion and intensified the fragility and weakness of community representation, which was further influenced by ethnic hierarchies between the varied community groups; namely, indigenous communities and migrants to the area from different ethnic groups. Moreover, dependencies and nepotism, which may exist at all institutional levels, can create civil society stakeholder representatives who are unrepresentative of the society they are intended to represent. Similarly, a lack of horizontal and vertical trust and reciprocity inherent in Malagasy society engenders a culture of low expectations regarding transparency and accountability, which further catalyses a cycle of nepotism and elite rent-seeking behaviour. On the other hand, leaders retain power with minimal vertical delegation or decentralisation of authority among levels of government and limit opportunities to benefit the elite, perpetuating rent-seeking behaviour within the privileged minority. Within the union movement, pluralism and the associated politicisation of individual unions restricts solidarity, which impacts on the movement’s capacity to act as a cohesive body of opinion and opposition. Nevertheless, the unions’ drive to improve their social capital has increased expectations of transparency and accountability, resulting in demands for greater engagement in decision-making processes.
Abstract:
Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are becoming more interested in, and reliant on, social networks for information, news and the opinions of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues, namely size, noise and dynamism. These issues often make social network data very complex to analyse manually, which makes computational means of analysis pertinent. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, from massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of the social network over the decades, from the historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, including the tools employed and the names of their authors.