34 results for Sentiment Analysis Opinion Mining Text Mining Twitter


Relevance:

100.00%

Publisher:

Abstract:

In product reviews, the distribution of polarity ratings across reviews written by different users or about different products is often skewed in the real world. As such, incorporating user and product information is helpful for sentiment classification of reviews. However, existing approaches ignore the temporal nature of reviews posted by the same user or written about the same product. We argue that the temporal relations between reviews are potentially useful for learning user and product embeddings, and we therefore propose employing a sequence model to embed these temporal relations into user and product representations so as to improve the performance of document-level sentiment analysis. Specifically, we first learn a distributed representation of each review with a one-dimensional convolutional neural network. Then, taking these representations as pretrained vectors, we use a recurrent neural network with gated recurrent units to learn distributed representations of users and products. Finally, we feed the user, product and review representations into a machine learning classifier for sentiment classification. Our approach has been evaluated on three large-scale review datasets from IMDB and Yelp. Experimental results show that: (1) sequence modeling for distributed user and product representation learning can improve the performance of document-level sentiment classification; (2) the proposed approach achieves state-of-the-art results on these benchmark datasets.
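The sequence-modeling step described in this abstract can be illustrated with a minimal sketch. It assumes each review has already been reduced to a fixed-length vector (the abstract uses a one-dimensional CNN for that step) and folds a user's reviews, in timestamp order, through a single gated recurrent unit. The class `GRUCell`, the function `embed_user`, the omission of bias terms, and the random untrained weights are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Hypothetical minimal GRU cell (no biases, random untrained weights)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, h = candidate state.
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrh"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], reset_h))]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def embed_user(reviews, cell):
    """Fold (timestamp, review_vector) pairs into one state, oldest first."""
    h = [0.0] * cell.hidden_dim
    for _, vec in sorted(reviews, key=lambda item: item[0]):
        h = cell.step(vec, h)
    return h


# Illustrative data: two reviews by one user, given out of order.
cell = GRUCell(input_dim=3, hidden_dim=2)
reviews = [(2, [0.1, 0.2, 0.3]), (1, [0.3, 0.1, 0.0])]
user_vector = embed_user(reviews, cell)
```

Sorting by timestamp before folding is what lets the final hidden state reflect the temporal order of the reviews; the same routine would apply to the reviews of a single product.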

Relevance:

100.00%

Publisher:

Abstract:

Different types of sentences express sentiment in very different ways. Traditional sentence-level sentiment classification research focuses on one-technique-fits-all solutions or centers on only one special type of sentence. In this paper, we propose a divide-and-conquer approach which first classifies sentences into different types, then performs sentiment analysis separately on the sentences of each type. Specifically, we find that sentences tend to be more complex if they contain more sentiment targets. Thus, we propose to first apply a neural-network-based sequence model to classify opinionated sentences into three types according to the number of targets appearing in a sentence. Each group of sentences is then fed into a one-dimensional convolutional neural network separately for sentiment classification. Our approach has been evaluated on four sentiment classification datasets and compared with a wide range of baselines. Experimental results show that: (1) sentence type classification can improve the performance of sentence-level sentiment analysis; (2) the proposed approach achieves state-of-the-art results on several benchmark datasets.
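The divide-and-conquer routing in this abstract can be sketched as follows. The sketch assumes the three sentence types correspond to zero, one, and two or more targets, which the abstract does not state explicitly. The target lexicon, the keyword-based classifier, and all function names are hypothetical stand-ins for the neural sequence model and the per-type convolutional networks.

```python
# Hypothetical lexicons, used only to make the sketch runnable.
TARGET_WORDS = {"battery", "screen", "camera"}
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def tokens(sentence):
    return [tok.strip(".,!?") for tok in sentence.lower().split()]


def count_targets(sentence):
    """Stand-in for the sequence model that detects sentiment targets."""
    return sum(1 for tok in tokens(sentence) if tok in TARGET_WORDS)


def sentence_type(num_targets):
    """Assumed grouping: no target, one target, multiple targets."""
    if num_targets == 0:
        return "no_target"
    if num_targets == 1:
        return "one_target"
    return "multi_target"


def keyword_classifier(sentence):
    """Stand-in for a per-type sentiment classifier."""
    toks = set(tokens(sentence))
    score = len(toks & POSITIVE) - len(toks & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# One classifier per sentence type; here they happen to share one rule,
# but each slot could hold a separately trained model.
CLASSIFIERS = {
    "no_target": keyword_classifier,
    "one_target": keyword_classifier,
    "multi_target": keyword_classifier,
}


def divide_and_conquer(sentence):
    stype = sentence_type(count_targets(sentence))
    return stype, CLASSIFIERS[stype](sentence)
```

The design point is the dispatch table: the type classifier runs first, and only the classifier registered for that type sees the sentence.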

Relevance:

100.00%

Publisher:

Abstract:

Sentiment analysis has long focused on binary classification of text as either positive or negative. There has been little work on mapping sentiments or emotions into multiple dimensions. This paper studies a Bayesian modeling approach to multi-class sentiment classification and multi-dimensional sentiment distribution prediction. It proposes effective mechanisms to incorporate supervised information, such as labeled feature constraints and document-level sentiment distributions derived from the training data, into model learning. We have evaluated our approach on datasets collected from the confession section of the Experience Project website, where people share their life experiences and personal stories. Our results show that using the latent representation of the training documents derived from our approach as features to build a maximum entropy classifier outperforms other approaches on multi-class sentiment classification. In the more difficult task of multi-dimensional sentiment distribution prediction, our approach gives superior performance compared to a few competitive baselines. © 2012 ACM.

Relevance:

100.00%

Publisher:

Abstract:

With the development of social media tools such as Facebook and Twitter, mainstream media organizations including newspapers and TV media have played an active role in engaging with their audience and strengthening their influence on the recently emerged platforms. In this paper, we analyze the behavior of mainstream media on Twitter and study how they exert their influence to shape public opinion during the UK's 2010 General Election. We first propose an empirical measure to quantify mainstream media bias based on sentiment analysis and show that it correlates better with the actual political bias in the UK media than purely quantitative measures based on media coverage of various political parties. We then compare the information diffusion patterns from different categories of sources. We found that while mainstream media is good at seeding prominent information cascades, its role in shaping public opinion is being challenged by journalists, since tweets from them are more likely to be retweeted, spread faster and have a longer lifespan than tweets from mainstream media. Moreover, the political bias of the journalists is a good indicator of the actual election results. Copyright 2013 ACM.

Relevance:

100.00%

Publisher:

Abstract:

Purpose - The push to widen participation in public consultation suggests social media as an additional mechanism through which to engage the public. Bioenergy companies need to build their capacity to communicate in these new media and to monitor the attitudes of the public and opposition organisations towards energy development projects. Design/methodology/approach - This short paper outlines the planning issues bioenergy developments face and the main methods of communication used in the public consultation process in the UK. The potential role of social media in communication with stakeholders is identified. The capacity of sentiment analysis to mine opinions from social media is summarised, and illustrated using a sample of tweets containing the term ‘bioenergy’. Findings - Social media have the potential to improve information flows between stakeholders and developers. Sentiment analysis is a viable methodology, which bioenergy companies should be using to measure public opinion in the consultation process. Preliminary analysis shows promising results. Research limitations/implications - Analysis is preliminary and based on a small dataset. It is intended only to illustrate the potential of sentiment analysis and not to draw general conclusions about the bioenergy sector. Originality/value - Opinion mining, though established in marketing and political analysis, is not yet systematically applied as a planning consultation tool. This is a missed opportunity.

Relevance:

100.00%

Publisher:

Abstract:

In this poster we present our preliminary work on spammer detection and analysis using 50 active honeypot profiles deployed on the Weibo.com and QQ.com microblogging networks. We separated spammers from legitimate users by manually checking the microblog content of every captured user. From these accounts we built a spammer dataset and a legitimate-user dataset for each social network community. We analyzed and compared several features of the two user classes and found them useful for distinguishing spammers from legitimate users. The following are several initial observations from our analysis of the features of spammers captured on Weibo.com and QQ.com. (1) The following/follower ratio of spammers is usually higher than that of legitimate users. Spammers tend to follow a large number of users in order to gain popularity but always have relatively few followers. (2) There is a big gap between the two classes in the average number of microblogs posted per day. On Weibo.com, spammers post many more microblogs per day than legitimate users do, while on QQ.com spammers post far fewer microblogs than legitimate users. This is mainly due to the different strategies taken by spammers on the two platforms. (3) More spammers choose a cautious spam posting pattern. They mix spam microblogs with ordinary ones so that they can evade the anti-spam mechanisms deployed by the service providers. (4) Aggressive spammers are more likely to be detected, so they tend to have a shorter life, while cautious spammers can live much longer and have a deeper influence on the network. The latter kind of spammer may become the trend among social network spammers. © 2012 IEEE.
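The first two observations rest on two simple per-account features, which could be computed roughly as in the sketch below. The `Account` fields and both function names are hypothetical, and the abstract gives no thresholds, so none are assumed here.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical per-account counters collected by a honeypot profile."""
    following: int    # number of users this account follows
    followers: int    # number of users following this account
    posts: int        # microblogs posted during the observation window
    days_active: int  # length of the observation window in days


def following_follower_ratio(account):
    # Guard against division by zero for accounts with no followers.
    return account.following / max(account.followers, 1)


def posts_per_day(account):
    return account.posts / max(account.days_active, 1)


# Illustrative values only, not taken from the study.
suspect = Account(following=2000, followers=50, posts=300, days_active=30)
ratio = following_follower_ratio(suspect)   # 40.0
rate = posts_per_day(suspect)               # 10.0
```

Because the abstract reports that posting rates differ in opposite directions on Weibo.com and QQ.com, any classifier built on `posts_per_day` would presumably need to be fitted per platform.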

Relevance:

100.00%

Publisher:

Abstract:

We introduce a type of 2-tier convolutional neural network model for learning distributed paragraph representations for a specific task (e.g. paragraph-level or short-document-level sentiment analysis and text topic categorization). We decompose the paragraph semantics into three cascaded constituents: word representation, sentence composition and document composition. Specifically, we learn distributed word representations with a continuous bag-of-words model from a large unstructured text corpus. Then, using these word representations as pre-trained vectors, distributed task-specific sentence representations are learned from a sentence-level corpus with task-specific labels by the first tier of our model. Using these sentence representations as input vectors, distributed paragraph representations are learned from a paragraph-level corpus by the second tier of our model. The model is evaluated on the DBpedia ontology classification dataset and the Amazon review dataset. Empirical results show the effectiveness of our proposed learning model for generating distributed paragraph representations.

Relevance:

100.00%

Publisher:

Abstract:

Purpose - The purpose of this paper is to assess high-dimensional visualisation, combined with pattern matching, as an approach to observing dynamic changes in the ways people tweet about science topics. Design/methodology/approach - The high-dimensional visualisation approach was applied to three scientific topics to test its effectiveness for longitudinal analysis of message framing on Twitter over two disjoint periods in time. The paper uses coding frames to drive categorisation and visual analytics of tweets discussing the science topics. Findings - The findings point to the potential of this mixed methods approach, as it allows sufficiently high sensitivity to recognise and support the analysis of non-trending as well as trending topics on Twitter. Research limitations/implications - Three topics are studied and these illustrate a range of frames, but results may not be representative of all scientific topics. Social implications - Funding bodies increasingly encourage scientists to participate in public engagement. As social media provides an avenue actively utilised for public communication, understanding the nature of the dialogue on this medium is important for the scientific community and the public at large. Originality/value - This study differs from standard approaches to the analysis of microblog data, which tend to focus on machine-driven analysis of large-scale datasets. It provides evidence that this approach enables practical and effective analysis of the content of midsize to large collections of microposts.

Relevance:

100.00%

Publisher:

Abstract:

Existing approaches to social influence analysis usually focus on how to develop effective algorithms to quantify users' influence scores. They rarely consider a person's expertise level, which is arguably important to influence measures. In this paper, we propose a computational approach to measuring the correlation between expertise and social media influence, and we take a new perspective to understand social media influence by incorporating expertise into influence analysis. We carefully constructed a large dataset of 13,684 Chinese celebrities from Sina Weibo (literally 'Sina microblogging'). We found that there is a strong correlation between expertise levels and social media influence scores. In addition, different expertise levels showed different influence variation patterns: high-expertise celebrities have stronger influence on the 'audience' in their expertise domains.
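The abstract does not say which correlation coefficient was used. As one possible illustration, the sketch below computes Pearson's r between an expertise score and an influence score per user; the choice of Pearson's r and the function name `pearson` are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Illustrative scores only, not taken from the Sina Weibo dataset.
expertise = [1.0, 2.0, 3.0, 4.0]
influence = [10.0, 20.0, 30.0, 40.0]
r = pearson(expertise, influence)
```

A rank-based coefficient such as Spearman's would be a reasonable alternative when influence scores are heavily skewed, as follower counts often are.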

Relevance:

100.00%

Publisher:

Abstract:

Intersubjectivity is an important concept in psychology and sociology. It refers to sharing conceptualizations through social interactions in a community and using such shared conceptualizations as a resource to interpret things that happen in everyday life. In this work, we use intersubjectivity as the basis to model shared stance and subjectivity for sentiment analysis. We construct an intersubjectivity network which links review writers, the terms they use, and the polarities of those terms. Based on this network model, we propose a method to learn writer embeddings, which are subsequently incorporated into a convolutional neural network for sentiment analysis. Evaluations on the IMDB, Yelp 2013 and Yelp 2014 datasets show that the proposed approach achieves state-of-the-art performance.

Relevance:

100.00%

Publisher:

Abstract:

Since the transfer of a message between two cultures very frequently takes place through the medium of a written text qua communicative event, it would seem useful to attempt to ascertain whether there is any kind of pattern in the use of strategies for the effective interlingual transfer of this message. Awareness of potentially successful strategies, within the constraints of context, text type, intended TL function and TL reader profile, will enhance quality and cost-effectiveness (time, effort, financial costs) in the production of the target text. Through contrastive analysis of pairs of advertising texts, SL and TL, French and English, this study will attempt to identify the nature of some recurring choices made by different translators in the attempt to recreate ST information in the TL in such a manner as to reproduce as closely as possible the informative, persuasive and affective functions of the text as advertising material. Whilst recurrence may be seen to be significant in terms of illustrating tendencies with regard to the solution of problems of translation, this would not necessarily be taken as confirmation of the existence of pre-determined or prescriptive rules. These tendencies could, however, be taken as a guide to potential solutions to certain kinds of context-bound and text-type-specific problems. Analysis of translated text-pairs taken from the field of advertising should produce examples of constraints posed by the need to select the content, tone and form of the Target Text, in order to ensure maximum efficacy of persuasive effect and to ensure the desired outcome, as determined by the Source Text function. When evaluating the success of a translated advertising text, constraints could be defined in terms of the culture-specific references or assumptions on which a Source Text may build in order to achieve its intended communicative function within the target community.

Relevance:

60.00%

Publisher:

Abstract:

To date, more than 16 million citations of published articles in the biomedical domain are available in the MEDLINE database. These articles describe the new discoveries that have accompanied the tremendous development of biomedicine during the last decade. It is crucial for biomedical researchers to retrieve and mine specific knowledge from this huge quantity of published articles with high efficiency. Researchers have been engaged in the development of text mining tools to find knowledge, such as protein-protein interactions, that is most relevant and useful for specific analysis tasks. This chapter provides a road map to the various information extraction methods in the biomedical domain, such as protein name recognition and discovery of protein-protein interactions. Disciplines involved in analyzing and processing unstructured text are summarized. Current work in biomedical information extraction is categorized. Challenges in the field are also presented and possible solutions are discussed.

Relevance:

50.00%

Publisher:

Abstract:

This thesis is concerned with certain aspects of the Public Inquiry into the accident at Houghton Main Colliery in June 1975. It examines whether prior to the accident there existed at the Colliery a situation in which too much reliance was being placed upon state regulation and too little upon personal responsibility. I study the phenomenon of state regulation. This is done (a) by analysis of selected writings on state regulation/intervention/interference/bureaucracy (the words are used synonymously) over the last two hundred years, specifically those of Marx on the 1866 Committee on Mines, and (b) by studying Chadwick and Tremenheere, leading and contrasting "bureaucrats" of the mid-nineteenth century. The bureaucratisation of the mining industry over the period 1835-1954 is described, and it is demonstrated that the industry obtained and now possesses those characteristics outlined by Max Weber in his model of bureaucracy. I analyse criticisms of the model and find them to be relevant, in that they facilitate understanding both of the circumstances of the accident and of the Inquiry. Further understanding of the circumstances and causes of the accident was gained by attendance at the Inquiry and by interviewing many of those involved in the Inquiry. I analyse many aspects of the Inquiry - its objectives, structure, procedure and conflicting interests - and find that, although the Inquiry had many of the symbols of bureaucracy, it suffered not from "too much" outside interference, but rather from the coal mining industry's shared belief in its ability to solve its own problems. I found nothing to suggest that, prior to the accident, colliery personnel relied, or were encouraged to rely, "too much" upon state regulation.

Relevance:

50.00%

Publisher:

Abstract:

A study of information available on the settlement characteristics of backfill in restored opencast coal mining sites and other similar earthworks projects has been undertaken. In addition, the methods of opencast mining, compaction controls, monitoring and test methods have been reviewed. To consider and develop the methods of predicting the settlement of fill, three sites in the West Midlands have been examined; at each, the backfill had been placed in a controlled manner. In addition, use has been made of a finite element computer program to compare a simple two-dimensional linear elastic analysis with field observations of surface settlements in the vicinity of buried highwalls. On controlled backfill sites, settlement predictions have been accurately made, based on a linear relationship between settlement (expressed as a percentage of fill height) and the logarithm of time. This 'creep' settlement was found to be effectively complete within 18 months of restoration. A decrease in this percentage settlement was observed with increasing fill thickness; this is believed to be related to the speed with which the backfill is placed. A rising water table within the backfill is indicated to cause additional gradual settlement. A prediction method, based on settlement monitoring, has been developed and used to determine the pattern of settlement across highwalls and buried highwalls. The zone of appreciable differential settlement was found to be mainly limited to the highwall area; the magnitude was dictated by the highwall inclination. With a backfill cover of about 15 metres over a buried highwall, the magnitude of differential settlement was negligible. Use has been made of the proposed settlement prediction method and monitoring to control the re-development of restored opencast sites. The specifications, tests and monitoring techniques developed in recent years have been used to aid this. Such techniques have been valuable in restoring land previously derelict due to past underground mining.
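The linear relationship between percentage settlement and the logarithm of time can be written as s(t) = a + b log10(t) and fitted by ordinary least squares, as in the sketch below. The base-10 logarithm, the function names, and the monitoring values are assumptions for illustration; the abstract gives no coefficients or data.

```python
import math


def fit_log_linear(times, settlement_pct):
    """Least-squares fit of settlement_pct = a + b * log10(time)."""
    if len(times) != len(settlement_pct) or len(times) < 2:
        raise ValueError("need at least two (time, settlement) pairs")
    xs = [math.log10(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(settlement_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("times must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, settlement_pct))
    b = sxy / sxx            # settlement per tenfold increase in time
    a = mean_y - b * mean_x  # settlement at t = 1
    return a, b


def predict_settlement(a, b, time):
    return a + b * math.log10(time)


# Hypothetical monitoring record: days since restoration and settlement
# as a percentage of fill height.
days = [10, 100, 1000]
observed = [0.4, 0.7, 1.0]
a, b = fit_log_linear(days, observed)
```

With the fitted coefficients, `predict_settlement(a, b, t)` extrapolates the creep settlement to a later time t, which is how monitoring data could feed a prediction of the kind the abstract describes.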

Relevance:

50.00%

Publisher:

Abstract:

In this paper we present the design and analysis of an intonation model for text-to-speech (TTS) synthesis applications using a combination of Relational Tree (RT) and Fuzzy Logic (FL) technologies. The model is demonstrated using the Standard Yorùbá (SY) language. In the proposed intonation model, phonological information extracted from text is converted into an RT. An RT is a sophisticated data structure that symbolically represents the peaks and valleys, as well as the spatial structure, of a waveform in the form of a tree. An initial approximation to the RT, called the Skeletal Tree (ST), is first generated algorithmically. The exact numerical values of the peaks and valleys on the ST are then computed using FL. Quantitative analysis of the result gives an RMSE of 0.56 and 0.71 for peak and valley respectively. Mean Opinion Scores (MOS) of 9.5 and 6.8, on a scale of 1-10, were obtained for intelligibility and naturalness respectively.
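The RMSE figures quoted for peaks and valleys follow the standard root-mean-square-error definition, which can be sketched as below. The function name and the sample values are illustrative assumptions; the abstract does not report the underlying predicted and observed values.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical peak values: model output versus measured contour.
predicted_peaks = [0.0, 0.0]
measured_peaks = [3.0, 4.0]
error = rmse(predicted_peaks, measured_peaks)
```

Computing the metric separately for peaks and for valleys, as the abstract does, shows whether the model is more accurate at one kind of turning point than the other.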