49 resultados para Sentiment Analysis, Opinion Mining, Twitter
em CentAUR: Central Archive University of Reading - UK
Resumo:
Aircraft Maintenance, Repair and Overhaul (MRO) feedback commonly includes an engineer’s complex text-based inspection report. Capturing and normalizing the content of these textual descriptions is vital to cost and quality benchmarking, and provides information to facilitate continuous improvement of MRO process and analytics. As data analysis and mining tools requires highly normalized data, raw textual data is inadequate. This paper offers a textual-mining solution to efficiently analyse bulk textual feedback data. Despite replacement of the same parts and/or sub-parts, the actual service cost for the same repair is often distinctly different from similar previously jobs. Regular expression algorithms were incorporated with an aircraft MRO glossary dictionary in order to help provide additional information concerning the reason for cost variation. Professional terms and conventions were included within the dictionary to avoid ambiguity and improve the outcome of the result. Testing results show that most descriptive inspection reports can be appropriately interpreted, allowing extraction of highly normalized data. This additional normalized data strongly supports data analysis and data mining, whilst also increasing the accuracy of future quotation costing. This solution has been effectively used by a large aircraft MRO agency with positive results.
Resumo:
The Twitter network has been labelled the most commonly used microblogging application around today. With about 500 million estimated registered users as of June, 2012, Twitter has become a credible medium of sentiment/opinion expression. It is also a notable medium for information dissemination; including breaking news on diverse issues since it was launched in 2007. Many organisations, individuals and even government bodies follow activities on the network in order to obtain knowledge on how their audience reacts to tweets that affect them. We can use postings on Twitter (known as tweets) to analyse patterns associated with events by detecting the dynamics of the tweets. A common way of labelling a tweet is by including a number of hashtags that describe its contents. Association Rule Mining can find the likelihood of co-occurrence of hashtags. In this paper, we propose the use of temporal Association Rule Mining to detect rule dynamics, and consequently dynamics of tweets. We coined our methodology Transaction-based Rule Change Mining (TRCM). A number of patterns are identifiable in these rule dynamics including, new rules, emerging rules, unexpected rules and ?dead' rules. Also the linkage between the different types of rule dynamics is investigated experimentally in this paper.
Resumo:
Social network has gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook LinkedIn and Google+ through the internet and the web 2.0 technologies has become more affordable. People are becoming more interested in and relying on social network for information, news and opinion of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues namely; size, noise and dynamism. These issues often make social network data very complex to analyse manually, resulting in the pertinent use of computational means of analysing them. Data mining provides a wide range of techniques for detecting useful knowledge from massive datasets like trends, patterns and rules [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of the social network over decades going from the historical techniques to the up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in the Table.1 including the tools employed as well as names of their authors.
Resumo:
This paper considers the use of Association Rule Mining (ARM) and our proposed Transaction based Rule Change Mining (TRCM) to identify the rule types present in tweet’s hashtags over a specific consecutive period of time and their linkage to real life occurrences. Our novel algorithm was termed TRCM-RTI in reference to Rule Type Identification. We created Time Frame Windows (TFWs) to detect evolvement statuses and calculate the lifespan of hashtags in online tweets. We link RTI to real life events by monitoring and recording rule evolvement patterns in TFWs on the Twitter network.
Resumo:
This paper examines the dynamics of the ongoing conflict in Prestea, Ghana, where indigenous galamsey mining groups are operating illegally on a concession awarded to Bogoso Gold Limited (BGL), property of the Canadian-listed multinational Gold Star Resources. Despite being issued firm orders by the authorities to abandon their activities, galamsey leaders maintain that they are working areas of the concession that are of little interest to the company; they further counter that there are few alternative sources of local employment, which is why they are mining in the first place. Whilst the Ghanaian Government is in the process of setting aside plots to relocate illegal mining parties and is developing alternative livelihood projects, efforts are far from encouraging: in addition to a series of overlooked logistical problems, the areas earmarked for relocation have not yet been prospected to ascertain gold content, and the alternative income-earning activities identified are inappropriate. As has been the case throughout mineral-rich sub-Saharan Africa, the conflict in Prestea has come about largely because the national mining sector reform program, which prioritizes the expansion of predominantly foreign-controlled large-scale projects, has neglected the concerns of indigenous subsistence groups.
Resumo:
Background: Since their inception, Twitter and related microblogging systems have provided a rich source of information for researchers and have attracted interest in their affordances and use. Since 2009 PubMed has included 123 journal articles on medicine and Twitter, but no overview exists as to how the field uses Twitter in research. // Objective: This paper aims to identify published work relating to Twitter indexed by PubMed, and then to classify it. This classification will provide a framework in which future researchers will be able to position their work, and to provide an understanding of the current reach of research using Twitter in medical disciplines. Limiting the study to papers indexed by PubMed ensures the work provides a reproducible benchmark. // Methods: Papers, indexed by PubMed, on Twitter and related topics were identified and reviewed. The papers were then qualitatively classified based on the paper’s title and abstract to determine their focus. The work that was Twitter focused was studied in detail to determine what data, if any, it was based on, and from this a categorization of the data set size used in the studies was developed. Using open coded content analysis additional important categories were also identified, relating to the primary methodology, domain and aspect. // Results: As of 2012, PubMed comprises more than 21 million citations from biomedical literature, and from these a corpus of 134 potentially Twitter related papers were identified, eleven of which were subsequently found not to be relevant. There were no papers prior to 2009 relating to microblogging, a term first used in 2006. Of the remaining 123 papers which mentioned Twitter, thirty were focussed on Twitter (the others referring to it tangentially). The early Twitter focussed papers introduced the topic and highlighted the potential, not carrying out any form of data analysis. The majority of published papers used analytic techniques to sort through thousands, if not millions, of individual tweets, often depending on automated tools to do so. Our analysis demonstrates that researchers are starting to use knowledge discovery methods and data mining techniques to understand vast quantities of tweets: the study of Twitter is becoming quantitative research. // Conclusions: This work is to the best of our knowledge the first overview study of medical related research based on Twitter and related microblogging. We have used five dimensions to categorise published medical related research on Twitter. This classification provides a framework within which researchers studying development and use of Twitter within medical related research, and those undertaking comparative studies of research relating to Twitter in the area of medicine and beyond, can position and ground their work.
Resumo:
The General Election for the 56th United Kingdom Parliament was held on 7 May 2015. Tweets related to UK politics, not only those with the specific hashtag ”#GE2015”, have been collected in the period between March 1 and May 31, 2015. The resulting dataset contains over 28 million tweets for a total of 118 GB in uncompressed format or 15 GB in compressed format. This study describes the method that was used to collect the tweets and presents some analysis, including a political sentiment index, and outlines interesting research directions on Big Social Data based on Twitter microblogging.
Resumo:
We present a method to enhance fault localization for software systems based on a frequent pattern mining algorithm. Our method is based on a large set of test cases for a given set of programs in which faults can be detected. The test executions are recorded as function call trees. Based on test oracles the tests can be classified into successful and failing tests. A frequent pattern mining algorithm is used to identify frequent subtrees in successful and failing test executions. This information is used to rank functions according to their likelihood of containing a fault. The ranking suggests an order in which to examine the functions during fault analysis. We validate our approach experimentally using a subset of Siemens benchmark programs.
Resumo:
This paper provides an extended analysis of the tensions that have surfaced between large-scale mine operators and artisanal miners in gold-rich areas of rural Tanzania. The literature on grievance is used to contextualise, these disputes, the underlying cause of which is artisanal miners' mounting frustration over not being able to secure viable concessions to work. Newly implemented legislation has, for the most part, empowered foreign large-scale mine operators, while simultaneously disempowering indigenous small-scale miners. In many cases, the former have addressed mounting security and community problems on their own. Until the country's major mine operators extend assistance to marginalised small-scale mining groups, the likelihood of violent conflict unfolding between these parties will increase.
Resumo:
This paper critiques contemporary research and policy approaches taken toward the analysis and abatement of mercury pollution in the small-scale gold mining sector. Unmonitored releases of mercury from gold amalgamation have caused considerable environmental contamination and human health complications in rural reaches of sub-Saharan Africa, Latin America and Asia. Whilst these problems have caught the attention of the scientific community over the past 15-20 years, the research that has since been undertaken has failed to identify appropriate mitigation measures, and has done little to advance understanding of why contamination persists. Moreover, the strategies used to educate operators about the impacts of acute mercury exposure, and the technologies implemented to prevent farther pollution, have been marginally effective at best. The mercury pollution problem will not be resolved until governments and donor agencies commit to carrying out research aimed at improving understanding of the dynamics of small scale gold mining communities. Acquisition of this knowledge is the key to designing and implementing appropriate support and abatement measures. (c) 2005 Elsevier B.V. All rights reserved.
Resumo:
This paper contributes to a growing body of literature that critically examines how mining companies are embracing community development challenges in developing countries, drawing on experiences from Ghana. Despite receiving considerable praise from the donor and industry communities, the actions being taken by Ghana's major mining companies to foster community development are facilitating few improvements in the rural regions where activities take place. Companies are generally implementing community development programmes that are incapable of alleviating rural hardship and are coordinating destructive displacement exercises. The analysis serves as a stark reminder that mining companies are not charities and engage with African countries strictly for commercial purposes.
Resumo:
This paper provides an extended analysis of the child labor problem in the artisanal and small-scale mining (ASM) sector, focusing specifically on the situation in sub-Saharan Africa. In recent years, the issue of child labor in ASM has garnered significant attention from the International Labor Organization (ILO), which has been particularly active in raising public awareness of the problem; and, has proceeded to implement policies and collaborative project work aimed at Curtailing children's participation in ASM activities in a number of African countries. The analysis concludes with a critical appraisal of an ILO project recently launched in the Talensi-Nabdam District in the Upper East Region of Ghana, which sheds light on how the child labor problem is being tackled in practice in ASM communities in sub-Saharan Africa. (c) 2008 Elsevier Ltd. All rights reserved.