14 results for Sentiment Analysis, Opinion Mining, Twitter

in Digital Commons at Florida International University


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Over the last decade, many social media services have emerged and become widely used in daily life as important tools for sharing and acquiring information. Given the substantial amount of user-contributed text on social media, methods and tools for analyzing this emerging data are needed in order to deliver meaningful information to users. Previous work on text analytics over the last several decades has focused mainly on traditional types of text such as emails, news and academic literature, and several critical issues specific to text on social media have not been well explored: 1) how to detect sentiment from text on social media; 2) how to make use of social media's real-time nature; 3) how to address information overload for flexible information needs. In this dissertation, we focus on these three problems. First, to detect the sentiment of text on social media, we propose a dual active supervision method based on non-negative matrix tri-factorization (tri-NMF) to minimize human labeling effort for this new type of data. Second, to make use of social media's real-time nature, we propose approaches to detect events from text streams on social media. Third, to address information overload for flexible information needs, we propose two summarization frameworks: a dominating-set-based framework and a learning-to-rank-based framework. The dominating-set-based framework can be applied to different types of summarization problems, while the learning-to-rank-based framework uses existing training data to guide new summarization tasks. In addition, we integrate these techniques in an application study of event summarization for sports games as an example of how to better utilize social media data.
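
The dominating-set idea mentioned in this abstract can be illustrated with a minimal sketch; it is not the dissertation's implementation. The sketch assumes that two sentences are linked when their word-overlap (Jaccard) similarity reaches a hypothetical threshold, and it greedily selects sentences until every sentence is either selected or linked to a selected one. The function names, the threshold value, and the sample posts are all illustrative assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def dominating_set_summary(sentences, threshold=0.3):
    """Greedy approximation of a minimum dominating set over a sentence graph.

    Two sentences are neighbours when their Jaccard similarity is at least
    `threshold` (an illustrative value). Returns the sorted indices of the
    sentences chosen as the summary.
    """
    tokens = [set(s.lower().split()) for s in sentences]
    n = len(sentences)
    # Each sentence covers itself and all of its neighbours.
    covers = [
        {j for j in range(n) if i == j or jaccard(tokens[i], tokens[j]) >= threshold}
        for i in range(n)
    ]
    uncovered = set(range(n))
    chosen = []
    while uncovered:
        # Pick the sentence that covers the most still-uncovered sentences.
        best = max(range(n), key=lambda i: len(covers[i] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return sorted(chosen)


# Hypothetical posts about a sports event, invented for this example.
posts = [
    "goal scored by the home team",
    "the home team scored a goal",
    "great goal by the home team",
    "traffic is terrible after the match",
]
summary = dominating_set_summary(posts)
```

Here the three near-duplicate posts about the goal are represented by one of them, and the unrelated post is kept because nothing else covers it. A greedy choice is used because finding a minimum dominating set exactly is NP-hard.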

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Research on spoken dialogue systems in the 1990s and 2000s led to the deployment of commercial spoken dialogue systems (SDS) in microdomains such as customer service automation, reservation/booking and question answering. Recent SDS research has focused on applications in other domains (e.g. virtual counseling, personal coaches, social companions), which require more sophistication than the previous generation of commercial SDS. The focus of this research project is the delivery of behavior change interventions based on the brief intervention counseling style via spoken dialogue systems. Brief interventions (BI) are evidence-based, short, well-structured, one-on-one counseling sessions. Many challenges are involved in delivering BIs to people in need, such as finding the time to administer them in busy doctors' offices, obtaining the extra training that helps staff become comfortable providing these interventions, and managing the cost of delivering the interventions. Fortunately, recent developments in spoken dialogue systems make it possible to build systems that can deliver brief interventions. The overall objective of this research is to develop a data-driven, adaptable dialogue system for brief interventions for problematic drinking behavior, based on reinforcement learning methods. The implications of this research project include, but are not limited to, assessing the feasibility of delivering structured brief health interventions with a data-driven spoken dialogue system. Furthermore, while the experimental system in this project focuses on harmful alcohol drinking as a target behavior, the knowledge and experience produced may also lead to the implementation, using statistical machine learning approaches, of similarly structured health interventions and assessments in domains other than alcohol (e.g. obesity, drug use, lack of exercise). Beyond the design of the dialogue system itself, the semantic and emotional meanings of user utterances have a high impact on the interaction. To perform domain-specific reasoning and recognize concepts in user utterances, a named-entity recognizer and an ontology are designed and evaluated. To understand affective information conveyed through text, lexicons and a sentiment analysis module are developed and tested.
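
As a rough illustration of how a lexicon can drive sentiment analysis of user utterances, the following sketch scores an utterance by summing word polarities. It is a toy example under stated assumptions: the word list, the polarities, the negation rule, and the function names are all hypothetical and are not the lexicons or module developed in the dissertation.

```python
# Toy lexicon: words and polarities are illustrative assumptions only.
LEXICON = {"good": 1, "great": 1, "happy": 1, "bad": -1, "sad": -1, "worried": -1}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(utterance):
    """Sum word polarities, flipping the sign of the word right after a negation."""
    score = 0
    negate = False
    for raw in utterance.lower().split():
        word = raw.strip(".,!?")
        if word in NEGATIONS:
            negate = True
            continue
        polarity = LEXICON.get(word, 0)
        score += -polarity if negate else polarity
        negate = False  # a negation only affects the very next word
    return score


def label(utterance):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(utterance)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `label("I am not happy about my drinking.")` yields `"negative"` because the negation flips the polarity of "happy". A practical module would need a far larger lexicon and more careful handling of negation scope.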

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This study comprises empirical investigations that use three different methodologies to test various aspects of trader behavior. The first uses Prospect Theory to characterize trader behavior during periods of extreme wealth contraction. The second formulates a threshold model to examine the sentiment variable, and the third studies the contagion effect and trader behavior. The connection between consumers' sense of financial well-being, or sentiment, and stock market performance has been studied at length. However, without data on actual versus experimental performance, implications based on this relationship are meaningless. The empirical agenda included examining a proprietary file of daily trader activities over a five-year period. Overall, during periods of extreme wealth-altering conditions, traders "satisfice" rather than choose the "best" alternative. A trader's degree of loss aversion depends on his or her prior investment performance. A model that explains the behavior of traders during periods of turmoil is developed; Prospect Theory and the data file influenced its design. Additional research included testing a threshold model that permitted the data to signal the crisis. The third empirical study investigated the existence of contagion caused by declining global wealth effects, using evidence from the mining industry in Canada. Contagion, where a financial crisis begins locally and subsequently spreads elsewhere, has been studied in terms of correlations among similar regions. The results provide support for Prospect Theory in two of the three empirical studies. The dissertation emphasizes the need for specifying precise, testable models of investors' expectations by providing tools to identify paradoxical behavior patterns. True enhancements in this field must include empirical research using reliable data sources to mitigate data mining problems and allow researchers to distinguish between expectations-based and risk-based explanations of behavior. Through this type of research, it may be possible to systematically exploit "irrational" market behavior.
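
For readers unfamiliar with loss aversion in Prospect Theory, the following sketch shows the standard value function from the Prospect Theory literature. It is offered only as background: the abstract does not state which functional form or parameters the dissertation uses. The default parameters are the estimates reported by Tversky and Kahneman (1992), and the function name is an assumption.

```python
def prospect_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect Theory value of a gain or loss x relative to a reference point.

    Gains are valued as x**alpha and losses as -lam * (-x)**beta, so with
    lam > 1 a loss weighs more heavily than a gain of the same size. The
    defaults are the Tversky and Kahneman (1992) estimates, not values
    taken from the dissertation.
    """
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)


gain = prospect_value(100)
loss = prospect_value(-100)
```

With these defaults, the loss of 100 is felt 2.25 times as strongly as the gain of 100, which is one way to formalize the "degree of loss aversion" the abstract refers to.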

Relevance:

30.00% 30.00%

Publisher:

Abstract:

An Automatic Vehicle Location (AVL) system is a computer-based vehicle tracking system that is capable of determining a vehicle's location in real time. As a major technology of the Advanced Public Transportation System (APTS), AVL systems have been widely deployed by transit agencies for purposes such as real-time operation monitoring, computer-aided dispatching, and arrival time prediction. AVL systems make available a large amount of transit performance data that is valuable for transit performance management and planning. However, the difficulty of extracting useful information from the huge spatial-temporal database has hindered off-line applications of AVL data. In this study, a data mining process that includes data integration, cluster analysis, and multiple regression is proposed. The AVL-generated data are first integrated into a Geographic Information System (GIS) platform. A model-based cluster method is employed to investigate the spatial and temporal patterns of transit travel speeds, which can easily be translated into travel times. The transit speed variations along the route segments are identified. Transit service periods such as morning peak, mid-day, afternoon peak, and evening periods are determined based on analyses of transit travel speed variations for different times of day. The seasonal patterns of transit performance are investigated using analysis of variance (ANOVA). Travel speed models based on the clustered time-of-day intervals are developed using factors identified as having significant effects on speed for different time-of-day periods. Transit performance was found to vary across seasons and time-of-day periods. The geographic location of a transit route segment also plays a role in the variation of transit performance. The results of this research indicate that advanced data mining techniques have good potential for providing automated techniques that assist transit agencies in service planning, scheduling, and operations control.
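
The seasonal comparison described in this abstract relies on analysis of variance. As a minimal sketch of what a one-way ANOVA computes, the function below returns the F statistic for several groups of observations. The speed values are invented for illustration and do not come from the study, and the function name is an assumption.

```python
def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA over lists of observations."""
    k = len(groups)                      # number of groups (e.g. seasons)
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical travel speeds (mph) on one route segment in three seasons.
speeds_by_season = [[11, 12, 13], [12, 13, 14], [15, 16, 17]]
f_stat = one_way_anova_f(speeds_by_season)
```

A large F statistic indicates that the differences between seasonal means are large relative to the variation within each season. Turning it into a p-value requires the F distribution with (k - 1, n - k) degrees of freedom, which a statistics library would normally supply.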

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The nation's freeway systems are becoming increasingly congested, and traffic incidents are a major contributor to that congestion. Traffic incidents are non-recurring events such as accidents or stranded vehicles that cause a temporary reduction in roadway capacity, and they can account for as much as 60 percent of all traffic congestion on freeways. One major freeway incident management strategy involves diverting traffic to avoid incident locations by relaying timely information through Intelligent Transportation Systems (ITS) devices such as dynamic message signs or real-time traveler information systems. The decision to divert traffic depends foremost on the expected duration of an incident, which is difficult to predict. In addition, the duration of an incident is affected by many contributing factors. Determining and understanding these factors can help in identifying and developing better strategies to reduce incident durations and alleviate traffic congestion. A number of research studies have attempted to develop models to predict incident durations, with limited success. This dissertation attempts to improve on these previous efforts by applying data mining techniques to a comprehensive incident database maintained by the District 4 ITS Office of the Florida Department of Transportation (FDOT). Two categories of incident duration prediction models were developed: "offline" models designed for use in the performance evaluation of incident management programs, and "online" models for real-time prediction of incident duration to aid decisions on traffic diversion during an ongoing incident. Multiple data mining analysis techniques were applied and evaluated in the research. Multiple linear regression analysis and a decision-tree-based method were applied to develop the offline models, and a rule-based method and a tree algorithm called M5P were used to develop the online models. The results show that the models in general can achieve high prediction accuracy within acceptable time intervals of the actual durations. The research also identifies some new contributing factors that have not been examined in past studies. As part of the research effort, software code was developed to implement the models in the existing software system of District 4 FDOT for actual applications.
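
The abstract reports accuracy "within acceptable time intervals of the actual durations". One plausible reading, sketched below, is the share of predictions that fall within a fixed tolerance of the observed duration. The metric definition, the function name, the tolerance, and the sample durations are assumptions made for illustration and are not taken from the dissertation.

```python
def within_tolerance_accuracy(predicted, actual, tolerance_minutes):
    """Fraction of predictions within +/- tolerance_minutes of the actual value."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    hits = sum(
        1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance_minutes
    )
    return hits / len(actual)


# Hypothetical incident durations in minutes.
predicted_durations = [30, 45, 60, 20]
actual_durations = [35, 70, 55, 22]
accuracy = within_tolerance_accuracy(predicted_durations, actual_durations, 10)
```

In this invented sample, three of the four predictions are within 10 minutes of the observed duration, so the accuracy is 0.75.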

Relevance:

30.00% 30.00%

Publisher:

Abstract:

With advances in science and technology, computing and business intelligence (BI) systems are steadily becoming more complex, with an increasing variety of heterogeneous software and hardware components. They are thus becoming progressively more difficult to monitor, manage and maintain. Traditional approaches to system management have largely relied on domain experts and a knowledge acquisition process that translates domain knowledge into operating rules and policies. This process is widely acknowledged to be cumbersome, labor-intensive, and error-prone, and it struggles to keep up with rapidly changing environments. In addition, many traditional business systems deliver primarily pre-defined historic metrics for long-term strategic or mid-term tactical analysis, and lack the flexibility to support evolving metrics or data collection for real-time operational analysis. There is thus a pressing need for automatic and efficient approaches to monitor and manage complex computing and BI systems. To realize the goal of autonomic management and enable self-management capabilities, we propose to mine the historical log data generated by computing and BI systems and automatically extract actionable patterns from this data. This dissertation focuses on the development of data mining techniques to extract actionable patterns from various types of log data in computing and BI systems. Four key problems are studied: log data categorization and event summarization, leading indicator identification, pattern prioritization by exploring link structures, and a tensor model for three-way log data. Case studies and comprehensive experiments on real application scenarios and datasets are conducted to show the effectiveness of our proposed approaches.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The most important factor affecting the decision-making process in finance is risk, which is usually measured by variance (total risk) or systematic risk (beta). Since an investor’s sentiment (whether she is an optimist or a pessimist) plays a very important role in the choice of beta measure, any decision made for the same asset within the same time horizon will differ between individuals. In other words, because of behavioral traits, neither homogeneity of beliefs nor rational expectations will prevail in the market. This dissertation consists of three essays. In the first essay, “Investor Sentiment and Intrinsic Stock Prices”, a new technical trading strategy was developed using a firm-specific individual sentiment measure. This behavior-based trading strategy forecasts a range within which a stock price moves in a particular period and can be used for stock trading. Results indicate that sample firms trade within a range and give signals as to when to buy or sell. The second essay, “Managerial Sentiment and the Value of the Firm”, examined the effect of managerial sentiment on the project selection process using the net present value criterion, as well as its effect on the value of the firm. The analysis showed that high-sentiment and low-sentiment managers obtain different values for the same firm before and after the acceptance of a project. Changes in the cost of capital and the weighted average cost of capital were found to result from managerial sentiment. The last essay, “Investor Sentiment and Optimal Portfolio Selection”, analyzed how investor sentiment affects the nature and composition of the optimal portfolio as well as portfolio performance. Results suggested that investor sentiment completely changes the portfolio composition, i.e., a high-sentiment investor will choose a completely different set of assets for the portfolio than a low-sentiment investor. The results indicated the practical applicability of a behavioral-model-based technical indicator for stock trading. Additional insights include the valuation of firms with a behavioral component and the importance of distinguishing portfolio performance based on sentiment factors.
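
The abstract takes beta as the measure of systematic risk. As background, beta is conventionally estimated as the covariance of an asset's returns with the market's returns divided by the variance of the market's returns. The sketch below implements that textbook definition; it does not model the sentiment-dependent beta choice studied in the dissertation, and the function name and return series are illustrative assumptions.

```python
def estimate_beta(asset_returns, market_returns):
    """Textbook beta: cov(asset, market) / var(market), using sample moments."""
    if len(asset_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    n = len(asset_returns)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum(
        (a - mean_a) * (m - mean_m) for a, m in zip(asset_returns, market_returns)
    ) / (n - 1)
    var_m = sum((m - mean_m) ** 2 for m in market_returns) / (n - 1)
    return cov / var_m


# Hypothetical periodic returns; the asset moves twice as much as the market.
market = [0.01, -0.02, 0.03, 0.00]
asset = [2 * r for r in market]
beta = estimate_beta(asset, market)
```

Because the invented asset returns are exactly twice the market returns, the estimated beta is 2 (up to floating-point error).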

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The United States has been increasingly concerned with the transnational threat posed by infectious diseases. Effective policy implementation to contain the spread of these diseases requires active engagement and support of the American public. To influence American public opinion and enlist support for related domestic and foreign policies, both domestic agencies and international organizations have framed infectious diseases as security threats, human rights disasters, economic risks, and as medical dangers. This study investigates whether American attitudes and opinions about infectious diseases are influenced by how the issue is framed. It also asks which issue frame has been most influential in shaping public opinion about global infectious diseases when people are exposed to multiple frames. The impact of media frames on public perception of infectious diseases is examined through content analysis of newspaper reports. Stories on SARS, avian flu, and HIV/AIDS were sampled from coverage in The New York Times and The Washington Post between 1999 and 2007. Surveys of public opinion on infectious diseases in the same time period were also drawn from databases like Health Poll Search and iPoll. Statistical analysis tests the relationship between media framing of diseases and changes in public opinion. Results indicate that no one frame was persuasive across all diseases. The economic frame had a significant effect on public opinion about SARS, as did the biomedical frame in the case of avian flu. Both the security and human rights frames affected opinion and increased public support for policies intended to prevent or treat HIV/AIDS. The findings also address the debate on the role and importance of domestic public opinion as a factor in domestic and foreign policy decisions of governments in an increasingly interconnected world. The public is able to make reasonable evaluations of the frames and the domestic and foreign policy issues emphasized in the frames.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The arrival of Cuba’s Information Technology (IT) and Communications Minister Ramiro Valdés in Venezuela in the spring of 2010 to serve as a ‘consultant’ to the Venezuelan government awakened a new reality in that country. Beset by deep economic troubles, escalating crime, a murder rate that has doubled since Chávez took over in 1999, and an opposition movement led by university students and other activists who use the Internet as their primary weapon, Venezuela has turned to Cuba for help. In a country where traditional media outlets have in large part been censored or are government-controlled, the Internet and its online social networks have become the place to obtain, as well as disseminate, unfiltered information. As such, Internet growth and use of its social networks have skyrocketed in Venezuela, making the country one of Latin America’s heaviest Web users. Because the Internet is increasingly used to spark political debate among Venezuelans and to publish information that differs from the official government line, Chávez has embarked on an initiative to bring the Internet to the poor and others who would otherwise not have access, establishing government-sponsored Internet Info Centers throughout the country to disseminate information to his followers. With the help of Cuban advisors, who for years have been a part of Venezuela’s defense, education, and health care initiatives, Chávez has apparently taken to adapting Cuba’s methodology for the control of information. He has also begun to take special steps toward controlling the type of information flowing through the country’s online social networks, considering the implementation of a single government-controlled Internet access point in Venezuela. Simultaneously, in adapting to Venezuela’s Internet reality, Chávez has engaged online by creating his own Twitter account in an attempt to influence public opinion, primarily that of those who browse the Web. With a rapidly growing following that may soon reach one million subscribers, Chávez claims to have set up his own online trench from which to wage cyberspace battle.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Many systems and applications continuously produce events. These events record the status of a system and trace its behavior. By examining them, system administrators can check for potential problems. If the temporal dynamics of the systems are investigated further, the underlying patterns can be discovered. The uncovered knowledge can be leveraged to predict future system behavior or to mitigate potential risks. Moreover, system administrators can use the temporal patterns to set up event management rules that make the system more intelligent. With the popularity of data mining techniques in recent years, these events are gradually becoming more and more useful. Despite recent advances in data mining techniques, their application to system event mining is still at a rudimentary stage. Most work still focuses on episode mining or frequent pattern discovery. These methods are unable to provide a brief yet comprehensible summary that reveals the valuable information from a high-level perspective. Moreover, they provide little actionable knowledge to help system administrators better manage the systems. To make better use of the recorded events, more practical techniques are required. From the perspective of data mining, three correlated directions are considered helpful for system management: (1) provide concise yet comprehensive summaries of the running status of the systems; (2) make the systems more intelligent and autonomous; (3) effectively detect abnormal system behavior. Owing to the richness of the event logs, all of these directions can be addressed in a data-driven manner. In this way, the robustness of the systems can be enhanced and the goal of autonomous management can be approached. This dissertation mainly focuses on the foregoing directions, leveraging temporal mining techniques to facilitate system management. More specifically, three concrete topics are discussed: event summarization, resource demand prediction, and streaming anomaly detection. Besides the theoretical contributions, experimental evaluation is presented to demonstrate the effectiveness and efficiency of the corresponding solutions.
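
To make the notion of streaming anomaly detection concrete, the sketch below flags a value when it deviates from the running mean by more than a chosen number of running standard deviations. It is a generic baseline (a z-score test using Welford's online update), not the method proposed in the dissertation; the class name, threshold, warm-up length, and sample stream are assumptions.

```python
import math


class StreamingZScoreDetector:
    """Flag values that lie far from the running mean of the values seen so far."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score above which a value is flagged
        self.warmup = warmup        # observations required before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Return True if `value` looks anomalous, then absorb it into the stats."""
        is_anomaly = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford's online update of the mean and the squared deviations.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_anomaly


# Hypothetical metric stream (e.g. requests per second) with one spike.
stream = [10, 11, 9, 10, 12, 10, 11, 50, 10]
detector = StreamingZScoreDetector()
flags = [detector.update(v) for v in stream]
```

Only the spike of 50 is flagged in this invented stream. Note that the sketch absorbs anomalous values into its statistics, which inflates the variance afterwards; whether to do so is a design choice that depends on the system being monitored.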

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Thanks to advanced technologies and social networks that allow data to be widely shared across the Internet, there has been an explosion of pervasive multimedia data, generating high demand for multimedia services and applications that let people easily access and manage multimedia data in various areas. In response to this demand, multimedia big data analysis has become an emerging hot topic in both industry and academia, ranging from basic infrastructure, management, search, and mining to security, privacy, and applications. Within the scope of this dissertation, a multimedia big data analysis framework is proposed for semantic information management and retrieval, with a focus on rare event detection in videos. The proposed framework is able to explore hidden semantic feature groups in multimedia data and incorporate temporal semantics, especially for video event detection. First, a hierarchical semantic data representation is presented to alleviate the semantic gap issue, and the Hidden Coherent Feature Group (HCFG) analysis method is proposed to capture the correlation between features and separate the original feature set into semantic groups, seamlessly integrating multimedia data in multiple modalities. Next, an Importance Factor based Temporal Multiple Correspondence Analysis (IF-TMCA) approach is presented for effective event detection. Specifically, the HCFG algorithm is integrated with the Hierarchical Information Gain Analysis (HIGA) method to generate the Importance Factor (IF) for producing the initial detection results. Then, the TMCA algorithm is proposed to efficiently incorporate temporal semantics for re-ranking and improving the final performance. Finally, a sampling-based ensemble learning mechanism is applied to further accommodate imbalanced datasets. In addition to the multimedia semantic representation and class imbalance problems, lack of organization is another critical issue for multimedia big data analysis. In this framework, an affinity propagation-based summarization method is also proposed to transform the unorganized data into a better structure with clean and well-organized information. The whole framework has been thoroughly evaluated across multiple domains, such as soccer goal event detection and disaster information management.