2 results for Event Log Mining
at Universidad Politécnica de Madrid
Abstract:
Due to recent scientific and technological advances in information systems, it is now possible to run almost every application on a mobile device. The need to make such devices more intelligent opens an opportunity to design data mining algorithms that can execute autonomously on local devices to provide the device with knowledge. The problem behind autonomous mining is the proper configuration of the algorithm so that it produces the most appropriate results. Contextual information, together with resource information about the device, has a strong impact on both the feasibility of a particular execution and the production of the proper patterns. On the other hand, the performance of the algorithm, expressed in terms of efficacy and efficiency, depends heavily on the features of the dataset to be analyzed together with the parameter values of a particular implementation of the algorithm. However, few existing approaches deal with the autonomous configuration of data mining algorithms, and none of them deal with contextual or resource information. Both issues are especially significant for social network applications. In fact, the widespread use of social networks, and consequently the amount of information shared, has made modeling context in social applications a priority. Resource consumption also plays a crucial role on such platforms, as users access social networks mainly on their mobile devices. This PhD thesis addresses the aforementioned open issues, focusing on i) analyzing the behavior of algorithms, ii) mapping contextual and resource information to find the most appropriate configuration, and iii) applying the model to the case of a social recommender. Four main contributions are presented:
- The EE-Model: predicts the behavior of a data mining algorithm in terms of the resources consumed and the accuracy of the mining model it will obtain.
- The SC-Mapper: maps a situation defined by the context and resource state to a data mining configuration.
- SOMAR: a social activity (events and informal ongoings) recommender for mobile devices.
- D-SOMAR: an evolution of SOMAR that incorporates the configurator in order to provide updated recommendations.
Finally, the experimental validation of the proposed contributions using synthetic and real datasets allows us to achieve the objectives and answer the research questions proposed for this dissertation.
Abstract:
This paper presents an approach to compare two types of data, subjective data (the polarity of the 2011 Pan American Games event by country) and objective data (the number of medals won by each participating country), based on the Pearson correlation. When dealing with events described by people, knowledge acquisition is difficult because the structure of such descriptions is heterogeneous and subjective. A first step towards determining the polarity of the information provided by people consists of automatically classifying the posts into clusters according to their polarity. The authors carried out a set of experiments using a corpus of 5600 posts extracted from 168 Internet resources related to a specific event: the 2011 Pan American Games. The approach is based on four components: a crawler, a filter, a synthesizer and a polarity analyzer. The PanAmerican approach automatically classifies the polarity of the event into clusters with the following results: 588 positive, 336 neutral, and 76 negative. The work found a correlation of .74 between the polarity of the content produced and the results of the event, from which the authors conclude that the polarity of content is strongly affected by those results. Finally, the accuracy of the PanAmerican approach, measured as the precision for each of the three polarity classes evaluated, is .87, .90, and .80.
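The comparison described in this abstract relies on the Pearson correlation between a subjective score and an objective count per country. The following is a minimal sketch of that computation, not the authors' implementation. The function name `pearson` and the per-country values are hypothetical and chosen only for illustration; they are not data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values, one entry per country (NOT taken from the paper):
# an aggregate polarity score and a medal count.
polarity_by_country = [0.62, 0.48, 0.35, 0.30, 0.12]
medals_by_country = [10, 8, 5, 4, 1]

r = pearson(polarity_by_country, medals_by_country)
print(f"Pearson r = {r:.2f}")
```

A value of r close to 1 would indicate that countries with more medals tend to have more positive content, which is the kind of relationship the reported correlation of .74 describes.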