10 results for information decomposition

in Deakin Research Online - Australia


Relevance: 60.00%

Abstract:

Severely skewed class distributions, in which some classes are underrepresented, strongly affect the performance of learning algorithms and remain a challenge in data mining and machine learning. Much current research focuses on experimental comparison of existing re-sampling approaches. We believe that new ways of constructing better algorithms are required to further balance and analyse the data set. This paper presents a Fuzzy-based Information Decomposition oversampling (FIDoS) algorithm for handling imbalanced data. This is a new way of addressing imbalanced learning problems from a missing-data perspective. First, we assume that the imbalance results from missing instances in the minority class. Then the proposed algorithm, which takes advantage of a fuzzy membership function, transfers information to the missing minority class instances. Finally, the experimental results demonstrate that the proposed algorithm is more practical and applicable than sampling techniques.

Relevance: 60.00%

Abstract:

Spam has become a critical problem in online social networks. This paper focuses on Twitter spam detection. Recent research applies machine learning techniques to Twitter spam detection, making use of the statistical features of tweets. We observe that existing machine-learning-based detection methods suffer from Twitter spam drift, i.e., the statistical properties of spam tweets vary over time. An effective way to avoid this problem is to train a new Twitter spam classifier every day. However, this approach must cope with small, imbalanced training sets, because labelling spam samples is time-consuming. This paper proposes a new method to address this challenge. The new method employs two new techniques, fuzzy-based redistribution and asymmetric sampling. We develop a fuzzy-based information decomposition technique to re-distribute the spam class and generate more spam samples. Moreover, an asymmetric sampling technique is proposed to re-balance the sizes of the spam and non-spam samples in the training data. Finally, we apply an ensemble technique to combine the spam classifiers trained on two different training sets. A number of experiments are performed on a real-world 10-day ground-truth dataset to evaluate the new method. Experimental results show that the new method can significantly improve the detection performance for drifting Twitter spam.

Relevance: 30.00%

Abstract:

This paper presents two hyperlink analysis-based algorithms for finding pages relevant to a given Web page (URL). The first algorithm comes from an extended cocitation analysis of Web pages. It is intuitive and easy to implement. The second takes advantage of linear algebra to reveal deeper relationships among Web pages and to identify relevant pages more precisely and effectively. The experimental results show the feasibility and effectiveness of the algorithms. These algorithms could be used for various Web applications, such as enhancing Web search. The ideas and techniques in this work would be helpful to other Web-related research.
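As an illustration of the idea behind the first algorithm, the sketch below computes plain cocitation counts: two pages are co-cited when a common parent page links to both. This is the classic cocitation measure, not the extended analysis the abstract refers to; the function names and the toy link graph are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_counts(links):
    """Count, for each pair of pages, how many parent pages link to both.

    `links` maps a parent page to the iterable of pages it links to.
    """
    counts = defaultdict(int)
    for children in links.values():
        # Each unordered pair of distinct children is co-cited once by this parent.
        for a, b in combinations(sorted(set(children)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def related_pages(links, target, min_count=1):
    """Rank pages by how often they are co-cited with `target`."""
    scores = {}
    for (a, b), n in cocitation_counts(links).items():
        if target == a:
            scores[b] = n
        elif target == b:
            scores[a] = n
    ranked = [(page, n) for page, n in scores.items() if n >= min_count]
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical toy link graph: parent page -> pages it links to.
links = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b"],
    "p3": ["a", "c", "d"],
}
print(related_pages(links, "a"))
```

Pages that share many parents with the target rank first; a threshold such as `min_count` is one simple way to drop weakly related pages.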

Relevance: 30.00%

Abstract:

Purpose – This paper develops a new method for decomposing housing market variations in order to analyse the housing dynamics of the eight Australian capital cities.
Design/methodology/approach – This study reviews prior research on analysing housing market variations and classifies the previous methods into four main models. Based on this, the study develops a new decomposition of the variations into regional information, home-market information and time information. Panel data regression, the unit root test and the F test are used to construct the model and interpret the housing market variations of the Australian capital cities.
Findings – This paper suggests that the elasticity of housing market variations with respect to Australian home-market information is the same across cities and time. In contrast, the elasticities with respect to regional information differ between cities. However, similarities exist within the west and north of Australia and within the south and east of Australia. The contribution of the time information varies over the observation period, although similarities are found in certain periods.
Originality/value – This paper introduces the decomposition of housing market variations into research on those variations and develops a model based on the new decomposition method.

Relevance: 30.00%

Abstract:

The key nodes in a network play a critical role in system recovery and survival. Many traditional key node selection algorithms use the characteristics of the physical topology to find the key nodes, but they can hardly succeed in a mobile ad hoc network because of node mobility. In this paper we propose a social-aware Kcore (S-Kcore) selection algorithm for the Pocket Switched Network. The social view of the network suggests that the social position of the mobile nodes can help to find the key nodes in the Pocket Switched Network. The S-Kcore selection algorithm is designed to exploit the nodes' social features to improve data communication performance. Experiments using NS2 show that the S-Kcore selection algorithm is workable in the Pocket Switched Network. Furthermore, with the social behavior information, those key nodes are more suitable to represent and improve the whole network's performance.
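For readers unfamiliar with the underlying notion, the sketch below computes the standard k-core decomposition of a static undirected graph by repeatedly peeling the node of lowest remaining degree. It does not reproduce the S-Kcore algorithm, whose social features the abstract does not specify; the function names and the toy graph are hypothetical.

```python
def core_numbers(adjacency):
    """Return the core number of every node in an undirected graph.

    `adjacency` maps each node to the set of its neighbours.
    """
    degree = {node: len(neigh) for node, neigh in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree (ties by name).
        node = min(remaining, key=lambda n: (degree[n], str(n)))
        # The core number never decreases as peeling proceeds.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neigh in adjacency[node]:
            if neigh in remaining:
                degree[neigh] -= 1
    return core


def k_core(adjacency, k):
    """Return the nodes that belong to the k-core."""
    return {n for n, c in core_numbers(adjacency).items() if c >= k}


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(core_numbers(graph))
print(k_core(graph, 2))
```

This version selects the minimum with a linear scan, which keeps the code short but is quadratic in the number of nodes; bucket-based implementations run in linear time.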

Relevance: 30.00%

Abstract:

Current bio-kinematic encoders use velocity, acceleration and angular information to encode human exercises. However, in exercise physiology there is a need to distinguish between the shape of the trajectory and its execution dynamics. In this paper we propose such a two-component model and explore how best to compute these components of an action. In particular, we show how a new spatial indexing scheme, derived directly from the underlying differential geometry of curves, provides robust estimates of the shape and dynamics compared to standard temporal indexing schemes.
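The distinction between shape and execution dynamics can be illustrated by indexing a trajectory by arc length instead of time. The sketch below is only a polyline approximation under that assumption, not the differential-geometric scheme of the paper; the function names and sample trajectories are hypothetical. Two executions of the same path at different speeds yield the same points once resampled by arc length.

```python
import math


def arc_length_index(points):
    """Cumulative arc length at each sample of a polyline trajectory."""
    s = [0.0]
    for p, q in zip(points, points[1:]):
        s.append(s[-1] + math.dist(p, q))
    return s


def resample_by_arc_length(points, n):
    """Return n >= 2 points equally spaced in arc length along the polyline."""
    s = arc_length_index(points)
    total = s[-1]
    out = []
    j = 0  # index of the segment currently containing the target length
    for i in range(n):
        target = total * i / (n - 1)
        while j < len(s) - 2 and s[j + 1] < target:
            j += 1
        seg = s[j + 1] - s[j]
        t = 0.0 if seg == 0 else (target - s[j]) / seg
        # Linear interpolation inside segment j.
        out.append(tuple(a + t * (b - a) for a, b in zip(points[j], points[j + 1])))
    return out


# The same straight path sampled in time by a slow and a fast execution (toy data).
slow = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
fast = [(0.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
print(resample_by_arc_length(slow, 5))
print(resample_by_arc_length(fast, 5))
```

The arc-length values returned by `arc_length_index`, viewed as a function of sample time, carry the dynamics, while the resampled points carry the shape.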

Relevance: 30.00%

Abstract:

Personalized recommendation means recommending information and goods that may interest users, based on their interest characteristics and purchasing behavior. With the rapid development of Internet technology, we have entered the era of information explosion, where huge amounts of information are presented at the same time. On the one hand, it is difficult for users to discover the information that interests them most; on the other hand, general users have difficulty obtaining information that very few people browse. In order to extract the information that interests the user from a massive amount of data, we propose in this paper a personalized recommendation algorithm based on approximating the singular value decomposition (SVD). SVD is a powerful technique for dimensionality reduction. However, due to its expensive computational requirements and weak performance on large sparse matrices, it has been considered inappropriate for practical applications involving massive data. Finally, we present an empirical study comparing the prediction accuracy of our proposed algorithm with that of Drineas's LINEARTIMESVD algorithm and the standard SVD algorithm on the MovieLens dataset, and show that our method has the best prediction quality. © 2012 IEEE.
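To make the notion of an approximate SVD concrete, the sketch below estimates only the dominant singular triplet of a small dense ratings matrix by power iteration and forms the rank-one approximation from it. It is neither the proposed algorithm nor LINEARTIMESVD; the function names and the toy matrix are hypothetical, and the all-ones starting vector is assumed not to be orthogonal to the dominant right singular vector.

```python
import math


def top_singular_triplet(matrix, iterations=100):
    """Estimate the largest singular value and its vectors by power iteration."""
    rows, cols = len(matrix), len(matrix[0])
    # Assumption: this start is not orthogonal to the dominant right singular vector.
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        # One step of power iteration on A^T A: w = A^T (A v).
        av = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * av[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    av = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av] if sigma else [0.0] * rows
    return sigma, u, v


def rank_one_approximation(matrix, iterations=100):
    """Best rank-one approximation sigma * u * v^T of the matrix."""
    sigma, u, v = top_singular_triplet(matrix, iterations)
    return [[sigma * ui * vj for vj in v] for ui in u]


# Hypothetical toy ratings matrix (users x items); it has rank one by construction.
ratings = [[2.0, 4.0], [1.0, 2.0]]
sigma, u, v = top_singular_triplet(ratings)
print(sigma)
print(rank_one_approximation(ratings))
```

Keeping only a few leading singular triplets is what makes SVD useful for dimensionality reduction; entries of the low-rank reconstruction can then serve as predicted ratings.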