1000 results for CLASSIFICATION


Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a new semi-supervised method that improves traffic classification performance when little supervised training data is available. Existing semi-supervised methods label a large proportion of testing flows as unknown because of the limited supervised information, which severely degrades classification performance. To address this problem, we propose to incorporate flow correlation into both the training and testing stages. At the training stage, we use flow correlation to extend the supervised data set by automatically labeling unlabeled flows according to their correlation with the pre-labeled flows. Consequently, the traffic classifier performs better because of the increased size and quality of the supervised data set. At the testing stage, correlated flows are identified and classified jointly by combining their individual predictions, which further boosts classification accuracy. An empirical study on real-world network traffic shows that the proposed method outperforms state-of-the-art classification methods based on flow statistical features.
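The training-stage idea above, extending the supervised set through flow correlation, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how correlation is determined, so the sketch assumes that flows sharing a hypothetical `key` (here a destination IP, port, and protocol tuple) are correlated, and the function name `extend_labels` is invented for the example.

```python
from collections import defaultdict


def extend_labels(flows, labels):
    """Propagate labels from pre-labeled flows to correlated unlabeled flows.

    flows  : list of dicts with an 'id' and a correlation 'key' (assumed format)
    labels : dict mapping flow id -> class label for the pre-labeled flows
    """
    # Collect every label observed inside each correlation group.
    group_labels = defaultdict(set)
    for flow in flows:
        if flow["id"] in labels:
            group_labels[flow["key"]].add(labels[flow["id"]])

    extended = dict(labels)
    for flow in flows:
        seen = group_labels.get(flow["key"], set())
        # Only propagate when the group carries exactly one label;
        # groups with conflicting or no labels are left untouched.
        if flow["id"] not in extended and len(seen) == 1:
            extended[flow["id"]] = next(iter(seen))
    return extended


flows = [
    {"id": 1, "key": ("10.0.0.5", 443, "tcp")},
    {"id": 2, "key": ("10.0.0.5", 443, "tcp")},
    {"id": 3, "key": ("10.0.0.9", 53, "udp")},
    {"id": 4, "key": ("10.0.0.7", 6881, "tcp")},
    {"id": 5, "key": ("10.0.0.7", 6881, "tcp")},
]
labels = {1: "https", 4: "p2p"}
extended = extend_labels(flows, labels)
```

In this toy data, flows 2 and 5 inherit the labels of flows 1 and 4, while flow 3 stays unlabeled because no pre-labeled flow shares its key. The extended set would then be passed to whatever classifier is being trained.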

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A critical problem for Internet traffic classification is how to obtain a high-performance classifier based on statistical features from a small set of training data. Solutions to this problem are essential for dealing with encrypted applications and newly emerging applications. In this paper, we propose a new Naive Bayes (NB) based classification scheme to tackle this problem, which builds on two recent research findings: feature discretization and flow correlation. A new bag-of-flow (BoF) model is first introduced to describe the correlated flows, and it leads to a new BoF-based traffic classification problem. We cast BoF-based traffic classification as a specific classifier combination problem and theoretically analyze the classification benefit of flow aggregation. A number of combination methods are also formulated and used to aggregate the NB predictions of the correlated flows. Finally, we carry out a number of experiments on a large-scale real-world network dataset. The experimental results show that the proposed scheme achieves significantly higher classification accuracy and much faster classification speed than state-of-the-art traffic classification methods.
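To illustrate what aggregating the NB predictions of correlated flows can look like, the sketch below applies three generic classifier combination rules (sum, product, majority vote) to the per-flow posteriors of one bag of flows. The abstract does not list which combination methods the paper formulates, so these rules, the function names, and the toy posteriors are assumptions chosen for illustration.

```python
import math
from collections import Counter


def combine_sum(posteriors):
    """Sum rule: pick the class with the largest summed posterior."""
    classes = posteriors[0].keys()
    return max(classes, key=lambda c: sum(p[c] for p in posteriors))


def combine_product(posteriors, eps=1e-12):
    """Product rule, computed in log space; eps guards against log(0)."""
    classes = posteriors[0].keys()
    return max(classes, key=lambda c: sum(math.log(p[c] + eps) for p in posteriors))


def combine_majority(posteriors):
    """Majority vote over each flow's individually most likely class."""
    votes = Counter(max(p, key=p.get) for p in posteriors)
    return votes.most_common(1)[0][0]


# Hypothetical per-flow posteriors for one bag of three correlated flows.
bag = [
    {"web": 0.60, "p2p": 0.40},
    {"web": 0.55, "p2p": 0.45},
    {"web": 0.10, "p2p": 0.90},
]

by_sum = combine_sum(bag)            # web: 1.25, p2p: 1.75
by_product = combine_product(bag)    # web: 0.033, p2p: 0.162
by_majority = combine_majority(bag)  # two flows vote web, one votes p2p
```

On this bag the sum and product rules choose `p2p`, because one flow is very confident, while the majority vote chooses `web`. The example shows only that the rules can disagree; it makes no claim about which rule performs best on real traffic.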

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, we propose a rule pruning strategy to reduce the number of rules in a fuzzy rule-based classification system. A confidence factor, which is formulated based on the compatibility of the rules with the input patterns, is deployed for rule pruning. The pruning strategy aims to reduce the complexity of the fuzzy classification system while maintaining the accuracy rate at a good level. To evaluate the effectiveness of the pruning strategy, two benchmark data sets are first tested. Then, a fault classification problem with real sensor measurements collected from a power generation plant is evaluated. The results obtained are analyzed and explained, and the implications of the proposed rule pruning strategy for the fuzzy classification system are discussed.
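A minimal sketch of confidence-based rule pruning follows. The abstract does not give the formula for the confidence factor, so this example assumes one common definition: the share of a rule's total compatibility that comes from training patterns of the rule's own class. The triangular membership functions, the product t-norm, the threshold value, and all function names are likewise assumptions made for illustration.

```python
def triangular(a, b, c):
    """Return a triangular membership function with feet a, c and peak b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu


def compatibility(antecedent, x):
    """Compatibility of a pattern with a rule: product of memberships."""
    degree = 1.0
    for mu, xi in zip(antecedent, x):
        degree *= mu(xi)
    return degree


def confidence_factor(rule, patterns):
    """Fraction of the rule's total compatibility that comes from its own class."""
    antecedent, rule_class = rule
    total = own = 0.0
    for x, label in patterns:
        d = compatibility(antecedent, x)
        total += d
        if label == rule_class:
            own += d
    return own / total if total > 0 else 0.0


def prune(rules, patterns, threshold):
    """Keep only the rules whose confidence factor reaches the threshold."""
    return [r for r in rules if confidence_factor(r, patterns) >= threshold]


low, high = triangular(-0.5, 0.0, 0.5), triangular(0.5, 1.0, 1.5)
rules = [((low,), "A"), ((low,), "B"), ((high,), "B")]
patterns = [((0.1,), "A"), ((0.2,), "A"), ((0.9,), "B")]
kept = prune(rules, patterns, threshold=0.5)
```

Here the rule "low implies B" is removed because every pattern compatible with `low` belongs to class A, while the other two rules are kept. In practice the threshold would be tuned against the accuracy of the pruned system.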