
Network traffic clustering using random forest proximities

**Author(s):** Wang, Yu; Xiang, Yang; Zhang, Jun
Contributor(s)	Kim, Dong-In; Mueller, Peter
Date(s)	01/01/2013
Abstract	Recent years have seen extensive work on statistics-based network traffic classification using machine learning (ML) techniques. In the particular scenario of learning from unlabeled traffic data, some classic unsupervised clustering algorithms (e.g. K-Means and EM) have been applied, but the reported accuracy is unsatisfactory. This paper presents a novel approach for the task, which performs clustering based on Random Forest (RF) proximities instead of Euclidean distances. The approach consists of two steps. In the first step, we derive a proximity measure for each pair of data points by performing an RF classification on the original data and a set of synthetic data. In the second step, we perform K-Medoids clustering to partition the data points into K groups based on the proximity matrix. Evaluations have been conducted on real-world Internet traffic traces, and the experimental results indicate that the proposed approach is more accurate than the previous methods.
Identifier	http://hdl.handle.net/10536/DRO/DU:30060786
Language(s)	eng
Publisher	IEEE
Relation	http://dro.deakin.edu.au/eserv/DU:30060786/evid-iccconf-2013.pdf http://dro.deakin.edu.au/eserv/DU:30060786/evid-networktrafficpeerreview-2013.pdf http://dro.deakin.edu.au/eserv/DU:30060786/wang-networktraffic-2013.pdf http://doi.org/10.1109/ICC.2013.6654829
Rights	2013, IEEE
Keywords	#Clustering #Machine Learning #Traffic Analysis
Type	Conference Paper
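
The two steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions, because the abstract does not specify them: the synthetic data is generated by permuting each feature column independently, proximity is taken as the fraction of trees in which two points fall in the same leaf, the dissimilarity passed to K-Medoids is one minus the proximity, and K-Medoids is a simple alternating assign/update variant. The function names `rf_proximity` and `k_medoids` and all parameter defaults are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rf_proximity(X, n_trees=200, seed=0):
    """Step 1 (sketch): RF proximity matrix for unlabeled data X."""
    rng = np.random.default_rng(seed)
    # Assumption: synthetic points are built by shuffling each feature
    # column independently, which keeps the marginals but breaks the
    # dependencies between features.
    X_syn = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(X.shape[1])]
    )
    X_all = np.vstack([X, X_syn])
    # Label original points 1 and synthetic points 0.
    y_all = np.r_[np.ones(len(X), dtype=int), np.zeros(len(X_syn), dtype=int)]

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    # leaves[i, t] is the index of the leaf that point i reaches in tree t.
    leaves = forest.apply(X)
    prox = np.zeros((len(X), len(X)))
    for t in range(leaves.shape[1]):
        col = leaves[:, t]
        prox += col[:, None] == col[None, :]
    # Fraction of trees in which each pair shares a leaf.
    return prox / leaves.shape[1]


def k_medoids(dist, k, n_iter=100, seed=0):
    """Step 2 (sketch): alternating K-Medoids on a distance matrix."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # Assign every point to its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return labels, medoids
```

Given a feature matrix `X` of flow statistics (one row per flow), a call such as `labels, medoids = k_medoids(1.0 - rf_proximity(X), k=10)` would return one cluster index per flow. The value of `k` here is arbitrary, and the pairwise proximity matrix needs memory quadratic in the number of flows, so this sketch suits only small samples.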
