Reliable aggregation on network traffic for web based knowledge discovery


Author(s): Yu, Shui; James, Simon; Yonghong, Tian; Dou, Wanchun
Contributor(s)

Dai, Honghua

Liu, James N. K.

Smirnov, Evgueni

Date(s)

01/01/2012

Abstract

The web is a rich resource for information discovery, and as a result web mining is a hot topic. However, a reliable mining result depends on the reliability of the data set. Every second, the web generates a huge amount of data, such as web page requests and file transfers. These data reflect human behavior in cyberspace and are therefore valuable for analysis in various disciplines, e.g. social science and network security. How to store the data is a challenge. A usual strategy is to save a summary of the data, for example by using aggregation functions to preserve the features of the original data in much less space. A key problem, however, is that such information can be distorted by the presence of illegitimate traffic, e.g. botnet recruitment scanning and DDoS attack traffic. An important consideration in web-related knowledge discovery, then, is the robustness of the aggregation method, which in turn may be affected by the reliability of the network traffic data. In this chapter, we first present the methods of aggregation functions, and then we employ information distances to filter out anomalous data as preparation for web data mining.
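
The two ideas named in the abstract, aggregation and filtering by an information distance, can be illustrated with a minimal Python sketch. It is not the chapter's method: the choice of Jensen-Shannon divergence as the information distance, the threshold of 0.3, and the helper names `distribution`, `jensen_shannon`, `filter_windows`, and `summarize` are all assumptions made for illustration.

```python
import math
from collections import Counter
from statistics import mean, median


def distribution(events):
    """Empirical probability distribution over observed event keys."""
    counts = Counter(events)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions.

    Assumed here as one possible information distance; it lies in [0, 1].
    """
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mix(a):
        # Kullback-Leibler divergence from a to the mixture distribution.
        return sum(a[k] * math.log2(a[k] / mix[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


def filter_windows(windows, baseline, threshold=0.3):
    """Keep only traffic windows whose distribution stays close to the baseline.

    The threshold is a hypothetical value, not taken from the chapter.
    """
    reference = distribution(baseline)
    return [
        w for w in windows
        if jensen_shannon(distribution(w), reference) <= threshold
    ]


def summarize(counts):
    """Two simple aggregation functions over per-window request counts."""
    return {"mean": mean(counts), "median": median(counts)}


# Hypothetical data: requested pages per time window.
baseline = ["a", "b", "c", "d"] * 25
normal_window = ["a", "b", "c", "d"] * 5
attack_window = ["a"] * 20  # traffic concentrated on a single target

kept = filter_windows([normal_window, attack_window], baseline)
summary = summarize([10, 12, 11, 500])
```

In this toy data the window concentrated on a single page sits far from the baseline distribution and is dropped before aggregation, and the summary shows how a single extreme count pulls the mean much further than the median.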

Identifier

http://hdl.handle.net/10536/DRO/DU:30044750

Language(s)

eng

Publisher

Springer

Relation

http://dro.deakin.edu.au/eserv/DU:30044750/yu-reliableaggregation-2012.pdf

http://dro.deakin.edu.au/eserv/DU:30044750/yu-reliableknowledge-evid-2012.pdf

http://doi.org/10.1007/978-1-4614-1903-7_8

Rights

2012, Springer

Type

Book Chapter