Fast adaptive real-time classification for data streams with concept drift


Author(s): Tennant, Mark; Stahl, Frederic; Gomes, João Bártolo
Date(s)

2015

Abstract

An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data imposes unique challenges on data mining algorithms: concept drift, the need to analyse data on the fly because streams are unbounded, and the need for scalable algorithms because data throughput is potentially high. Real-time classification algorithms that are fast and adaptive to concept drift exist; however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier, which is built on an adaptive statistical data summary in the form of Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. MC-NN is also competitive with existing data stream classifiers in terms of accuracy and speed.
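
As a rough illustration of the idea named in the abstract (classifying stream instances against per-class statistical summaries rather than stored instances), the following is a minimal sketch and not the authors' MC-NN implementation. It assumes each micro-cluster keeps only a count and a per-feature linear sum; the names MicroCluster, learn_one and predict_one are hypothetical, and the paper's drift-adaptation mechanism is deliberately omitted because the abstract does not describe it.

```python
from dataclasses import dataclass, field
import math


@dataclass
class MicroCluster:
    """Hypothetical summary of one class region: a count and per-feature sums."""
    label: str
    n: int = 0
    linear_sum: list = field(default_factory=list)

    def add(self, x):
        # Fold one instance into the summary; the instance itself is not stored.
        if not self.linear_sum:
            self.linear_sum = [0.0] * len(x)
        self.linear_sum = [s + v for s, v in zip(self.linear_sum, x)]
        self.n += 1

    def centroid(self):
        return [s / self.n for s in self.linear_sum]


def nearest(clusters, x):
    # Euclidean nearest centroid; each distance is independent of the others,
    # which is the property that makes this step easy to parallelise.
    return min(clusters, key=lambda c: math.dist(c.centroid(), x))


def learn_one(clusters, x, y):
    # Assumption: absorb the instance into the nearest cluster of its own
    # class, or open a new cluster the first time a class is seen.
    same_class = [c for c in clusters if c.label == y]
    if same_class:
        nearest(same_class, x).add(x)
    else:
        mc = MicroCluster(label=y)
        mc.add(x)
        clusters.append(mc)


def predict_one(clusters, x):
    # Predict the label of the nearest micro-cluster (requires >= 1 cluster).
    return nearest(clusters, x).label
```

In this sketch, memory grows with the number of micro-clusters rather than with the length of the stream, since every update only changes a count and a vector of sums. Calling predict_one before any instance has been learned raises a ValueError, because there is no cluster to compare against.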

Format

text

Identifier

http://centaur.reading.ac.uk/44252/1/IDCs2015-MarkTennant-Paper24-ShortPaper.pdf

Tennant, M., Stahl, F. <http://centaur.reading.ac.uk/view/creators/90005065.html> and Gomes, J. (2015) Fast adaptive real-time classification for data streams with concept drift. In: The 8th International Conference on Internet and Distributed Computing Systems, pp. 265-272.

Language(s)

en

Publisher

Springer International Publishing

Relation

http://centaur.reading.ac.uk/44252/

creatorInternal Stahl, Frederic

http://dx.doi.org/10.1007/978-3-319-23237-9_23

Type

Conference or Workshop Item

PeerReviewed