17 resultados para streaming


Relevância:

10.00% 10.00%

Publicador:

Resumo:

An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data imposes unique challenges to data mining algorithms, such as concept drifts, the need to analyse the data on the fly due to unbounded data streams and scalable algorithms due to potentially high throughput of data. Real-time classification algorithms that are adaptive to concept drifts and fast exist, however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary based on Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. Also MC-NN is competitive compared with existing data stream classifiers in terms of accuracy and speed.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In order to gain insights into events and issues that may cause errors and outages in parts of IP networks, intelligent methods that capture and express causal relationships online (in real-time) are needed. Whereas generalised rule induction has been explored for non-streaming data applications, its application and adaptation on streaming data is mostly undeveloped or based on periodic and ad-hoc training with batch algorithms. Some association rule mining approaches for streaming data do exist, however, they can only express binary causal relationships. This paper presents the ongoing work on Online Generalised Rule Induction (OGRI) in order to create expressive and adaptive rule sets real-time that can be applied to a broad range of applications, including network telemetry data streams.