Filtering duplicate items over distributed data streams


Author(s): Xia, Tian; Jin, Cheqing; Zhou, Xiaofang; Zhou, Aoying
Contributor(s)

Wenfei Fan

Zhaohui Wu

Jun Yang

Date(s)

01/01/2005

Abstract

In recent years, many real-time applications have needed to handle data streams. We consider distributed environments in which remote data sources continuously collect data from the real world or from other data sources and push the data to a central stream processor. In such environments, transmitting rapid, high-volume, and time-varying data streams induces significant communication cost and also imposes computing overhead on the central processor. In this paper, we develop a novel filtering approach, called the DTFilter approach, for evaluating windowed distinct queries in such a distributed system. The DTFilter approach is based on a search algorithm that uses a data structure of two height-balanced trees; it avoids transmitting duplicate items in data streams, thereby saving substantial network resources. In addition, we provide a theoretical analysis of the search time and of the amount of memory needed. Extensive experiments also show that the DTFilter approach achieves high performance.
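
The idea of suppressing duplicates at the remote source before transmission can be illustrated with a minimal sketch. This is not the DTFilter algorithm itself: the abstract gives no details of its two height-balanced trees, so the sketch below swaps in a single insertion-ordered dictionary, and it assumes a time-based sliding window and non-decreasing timestamps. The names `WindowedDuplicateFilter`, `offer`, and `filter_stream` are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class WindowedDuplicateFilter:
    """Source-side filter (illustrative only, not the paper's DTFilter).

    An item is forwarded only if the same value has not already been
    forwarded within the last `window` time units.
    Assumes timestamps arrive in non-decreasing order.
    """

    def __init__(self, window):
        self.window = window
        # value -> timestamp at which it was last forwarded.
        # Entries are inserted in timestamp order, so the oldest is first.
        self._sent = OrderedDict()

    def offer(self, value, timestamp):
        """Return True if the item should be transmitted, False if suppressed."""
        cutoff = timestamp - self.window
        # Expire entries that have slid out of the window (timestamp - window, timestamp].
        while self._sent:
            _, oldest_ts = next(iter(self._sent.items()))
            if oldest_ts > cutoff:
                break
            self._sent.popitem(last=False)
        if value in self._sent:
            return False  # duplicate within the window: do not transmit
        self._sent[value] = timestamp
        return True


def filter_stream(items, window):
    """Return the (value, timestamp) pairs that would be sent to the central processor."""
    duplicate_filter = WindowedDuplicateFilter(window)
    return [(v, t) for v, t in items if duplicate_filter.offer(v, t)]


if __name__ == "__main__":
    stream = [("a", 1), ("b", 2), ("a", 3), ("a", 12), ("b", 13)]
    print(filter_stream(stream, window=10))
```

In this sketch, each arrival costs amortized constant time and memory grows with the number of distinct values forwarded within one window. A tree-based structure such as the one the abstract mentions would presumably give different trade-offs, but that comparison is outside what the abstract states.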

Identifier

http://espace.library.uq.edu.au/view/UQ:103184

Language(s)

eng

Publisher

Springer Berlin/Heidelberg

Keywords

#E1 #280108 Database Management #700103 Information processing services

Type

Conference Paper