FreeS: A fast algorithm to discover frequent free subtrees using a novel canonical form


Autoria(s): Chowdhury, Israt J.; Nayak, Richi
Contribuinte(s)

Wang, Jianyong

Cellary, Wojciech

Data(s)

2015

Resumo

Web data can often be represented in free tree form; however, free tree mining methods seldom exist. In this paper, a computationally fast algorithm FreeS is presented to discover all frequently occurring free subtrees in a database of labelled free trees. FreeS is designed using an optimal canonical form, BOCF that can uniquely represent free trees even during the presence of isomorphism. To avoid enumeration of false positive candidates, it utilises the enumeration approach based on a tree-structure guided scheme. This paper presents lemmas that introduce conditions to conform the generation of free tree candidates during enumeration. Empirical study using both real and synthetic datasets shows that FreeS is scalable and significantly outperforms (i.e. few orders of magnitude faster than) the state-of-the-art frequent free tree mining algorithms, HybridTreeMiner and FreeTreeMiner.

Formato

application/pdf

Identificador

http://eprints.qut.edu.au/93069/

Publicador

Springer

Relação

http://eprints.qut.edu.au/93069/8/93069.pdf

DOI:10.1007/978-3-319-26190-4_9

Chowdhury, Israt J. & Nayak, Richi (2015) FreeS: A fast algorithm to discover frequent free subtrees using a novel canonical form. In Wang, Jianyong & Cellary, Wojciech (Eds.) Web Information Systems Engineering – WISE 2015, Springer, Miami, FL, pp. 123-137.

Direitos

2015 Springer International Publishing Switzerland

Fonte

School of Electrical Engineering & Computer Science; Science & Engineering Faculty

Palavras-Chave #Web data #Free tree #Canonical form #Enumeration approach #Graph mining
Tipo

Conference Paper