HCX : an efficient hybrid clustering approach for XML documents


Author(s): Kutty, Sangeetha; Nayak, Richi; Li, Yuefeng
Date(s)

2009

Abstract

This paper proposes a novel Hybrid Clustering approach for XML documents (HCX) that first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. The empirical analysis reveals that the proposed method is scalable and accurate.

Format

application/pdf

Identifier

http://eprints.qut.edu.au/29654/

Relation

http://eprints.qut.edu.au/29654/2/29654.pdf

DOI:10.1145/1600193.1600213

Kutty, Sangeetha, Nayak, Richi, & Li, Yuefeng (2009) HCX : an efficient hybrid clustering approach for XML documents. In 9th ACM Symposium on Document Engineering : DocEng 2009, 15-18 September 2009, Munich, Germany.

Rights

Copyright 2009 Association for Computing Machinery

Source

Faculty of Science and Technology; School of Information Technology

Keywords

#080109 Pattern Recognition and Data Mining #XML documents #Clustering #Frequent mining #Subtree mining #Structure and content

Type

Conference Paper