K-tree : large scale document clustering
Date(s) |
24/07/2009 |
---|---|
Abstract |
We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare its performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. Its tree structure allows for efficient disk-based implementations where space requirements exceed the capacity of main memory. |
Format |
application/pdf |
Identifier | |
Publisher |
ACM |
Relation |
http://eprints.qut.edu.au/26491/2/26491.pdf DOI:10.1145/1571941.1572094 De Vries, Christopher Michael & Geva, Shlomo (2009) K-tree : large scale document clustering. In 32nd international ACM SIGIR Conference on Research and Development in Information Retrieval, 19-23 July 2009, Boston, MA. |
Rights |
Copyright 2009 The Authors (c) 2009 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Source |
Faculty of Science and Technology |
Keywords | #080109 Pattern Recognition and Data Mining #080201 Analysis of Algorithms and Complexity #080704 Information Retrieval and Web Search #K-tree #k-means #clustering #document clustering #search tree #performance #algorithm |
Type |
Conference Paper |