The heterogeneous cluster ensemble method using hubness for clustering text documents


Author(s): Hou, Jun; Nayak, Richi
Contributor(s)

Lin, Xuemin

Manolopoulos, Yannis

Srivastava, Divesh

Huang, Guangyan

Date(s)

01/08/2013

Abstract

We propose a cluster ensemble method that maps corpus documents into the semantic space embedded in Wikipedia and groups them using multiple types of feature space. A heterogeneous cluster ensemble is constructed from multiple types of relations, i.e., document-term, document-concept and document-category. A final clustering solution is obtained by exploiting associations between document pairs and the hubness of the documents. Empirical analysis with various real data sets reveals that the proposed method outperforms state-of-the-art text clustering approaches.
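
The two ingredients named in the abstract, pairwise document associations and hubness, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a standard co-association matrix (the fraction of base clusterings that place two documents together) and the common k-occurrence definition of hubness (how often a document appears in other documents' k-nearest-neighbour lists). The function names, the toy labelings, and the choice k=2 are all hypothetical.

```python
import numpy as np

def co_association(labelings):
    """Fraction of base clusterings in which each document pair shares a cluster."""
    labelings = np.asarray(labelings)            # shape: (n_clusterings, n_docs)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)                     # shape: (n_docs, n_docs)

def hubness_scores(similarity, k):
    """k-occurrence: how often each document appears in others' k-NN lists."""
    sim = np.array(similarity, dtype=float)      # copy, so the input is untouched
    np.fill_diagonal(sim, -np.inf)               # a document is not its own neighbour
    knn = np.argsort(-sim, axis=1, kind="stable")[:, :k]
    return np.bincount(knn.ravel(), minlength=sim.shape[0])

# Hypothetical base clusterings of five documents, one per relation type.
labelings = [
    [0, 0, 0, 1, 1],   # from document-term features (assumed)
    [0, 0, 1, 1, 1],   # from document-concept features (assumed)
    [0, 0, 0, 1, 1],   # from document-category features (assumed)
]

coassoc = co_association(labelings)
hubs = hubness_scores(coassoc, k=2)
print(coassoc.round(2))
print(hubs)
```

In a consensus step, documents with high hubness scores could plausibly serve as cluster representatives, with the co-association values deciding which representative each remaining document joins; how the paper actually combines the two is not stated in this record.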

Format

application/pdf

Identifier

http://eprints.qut.edu.au/63173/

Publisher

Springer

Relation

http://eprints.qut.edu.au/63173/1/WISE201359.pdf

http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-3-642-41229-5

DOI:10.1007/978-3-642-41230-1_9

Hou, Jun & Nayak, Richi (2013) The heterogeneous cluster ensemble method using hubness for clustering text documents. Lecture Notes in Computer Science [Web Information Systems Engineering - WISE 2013: 14th International Conference, Nanjing, China, October 13-15, 2013, Proceedings, Part I], 8180, pp. 102-110.

Rights

Copyright 2013 Springer

Source

School of Electrical Engineering & Computer Science; Science & Engineering Faculty

Keywords: #080109 Pattern Recognition and Data Mining #Text Clustering #Document Representation #Cluster Ensemble

Type

Journal Article