Cross-Guided Clustering: Transfer of Relevant Supervision across Tasks


Author(s): Bhattacharya, Indrajit; Godbole, Shantanu; Joshi, Sachindra; Verma, Ashish
Date(s)

01/07/2012

Abstract

Lack of supervision in clustering algorithms often leads to clusters that are not useful or interesting to human reviewers. We investigate whether supervision can be automatically transferred for clustering a target task by providing a relevant supervised partitioning of a dataset from a different source task. The target clustering is made more meaningful for the human user by trading off intrinsic clustering goodness on the target task for alignment with relevant supervised partitions in the source task, wherever possible. We propose a cross-guided clustering algorithm that builds on traditional k-means by aligning the target clusters with source partitions. The alignment process makes use of a cross-task similarity measure that discovers hidden relationships across tasks. When the source and target tasks correspond to different domains with potentially different vocabularies, we propose a projection approach using pivot vocabularies for the cross-domain similarity measure. Using multiple real-world and synthetic datasets, we show that our approach improves clustering accuracy significantly over traditional k-means and state-of-the-art semi-supervised clustering baselines, over a wide range of data characteristics and parameter settings.
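
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes that source partitions are summarized by centroids in the same feature space as the target data, that "alignment" means pulling each target centroid toward its nearest source centroid, and that a hypothetical weight `lam` controls the trade-off. The function name `cross_guided_kmeans` and all parameters are illustrative assumptions.

```python
import numpy as np


def cross_guided_kmeans(X, source_centroids, k, lam=1.0, n_iter=20, seed=0):
    """Toy k-means variant whose centroids are pulled toward source centroids.

    Assumption (not from the paper): alignment is modeled as a squared-distance
    penalty, weighted by `lam`, between each target centroid and its nearest
    source centroid. With lam=0 this reduces to plain k-means.
    """
    X = np.asarray(X, dtype=float)
    S = np.asarray(source_centroids, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize target centroids from k distinct target points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()

    for _ in range(n_iter):
        # Assignment step: nearest target centroid for every target point.
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)

        # Alignment step: match each target centroid to its nearest source centroid.
        cross = ((centroids[:, None, :] - S[None, :, :]) ** 2).sum(axis=2)
        match = cross.argmin(axis=1)

        # Update step: weighted mean of the cluster members and the matched source centroid.
        for j in range(k):
            members = X[labels == j]
            weight = len(members) + lam
            if weight == 0:
                continue  # empty cluster and no pull: keep the old centroid
            centroids[j] = (members.sum(axis=0) + lam * S[match[j]]) / weight

    # Final assignment against the converged centroids.
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), centroids
```

Larger values of `lam` favor agreement with the source partitions over intrinsic cluster compactness on the target data. The cross-domain case in the abstract, where vocabularies differ, would additionally require projecting both tasks onto shared pivot features before comparing centroids; that step is not shown here.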

Format

application/pdf

Identifier

http://eprints.iisc.ernet.in/45067/1/acm_tkdd_6-2_2012.pdf

Bhattacharya, Indrajit and Godbole, Shantanu and Joshi, Sachindra and Verma, Ashish (2012) Cross-Guided Clustering: Transfer of Relevant Supervision across Tasks. In: ACM Transactions on Knowledge Discovery from Data, 6 (2).

Publisher

Association for Computing Machinery

Relation

http://dx.doi.org/10.1145/2297456.2297461

http://eprints.iisc.ernet.in/45067/

Keywords

Computer Science & Automation (Formerly, School of Automation)

Type

Journal Article

PeerReviewed