Clustering Based Large Margin Classification: A Scalable Approach using SOCP Formulation


Author(s): Nath, Saketha J; Bhattacharyya, C; Murty, MN
Date(s)

2006

Abstract

This paper presents a novel Second Order Cone Programming (SOCP) formulation for large-scale binary classification tasks. Assuming that the class-conditional densities are mixture distributions in which each component has a spherical covariance, the second-order statistics of the components can be estimated efficiently using clustering algorithms such as BIRCH. For each cluster, the second-order moments are used to derive a second-order cone constraint via a Chebyshev-Cantelli inequality. This constraint ensures that any data point in the cluster is classified correctly with high probability. The result is a large-margin SOCP formulation whose size depends on the number of clusters rather than the number of training data points. Hence, the proposed formulation scales well to large datasets compared with state-of-the-art Support Vector Machine (SVM) classifiers. Experiments on real-world and synthetic datasets show that the proposed algorithm outperforms SVM solvers in training time while achieving similar accuracy.
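
The per-cluster constraint described in the abstract can be illustrated with a minimal sketch. It assumes the standard one-sided Chebyshev (Chebyshev-Cantelli) form for a cluster with mean mu, spherical variance sigma^2 and label y: y * (w.mu - b) >= 1 + kappa * sigma * ||w||, with kappa = sqrt(eta / (1 - eta)), which guarantees correct classification with probability at least eta. The function names, the parameter eta, and the omission of slack variables are assumptions made for illustration; the abstract does not give the exact formulation used in the paper.

```python
import math

def kappa(eta):
    """Chebyshev-Cantelli multiplier sqrt(eta / (1 - eta)) for 0 <= eta < 1."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("eta must lie in [0, 1)")
    return math.sqrt(eta / (1.0 - eta))

def cluster_stats(points):
    """Return (mean, spherical variance) of one cluster.

    The spherical variance is the average per-coordinate variance,
    i.e. the covariance is modelled as var * I (illustrative estimator).
    """
    n, d = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(d)]
    sq_dev = sum((p[k] - mean[k]) ** 2 for p in points for k in range(d))
    return mean, sq_dev / (n * d)

def cone_constraint_holds(w, b, mean, var, label, eta):
    """Check label * (w.mean - b) >= 1 + kappa(eta) * sigma * ||w||."""
    margin = label * (sum(wk * mk for wk, mk in zip(w, mean)) - b)
    norm_w = math.sqrt(sum(wk * wk for wk in w))
    return margin >= 1.0 + kappa(eta) * math.sqrt(var) * norm_w

# Hypothetical positive cluster and a candidate hyperplane w.x - b = 0.
cluster = [(2.0, 0.0), (4.0, 0.0), (3.0, 1.0), (3.0, -1.0)]
mean, var = cluster_stats(cluster)
ok_at_half = cone_constraint_holds((1.0, 0.0), 0.0, mean, var, +1, 0.5)
ok_at_ninety = cone_constraint_holds((1.0, 0.0), 0.0, mean, var, +1, 0.9)
```

In this sketch one such constraint is checked per cluster rather than per data point, which is why the size of the resulting program depends on the number of clusters. A full solver would minimise over w and b subject to all cluster constraints; that step is not shown here.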

Format

application/pdf

Identifier

http://eprints.iisc.ernet.in/41966/1/Clustering_Based.pdf

Nath, Saketha J and Bhattacharyya, C and Murty, MN (2006) Clustering Based Large Margin Classification: A Scalable Approach using SOCP Formulation. In: KDD '06 Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, 2006, New York, NY.

Publisher

ACM Press

Relation

http://dl.acm.org/citation.cfm?doid=1150402.1150486

http://eprints.iisc.ernet.in/41966/

Keywords

Computer Science & Automation (Formerly, School of Automation)

Type

Conference Paper

PeerReviewed