Improving Multiclass Text Classification with the Support Vector Machine


Author(s): Rennie, Jason D. M.; Rifkin, Ryan
Date(s)

20/10/2004

20/10/2004

16/10/2001

Abstract

We compare Naive Bayes and Support Vector Machines on the task of multiclass text classification. Using a variety of approaches to combine the underlying binary classifiers, we find that SVMs substantially outperform Naive Bayes. We present full multiclass results on two well-known text data sets, including the lowest error to date on both data sets. We develop a new indicator of binary performance to show that the SVM's lower multiclass error is a result of its improved binary performance. Furthermore, we demonstrate and explore the surprising result that one-vs-all classification performs favorably compared to other approaches even though it has no error-correcting properties.
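
As a rough illustration of the one-vs-all scheme mentioned in the abstract, the sketch below combines one binary scorer per class and predicts the class whose scorer gives the highest output. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `linear_scorer` and `one_vs_all_predict`, the three class labels, and all weights are hypothetical, and a real system would learn each scorer (for example an SVM or Naive Bayes model) from training data.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a binary scorer maps a feature vector to a real-valued
# confidence that the document belongs to its class (e.g. an SVM margin).
BinaryScorer = Callable[[Sequence[float]], float]


def linear_scorer(weights: Sequence[float], bias: float) -> BinaryScorer:
    """Build a toy linear scorer w.x + b (stand-in for a trained binary classifier)."""
    def score(x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score


def one_vs_all_predict(scorers: Dict[str, BinaryScorer], x: Sequence[float]) -> str:
    """Return the label whose binary scorer produces the largest output for x."""
    return max(scorers, key=lambda label: scorers[label](x))


# Toy weights invented for illustration only; one scorer per class.
scorers = {
    "sports": linear_scorer([1.0, -0.5, 0.0], -0.1),
    "politics": linear_scorer([-0.5, 1.0, 0.0], -0.1),
    "science": linear_scorer([0.0, -0.2, 1.0], -0.1),
}

doc = [0.2, 0.1, 0.9]  # hypothetical three-feature document vector
predicted = one_vs_all_predict(scorers, doc)
print(predicted)
```

Because the final decision is a plain maximum over the binary outputs, the scheme has no redundancy to correct a mistaken scorer, which is why its competitive accuracy is described as surprising.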

Format

14 p.

1240992 bytes

1091543 bytes

application/postscript

application/pdf

Identifier

AIM-2001-026

CBCL-210

http://hdl.handle.net/1721.1/7241

Language(s)

en_US

Relation

AIM-2001-026

CBCL-210

Keywords #AI #text classification #support vector machine #multiclass classification