Improving Multiclass Text Classification with the Support Vector Machine
Date(s) | 20/10/2004 20/10/2004 16/10/2001 |
---|---|
Abstract |
We compare Naive Bayes and Support Vector Machines on the task of multiclass text classification. Using a variety of approaches to combine the underlying binary classifiers, we find that SVMs substantially outperform Naive Bayes. We present full multiclass results on two well-known text data sets, including the lowest error to date on both data sets. We develop a new indicator of binary performance to show that the SVM's lower multiclass error is a result of its improved binary performance. Furthermore, we demonstrate and explore the surprising result that one-vs-all classification performs favorably compared to other approaches even though it has no error-correcting properties. |
Format |
14 p. 1240992 bytes 1091543 bytes application/postscript application/pdf |
Identifier |
AIM-2001-026 CBCL-210 |
Language(s) |
en_US |
Relation |
AIM-2001-026 CBCL-210 |
Keywords | #AI #text classification #support vector machine #multiclass classification |
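
The abstract mentions combining binary classifiers with a one-vs-all scheme. As a rough illustration only, the sketch below shows how such a combination might work: one binary scorer per class, with the predicted label being the class whose scorer returns the highest value. The names `linear_scorer` and `one_vs_all_predict`, the three class labels, and all weights are hypothetical and are not taken from the paper; a trained linear SVM or Naive Bayes model would supply the real scoring functions.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a binary scorer maps a feature vector to a real-valued
# confidence that the document belongs to the scorer's own class.
BinaryScorer = Callable[[Sequence[float]], float]


def linear_scorer(weights: Sequence[float], bias: float) -> BinaryScorer:
    """Build a linear decision function w.x + b (stand-in for a trained model)."""
    def score(x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score


def one_vs_all_predict(scorers: Dict[str, BinaryScorer], x: Sequence[float]) -> str:
    """Return the label whose 'this class vs. the rest' scorer is most confident."""
    return max(scorers, key=lambda label: scorers[label](x))


# Toy weights chosen for illustration; they are assumptions, not learned values.
scorers = {
    "sports": linear_scorer([1.0, -0.5, 0.0], -0.1),
    "politics": linear_scorer([-0.3, 1.2, 0.1], 0.0),
    "science": linear_scorer([0.0, 0.2, 0.9], -0.2),
}

doc = [0.1, 0.8, 0.3]  # hypothetical three-feature document vector
predicted = one_vs_all_predict(scorers, doc)
print(predicted)
```

Unlike error-correcting output codes, this scheme gives each class exactly one binary classifier, so a single badly wrong scorer can change the prediction; that is the property the abstract refers to when it says one-vs-all has no error-correcting properties.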