Feature Discretization with Relevance and Mutual Information Criteria


Author(s): Ferreira, Artur Jorge; Figueiredo, Mário A. T.
Date(s)

21/04/2016

21/04/2016

2015

Abstract

Feature discretization (FD) techniques often yield adequate and compact representations of the data, suitable for machine learning and pattern recognition problems. Compared with the original features, these representations usually decrease training time and yield higher classification accuracy, while allowing humans to better understand and visualize the data. This paper proposes two new FD techniques. The first is based on the well-known Linde-Buzo-Gray quantization algorithm coupled with a relevance criterion, and can perform unsupervised, supervised, or semi-supervised discretization. The second works in supervised mode and is based on maximizing the mutual information between each discrete feature and the class label. Our experimental results on standard benchmark datasets show that these techniques scale up to high-dimensional data, attaining in many cases better accuracy than existing unsupervised and supervised FD approaches, while using fewer discretization intervals.
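
To make the mutual information criterion mentioned in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It estimates the empirical mutual information between a discretized feature and the class label, and then picks the smallest number of equal-width intervals that comes close to the best score. The function names, the equal-width quantizer, and the `tol` stopping heuristic are all assumptions introduced for illustration; the paper's actual procedure is not described in this record.

```python
import numpy as np


def mutual_information(x_disc, y):
    """Empirical mutual information I(X;Y), in bits, between two discrete arrays."""
    x_disc = np.asarray(x_disc).ravel()
    y = np.asarray(y).ravel()
    _, x_idx = np.unique(x_disc, return_inverse=True)
    _, y_idx = np.unique(y, return_inverse=True)
    # Joint histogram of (feature code, class label), normalized to a distribution.
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of the feature
    py = joint.sum(axis=0, keepdims=True)  # marginal of the label
    nz = joint > 0
    mi = np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz]))
    return max(0.0, float(mi))  # clip tiny negative rounding errors


def uniform_discretize(x, n_bins):
    """Equal-width discretization of a 1-D feature into codes 0..n_bins-1 (assumed quantizer)."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    return np.digitize(x, edges[1:-1])  # interior edges only


def pick_bins_by_mi(x, y, max_bins=8, tol=0.95):
    """Hypothetical heuristic: smallest bin count reaching tol * best MI."""
    scores = {b: mutual_information(uniform_discretize(x, b), y)
              for b in range(2, max_bins + 1)}
    best = max(scores.values())
    for b in sorted(scores):
        if scores[b] >= tol * best:
            return b, scores[b]


if __name__ == "__main__":
    x = np.array([0.0, 0.1, 0.9, 1.0])
    y = np.array([0, 0, 1, 1])
    print(pick_bins_by_mi(x, y, max_bins=4))
```

Preferring the smallest interval count that nearly matches the best score mirrors the abstract's emphasis on using fewer discretization intervals, though the threshold used here is arbitrary.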

Identifier

FERREIRA, Artur J.; FIGUEIREDO, Mário A. T. - Feature Discretization with Relevance and Mutual Information Criteria. Pattern Recognition Applications and Methods. Barcelona: SPRINGER-VERLAG BERLIN, 2015. ISBN 978-3-319-12610-4. Vol. 318, pp. 101-118.

978-3-319-12610-4

978-3-319-12609-8

2194-5357

http://hdl.handle.net/10400.21/6073

10.1007/978-3-319-12610-4_7

Language(s)

eng

Publisher

SPRINGER-VERLAG BERLIN

Relation

http://link.springer.com/chapter/10.1007%2F978-3-319-12610-4_7

Rights

closedAccess

Keywords

#Classification #Feature discretization #Linde-Buzo-Gray #Mutual information #Quantization #Relevance #Supervised learning

Type

conferenceObject