Efficient supervised optimum-path forest classification for large datasets


Author(s): Papa, João Paulo; Falcao, Alexandre X.; de Albuquerque, Victor Hugo C.; Tavares, Joao Manuel R. S.
Contributor(s)

Universidade Estadual Paulista (UNESP)

Date(s)

20/05/2014

20/05/2014

01/01/2012

Abstract

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

FAPESP grant: 09/16206-1

FAPESP grant: 07/52015-0

Today's data acquisition technologies produce large datasets with millions of samples for statistical analysis. This poses a tremendous challenge for pattern recognition techniques, which need to become more efficient without losing their effectiveness. We have tried to circumvent the problem by reducing it to the fast computation of an optimum-path forest (OPF) in a graph derived from the training samples. In this forest, each class may be represented by multiple trees rooted at some representative samples. The forest is a classifier that assigns to a new sample the label of its most strongly connected root. The methodology has been successfully used with different graph topologies and learning techniques. In this work, we focus on one of the supervised approaches, which has offered considerable advantages over Support Vector Machines and Artificial Neural Networks in handling large datasets. We propose (i) a new algorithm that speeds up classification and (ii) a solution that reduces the training set size with negligible effect on classification accuracy, thereby further increasing efficiency. Experimental results show improvements over our previous approach and advantages over other existing methods, which make the new method a valuable contribution to large dataset analysis. © 2011 Elsevier Ltd. All rights reserved.
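The classification idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: a complete graph with Euclidean arc weights, a path cost equal to the largest arc weight along the path, and roots (prototypes) taken from minimum-spanning-tree arcs that join samples of different classes. The early exit over cost-sorted training nodes is one plausible way to speed up classification and is likewise an assumption. The names find_prototypes, train, and classify are hypothetical.

```python
import heapq
import math


def find_prototypes(X, y):
    """Assumed root selection: run Prim's MST on the complete graph and
    keep both endpoints of every MST arc that joins two different classes."""
    n = len(X)
    in_tree = [False] * n
    key = [math.inf] * n
    parent = [-1] * n
    key[0] = 0.0
    prototypes = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=key.__getitem__)
        in_tree[u] = True
        if parent[u] != -1 and y[parent[u]] != y[u]:
            prototypes.update((u, parent[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(X[u], X[v])
                if d < key[v]:
                    key[v], parent[v] = d, u
    return prototypes


def train(X, y):
    """Grow the forest with a Dijkstra-like pass in which the cost of a
    path is its largest arc weight. Returns (cost, label, sample) tuples
    sorted by increasing cost."""
    n = len(X)
    # Fall back to sample 0 as the only root if there is a single class.
    prototypes = find_prototypes(X, y) or {0}
    cost = [math.inf] * n
    label = list(y)
    done = [False] * n
    heap = []
    for p in prototypes:
        cost[p] = 0.0
        heapq.heappush(heap, (0.0, p))
    while heap:
        c, s = heapq.heappop(heap)
        if done[s]:
            continue  # stale heap entry
        done[s] = True
        for t in range(n):
            if not done[t]:
                offered = max(c, math.dist(X[s], X[t]))
                if offered < cost[t]:
                    cost[t] = offered
                    label[t] = label[s]  # inherit the root's label
                    heapq.heappush(heap, (offered, t))
    order = sorted(range(n), key=cost.__getitem__)
    return [(cost[i], label[i], X[i]) for i in order]


def classify(model, x):
    """Return the label of the training node offering the cheapest path.
    Nodes are visited by increasing cost, so the scan can stop as soon as
    a node's own cost is no better than the best offer found so far."""
    best_cost, best_label = math.inf, None
    for c, lab, s in model:
        if c >= best_cost:
            break  # max(c, distance) >= c, so no later node can win
        offered = max(c, math.dist(s, x))
        if offered < best_cost:
            best_cost, best_label = offered, lab
    return best_label
```

Under these assumptions, training costs quadratic time in the number of samples, while the sorted scan in classify often inspects only a fraction of the training nodes. The training-set reduction mentioned in (ii) is not sketched here because the abstract does not describe how it is performed.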

Format

512-520

Identifier

http://dx.doi.org/10.1016/j.patcog.2011.07.013

Pattern Recognition. Oxford: Elsevier B.V., v. 45, n. 1, p. 512-520, 2012.

0031-3203

http://hdl.handle.net/11449/8284

10.1016/j.patcog.2011.07.013

WOS:000295760700042

Language(s)

eng

Publisher

Elsevier B.V.

Relation

Pattern Recognition

Rights

closedAccess

Keywords #Optimum-path forest classifiers #Support vector machines #Artificial neural networks #Pattern recognition #Machine learning
Type

info:eu-repo/semantics/article