J48Consolidated: an implementation of CTC algorithm for WEKA


Author(s): Arbelaiz Gallego, Olatz; Gurrutxaga, Ibai; Lozano, Fernando; Muguerza, Javier; Pérez, Jesús M.
Date(s)

12/02/2016

Abstract

The Consolidated Tree Construction (CTC) algorithm is a machine learning paradigm designed to solve a class imbalance problem: fraud detection in car insurance [1], where an explanation of each classification was also required. The algorithm is based on a decision tree construction algorithm, in this case the well-known C4.5, but it extracts knowledge from the data using a set of samples instead of the single sample that C4.5 uses. In contrast to other methodologies that build a classifier from several samples, such as bagging, CTC builds a single tree and, as a consequence, yields comprehensible classifiers. The main motivation of this work is to make an implementation of the CTC algorithm publicly available. To this end, we have implemented the algorithm within the well-known WEKA data mining environment (http://www.cs.waikato.ac.nz/ml/weka/). WEKA is an open source project that contains a collection of machine learning algorithms written in Java for data mining tasks. J48 is the implementation of the C4.5 algorithm within the WEKA package. We named our implementation of the CTC algorithm J48Consolidated because it is based on the J48 Java class.
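
The idea of building one tree from several samples can be illustrated with a small, self-contained Java sketch of a single node decision. It assumes that the samples agree on a split by majority vote, a detail the abstract does not spell out, and it uses plain information gain as a simplified stand-in for the C4.5 split criterion. The class and method names are hypothetical and are not part of WEKA or of the J48Consolidated package.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one consolidated split decision over several samples.
// Each row is an int[] of nominal attribute values; the last entry is the class.
public class ConsolidatedSplitSketch {

    // Shannon entropy (in bits) of the class label in the given rows.
    static double entropy(List<int[]> rows) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int[] r : rows) counts.merge(r[r.length - 1], 1, Integer::sum);
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / rows.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Information gain obtained by partitioning the rows on one attribute.
    static double infoGain(List<int[]> rows, int attr) {
        Map<Integer, List<int[]>> parts = new HashMap<>();
        for (int[] r : rows) parts.computeIfAbsent(r[attr], k -> new ArrayList<>()).add(r);
        double remainder = 0.0;
        for (List<int[]> part : parts.values())
            remainder += (double) part.size() / rows.size() * entropy(part);
        return entropy(rows) - remainder;
    }

    // Split attribute proposed by a single sample (highest information gain).
    static int bestAttribute(List<int[]> rows) {
        int numAttrs = rows.get(0).length - 1;
        int best = 0;
        double bestGain = -1.0;
        for (int a = 0; a < numAttrs; a++) {
            double g = infoGain(rows, a);
            if (g > bestGain) { bestGain = g; best = a; }
        }
        return best;
    }

    // Assumed consolidation step: every sample votes, the most voted attribute wins
    // (ties go to the lowest attribute index).
    static int consolidatedVote(List<List<int[]>> samples) {
        int numAttrs = samples.get(0).get(0).length - 1;
        int[] votes = new int[numAttrs];
        for (List<int[]> s : samples) votes[bestAttribute(s)]++;
        int winner = 0;
        for (int a = 1; a < numAttrs; a++) if (votes[a] > votes[winner]) winner = a;
        return winner;
    }

    public static void main(String[] args) {
        // Three toy samples with two attributes each; two of them favour attribute 0.
        List<int[]> s1 = List.of(new int[]{0, 0, 0}, new int[]{0, 1, 0}, new int[]{1, 0, 1}, new int[]{1, 1, 1});
        List<int[]> s2 = List.of(new int[]{0, 1, 0}, new int[]{0, 1, 0}, new int[]{1, 0, 1}, new int[]{1, 1, 1});
        List<int[]> s3 = List.of(new int[]{0, 0, 0}, new int[]{1, 0, 0}, new int[]{0, 1, 1}, new int[]{1, 1, 1});
        System.out.println("Consolidated split attribute: " + consolidatedVote(List.of(s1, s2, s3)));
    }
}
```

In this toy run two of the three samples propose attribute 0, so the single shared node splits on attribute 0 in every sample. Applying the same agreed split to all samples at each node is what would keep the result a single tree rather than an ensemble of different trees.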

Identifier

EHU-KAT-IK-05-13

http://hdl.handle.net/10810/17314

Language(s)

eng

Rights

info:eu-repo/semantics/openAccess

Keywords

#comprehensibility #consolidated decision trees #class imbalance #resampling #inner ensembles #CTC algorithm #WEKA #J48Consolidated package

Type

info:eu-repo/semantics/report