999 results for ID3 algorithm


Relevance:

20.00%

Publisher:

Abstract:

This document describes an update to the implementation of the J48Consolidated class within the WEKA platform. The J48Consolidated class implements the CTC algorithm [2][3], which builds a single decision tree from a set of samples. It extends WEKA’s J48 class, which implements the well-known C4.5 algorithm. The original implementation was described in the technical report "J48Consolidated: An implementation of CTC algorithm for WEKA". The main change in this update, though not the only one, is the integration of the notion of coverage to determine the number of samples that must be generated to build a consolidated tree. We define coverage as the percentage of examples of the training sample that are present in, or covered by, the set of generated subsamples. Depending on the type of samples used, more or fewer samples will therefore be needed to achieve a specific coverage value.
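The coverage definition above can be illustrated with a minimal sketch. It is not the J48Consolidated implementation: it assumes subsamples are drawn uniformly at random without replacement from the training sample, which may differ from the sample types the report considers, and the function names (`coverage`, `samples_needed`, `expected_coverage`) are hypothetical.

```python
import random


def coverage(n_examples, subsamples):
    """Percentage of the n_examples training examples (indexed 0..n-1)
    that appear in at least one of the given subsamples."""
    covered = set()
    for sample in subsamples:
        covered.update(sample)
    return 100.0 * len(covered) / n_examples


def samples_needed(n_examples, subsample_size, target_coverage, seed=0):
    """Draw subsamples until the target coverage (in percent) is reached
    and return how many were drawn.

    Assumption: each subsample is drawn uniformly without replacement.
    """
    if not 0 < subsample_size <= n_examples:
        raise ValueError("subsample_size must be in 1..n_examples")
    if not 0 <= target_coverage <= 100:
        raise ValueError("target_coverage must be in 0..100")
    rng = random.Random(seed)
    covered = set()
    count = 0
    while 100.0 * len(covered) / n_examples < target_coverage:
        covered.update(rng.sample(range(n_examples), subsample_size))
        count += 1
    return count


def expected_coverage(n_examples, subsample_size, n_samples):
    """Expected coverage (in percent) under the same uniform-sampling
    assumption: an example is missed by every subsample with probability
    (1 - subsample_size / n_examples) ** n_samples."""
    miss = (1.0 - subsample_size / n_examples) ** n_samples
    return 100.0 * (1.0 - miss)
```

Under this assumption, smaller subsamples (for example, subsamples reduced in size to balance the classes) cover a smaller fraction of the training sample each, so more of them are needed to reach the same coverage value.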

Relevance:

20.00%

Publisher:

Abstract:

The CTC (Consolidated Tree Construction) algorithm is a machine learning paradigm that was designed to solve a class imbalance problem, a fraud detection problem in the area of car insurance [1], where an explanation of the classification made was also required. The algorithm is based on a decision tree construction algorithm, in this case the well-known C4.5, but it extracts knowledge from data using a set of samples instead of a single one as C4.5 does. In contrast to other methodologies that use several samples to build a classifier, such as bagging, CTC builds a single tree and, as a consequence, obtains comprehensible classifiers. The main motivation of this work is to make an implementation of the CTC algorithm publicly available. For this purpose we have implemented the algorithm within the well-known WEKA data mining environment (http://www.cs.waikato.ac.nz/ml/weka/). WEKA is an open source project that contains a collection of machine learning algorithms written in Java for data mining tasks. J48 is the implementation of the C4.5 algorithm within the WEKA package. We named our implementation of the CTC algorithm, which is based on the J48 Java class, J48Consolidated.

Relevance:

20.00%

Publisher:

Abstract:

Data recovered from 11 pop-up satellite archival tags and 3 surgically implanted archival tags were used to analyze the movement patterns of juvenile northern bluefin tuna (Thunnus thynnus orientalis) in the eastern Pacific. The light sensors on archival and pop-up satellite transmitting archival tags (PSATs) provide data on the time of sunrise and sunset, allowing the calculation of an approximate geographic position of the animal. Light-based estimates of longitude are relatively robust, but latitude estimates are prone to large errors, particularly near the equinoxes and when the tag is at low latitudes. Estimating latitude remains a problem for researchers using light-based geolocation algorithms, and it has been suggested that sea surface temperature data from satellites may be a useful tool for refining latitude estimates. Tag data from bluefin tuna were subjected to a newly developed algorithm, called “PSAT Tracker,” which automatically matches sea surface temperature data from the tags with sea surface temperatures recorded by satellites. The results of this algorithm compared favorably to the estimates of latitude calculated with the light-based algorithms and allowed for estimation of fish positions during times of the year when the light-based algorithms failed. Three tracks of nearly one year each produced by PSAT Tracker showed that the fish range from the California-Oregon border to southern Baja California, Mexico, and that the majority of time is spent off the coast of central Baja Mexico. A seasonal movement pattern was evident: the fish spend winter and spring off central Baja California, and spend summer through fall moving northward to Oregon and returning to Baja California.