929 resultados para Classifiers ensemble


Relevância:

60.00% 60.00%

Publicador:

Resumo:

The use of the maps obtained from remote sensing orbital images submitted to digital processing became fundamental to optimize conservation and monitoring actions of the coral reefs. However, the accuracy reached in the mapping of submerged areas is limited by variation of the water column that degrades the signal received by the orbital sensor and introduces errors in the final result of the classification. The limited capacity of the traditional methods based on conventional statistical techniques to solve the problems related to the inter-classes took the search of alternative strategies in the area of the Computational Intelligence. In this work an ensemble classifiers was built based on the combination of Support Vector Machines and Minimum Distance Classifier with the objective of classifying remotely sensed images of coral reefs ecosystem. The system is composed by three stages, through which the progressive refinement of the classification process happens. The patterns that received an ambiguous classification in a certain stage of the process were revalued in the subsequent stage. The prediction non ambiguous for all the data happened through the reduction or elimination of the false positive. The images were classified into five bottom-types: deep water; under-water corals; inter-tidal corals; algal and sandy bottom. The highest overall accuracy (89%) was obtained from SVM with polynomial kernel. The accuracy of the classified image was compared through the use of error matrix to the results obtained by the application of other classification methods based on a single classifier (neural network and the k-means algorithm). In the final, the comparison of results achieved demonstrated the potential of the ensemble classifiers as a tool of classification of images from submerged areas subject to the noise caused by atmospheric effects and the water column

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Different data classification algorithms have been developed and applied in various areas to analyze and extract valuable information and patterns from large datasets with noise and missing values. However, none of them could consistently perform well over all datasets. To this end, ensemble methods have been suggested as the promising measures. This paper proposes a novel hybrid algorithm, which is the combination of a multi-objective Genetic Algorithm (GA) and an ensemble classifier. While the ensemble classifier, which consists of a decision tree classifier, an Artificial Neural Network (ANN) classifier, and a Support Vector Machine (SVM) classifier, is used as the classification committee, the multi-objective Genetic Algorithm is employed as the feature selector to facilitate the ensemble classifier to improve the overall sample classification accuracy while also identifying the most important features in the dataset of interest. The proposed GA-Ensemble method is tested on three benchmark datasets, and compared with each individual classifier as well as the methods based on mutual information theory, bagging and boosting. The results suggest that this GA-Ensemble method outperform other algorithms in comparison, and be a useful method for classification and feature selection problems.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This article is devoted to large multi-tier ensemble classifiers generated as ensembles of ensembles and applied to phishing websites. Our new ensemble construction is a special case of the general and productive multi-tier approach well known in information security. Many efficient multi-tier classifiers have been considered in the literature. Our new contribution is in generating new large systems as ensembles of ensembles by linking a top-tier ensemble to another middletier ensemble instead of a base classifier so that the top~ tier ensemble can generate the whole system. This automatic generation capability includes many large ensemble classifiers in two tiers simultaneously and automatically combines them into one hierarchical unified system so that one ensemble is an integral part of another one. This new construction makes it easy to set up and run such large systems. The present article concentrates on the investigation of performance of these new multi~tier ensembles for the example of detection of phishing websites. We carried out systematic experiments evaluating several essential ensemble techniques as well as more recent approaches and studying their performance as parts of multi~level ensembles with three tiers. The results presented here demonstrate that new three-tier ensemble classifiers performed better than the base classifiers and standard ensembles included in the system. This example of application to the classification of phishing websites shows that the new method of combining diverse ensemble techniques into a unified hierarchical three-tier ensemble can be applied to increase the performance of classifiers in situations where data can be processed on a large computer.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper is devoted to multi-tier ensemble classifiers for the detection and filtering of phishing emails. We introduce a new construction of ensemble classifiers, based on the well known and productive multi-tier approach. Our experiments evaluate their performance for the detection and filtering of phishing emails. The multi-tier constructions are well known and have been used to design effective classifiers for email classification and other applications previously. We investigate new multi-tier ensemble classifiers, where diverse ensemble methods are combined in a unified system by incorporating different ensembles at a lower tier as an integral part of another ensemble at the top tier. Our novel contribution is to investigate the possibility and effectiveness of combining diverse ensemble methods into one large multi-tier ensemble for the example of detection and filtering of phishing emails. Our study handled a few essential ensemble methods and more recent approaches incorporated into a combined multi-tier ensemble classifier. The results show that new large multi-tier ensemble classifiers achieved better performance compared with the outcomes of the base classifiers and ensemble classifiers incorporated in the multi-tier system. This demonstrates that the new method of combining diverse ensembles into one unified multi-tier ensemble can be applied to increase the performance of classifiers if diverse ensembles are incorporated in the system.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper is devoted to empirical investigation of novel multi-level ensemble meta classifiers for the detection and monitoring of progression of cardiac autonomic neuropathy, CAN, in diabetes patients. Our experiments relied on an extensive database and concentrated on ensembles of ensembles, or multi-level meta classifiers, for the classification of cardiac autonomic neuropathy progression. First, we carried out a thorough investigation comparing the performance of various base classifiers for several known sets of the most essential features in this database and determined that Random Forest significantly and consistently outperforms all other base classifiers in this new application. Second, we used feature selection and ranking implemented in Random Forest. It was able to identify a new set of features, which has turned out better than all other sets considered for this large and well-known database previously. Random Forest remained the very best classier for the new set of features too. Third, we investigated meta classifiers and new multi-level meta classifiers based on Random Forest, which have improved its performance. The results obtained show that novel multi-level meta classifiers achieved further improvement and obtained new outcomes that are significantly better compared with the outcomes published in the literature previously for cardiac autonomic neuropathy.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper introduces and investigates large iterative multitier ensemble (LIME) classifiers specifically tailored for big data. These classifiers are very large, but are quite easy to generate and use. They can be so large that it makes sense to use them only for big data. They are generated automatically as a result of several iterations in applying ensemble meta classifiers. They incorporate diverse ensemble meta classifiers into several tiers simultaneously and combine them into one automatically generated iterative system so that many ensemble meta classifiers function as integral parts of other ensemble meta classifiers at higher tiers. In this paper, we carry out a comprehensive investigation of the performance of LIME classifiers for a problem concerning security of big data. Our experiments compare LIME classifiers with various base classifiers and standard ordinary ensemble meta classifiers. The results obtained demonstrate that LIME classifiers can significantly increase the accuracy of classifications. LIME classifiers performed better than the base classifiers and standard ensemble meta classifiers.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The aim of the present study is to define an optimally performing computer-aided diagnosis (CAD) architecture for the classification of liver tissue from non-enhanced computed tomography (CT) images into normal liver (C1), hepatic cyst (C2), hemangioma (C3), and hepatocellular carcinoma (C4). To this end, various CAD architectures, based on texture features and ensembles of classifiers (ECs), are comparatively assessed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the first part of this paper we reviewed the fingerprint classification literature from two different perspectives: the feature extraction and the classifier learning. Aiming at answering the question of which among the reviewed methods would perform better in a real implementation we end up in a discussion which showed the difficulty in answering this question. No previous comparison exists in the literature and comparisons among papers are done with different experimental frameworks. Moreover, the difficulty in implementing published methods was stated due to the lack of details in their description, parameters and the fact that no source code is shared. For this reason, in this paper we will go through a deep experimental study following the proposed double perspective. In order to do so, we have carefully implemented some of the most relevant feature extraction methods according to the explanations found in the corresponding papers and we have tested their performance with different classifiers, including those specific proposals made by the authors. Our aim is to develop an objective experimental study in a common framework, which has not been done before and which can serve as a baseline for future works on the topic. This way, we will not only test their quality, but their reusability by other researchers and will be able to indicate which proposals could be considered for future developments. Furthermore, we will show that combining different feature extraction models in an ensemble can lead to a superior performance, significantly increasing the results obtained by individual models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In many domains when we have several competing classifiers available we want to synthesize them or some of them to get a more accurate classifier by a combination function. In this paper we propose a ‘class-indifferent’ method for combining classifier decisions represented by evidential structures called triplet and quartet, using Dempster's rule of combination. This method is unique in that it distinguishes important elements from the trivial ones in representing classifier decisions, makes use of more information than others in calculating the support for class labels and provides a practical way to apply the theoretically appealing Dempster–Shafer theory of evidence to the problem of ensemble learning. We present a formalism for modelling classifier decisions as triplet mass functions and we establish a range of formulae for combining these mass functions in order to arrive at a consensus decision. In addition we carry out a comparative study with the alternatives of simplet and dichotomous structure and also compare two combination methods, Dempster's rule and majority voting, over the UCI benchmark data, to demonstrate the advantage our approach offers. (A continuation of the work in this area that was published in IEEE Trans on KDE, and conferences)

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One of the most popular techniques of generating classifier ensembles is known as stacking which is based on a meta-learning approach. In this paper, we introduce an alternative method to stacking which is based on cluster analysis. Similar to stacking, instances from a validation set are initially classified by all base classifiers. The output of each classifier is subsequently considered as a new attribute of the instance. Following this, a validation set is divided into clusters according to the new attributes and a small subset of the original attributes of the instances. For each cluster, we find its centroid and calculate its class label. The collection of centroids is considered as a meta-classifier. Experimental results show that the new method outperformed all benchmark methods, namely Majority Voting, Stacking J48, Stacking LR, AdaBoost J48, and Random Forest, in 12 out of 22 data sets. The proposed method has two advantageous properties: it is very robust to relatively small training sets and it can be applied in semi-supervised learning problems. We provide a theoretical investigation regarding the proposed method. This demonstrates that for the method to be successful, the base classifiers applied in the ensemble should have greater than 50% accuracy levels.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Generally classifiers tend to overfit if there is noise in the training data or there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy. Most ensemble learning approaches aim to improve the classification accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner for classification aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. However, Random Prism suffers like any ensemble learner from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest sized datasets may impose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.