921 results for Classifier Combination Systems
Abstract:
This thesis presents a promising boundary-setting method for solving challenging issues in text classification and producing an effective text classifier. A classifier must identify the boundary between classes optimally; however, even after the features are selected, the boundary remains unclear where positive and negative documents are mixed. A classifier combination method for boosting the effectiveness of the classification model is also presented. The experiments carried out in the study demonstrate that the proposed classifier is promising.
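The abstract does not spell out the boundary-setting algorithm, so the following is only a minimal sketch of the underlying idea: placing a decision threshold between overlapping positive and negative document scores so as to minimize training error. The scoring model and all names are hypothetical.

import numpy as np

def fit_boundary(pos_scores, neg_scores):
    """Pick the threshold between positive and negative training scores
    that minimizes training error (illustrative only; the thesis's
    actual boundary-setting method is not specified in the abstract)."""
    scores = np.concatenate([pos_scores, neg_scores])
    labels = np.concatenate([np.ones_like(pos_scores),
                             np.zeros_like(neg_scores)])
    best_t, best_err = scores[0], np.inf
    for t in np.unique(scores):
        err = np.mean((scores >= t) != labels)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Toy usage: overlapping score distributions for the two classes.
rng = np.random.default_rng(0)
pos = rng.normal(1.0, 0.7, 200)    # scores of positive documents
neg = rng.normal(-1.0, 0.7, 200)   # scores of negative documents
print(f"learned threshold: {fit_boundary(pos, neg):.3f}")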
Abstract:
In systems that combine the outputs of classification methods (combination systems), such as ensembles and multi-agent systems, one of the main constraints is that the base components (classifiers or agents) should be diverse among themselves. In other words, there is clearly no accuracy gain in a system composed of a set of identical base components. One way of increasing diversity is through the use of feature selection or data distribution methods in combination systems. In this work, we investigate the impact of using data distribution methods among the components of combination systems. Different data distribution methods are used, and the combination systems are analyzed under several different configurations. The aim of this analysis is to detect which combination systems are best suited to distributing features among their components.
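The abstract does not name the specific distribution methods compared. One common scheme, giving each base classifier a random feature subset and combining by majority vote, might be sketched as below; scikit-learn, its breast-cancer dataset, decision trees, and the subset sizes are assumptions for illustration only.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Distribute random feature subsets among base classifiers to induce
# diversity -- one of the distribution schemes the abstract alludes to.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
n_components, n_feats = 5, X.shape[1]
subsets = [rng.choice(n_feats, size=n_feats // 2, replace=False)
           for _ in range(n_components)]

members = []
for idx in subsets:
    clf = DecisionTreeClassifier(random_state=0).fit(X_tr[:, idx], y_tr)
    members.append((clf, idx))

# Majority vote over the component decisions.
votes = np.stack([clf.predict(X_te[:, idx]) for clf, idx in members])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("ensemble accuracy:", (ensemble_pred == y_te).mean())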
Abstract:
Multi-classifier systems, also known as ensembles, have been widely used to solve several problems because they often perform better than the individual classifiers that compose them. For this to happen, however, the base classifiers must be both accurate and diverse among themselves, a requirement known as the diversity/accuracy dilemma. Given its importance, some works have investigated ensemble behavior in the context of this dilemma. However, most of them address homogeneous ensembles, i.e., ensembles composed of only one type of classifier. Motivated by this limitation, this thesis uses genetic algorithms to perform a detailed study of the diversity/accuracy dilemma for heterogeneous ensembles.
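The thesis's GA formulation is not detailed in the abstract. A toy version, in which a bit-string chromosome selects members from a heterogeneous pool and the fitness blends validation accuracy with a pairwise-disagreement diversity term, could look like the sketch below; the weight alpha, the operators, and the simulated member predictions are all illustrative assumptions.

import numpy as np

def disagreement(preds):
    """Mean pairwise disagreement among member predictions (a simple
    diversity proxy; the thesis may use other measures)."""
    m = len(preds)
    pairs = [(preds[i] != preds[j]).mean()
             for i in range(m) for j in range(i + 1, m)]
    return np.mean(pairs) if pairs else 0.0

def fitness(mask, preds, y, alpha=0.8):
    """Blend majority-vote accuracy with diversity; alpha is illustrative."""
    chosen = [p for p, bit in zip(preds, mask) if bit]
    if not chosen:
        return 0.0
    vote = (np.mean(chosen, axis=0) >= 0.5).astype(int)
    return alpha * (vote == y).mean() + (1 - alpha) * disagreement(chosen)

# Toy heterogeneous pool: simulated 0/1 predictions of 8 classifiers,
# each correct with probability 0.75.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 300)
pool = [np.where(rng.random(300) < 0.75, y, 1 - y) for _ in range(8)]

# Minimal GA: truncation selection, uniform crossover, bit-flip mutation.
pop = rng.integers(0, 2, (20, len(pool)))
for _ in range(30):
    scores = np.array([fitness(ind, pool, y) for ind in pop])
    parents = pop[np.argsort(scores)[::-1][:10]]
    children = []
    for _ in range(20):
        a, b = parents[rng.integers(0, 10, 2)]
        child = np.where(rng.random(len(pool)) < 0.5, a, b)  # crossover
        flip = rng.random(len(pool)) < 0.05                  # mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.array(children)

best = max(pop, key=lambda ind: fitness(ind, pool, y))
print("selected members:", np.flatnonzero(best))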
Abstract:
Committees of classifiers can be used to improve the accuracy of classification systems; in other words, different classifiers applied to the same problem can be combined to create a system of greater accuracy, called a committee of classifiers. For this to succeed, the classifiers must make mistakes on different objects of the problem, so that the errors of one classifier are outvoted by the other, correct classifiers when the committee's combination method is applied. This property of classifiers erring on different objects is called diversity. However, most diversity measures do not capture this requirement directly. Recently, two diversity measures (good and bad diversity) were proposed with the aim of helping to generate more accurate committees. This work performs an experimental analysis of these measures applied directly to the construction of committees of classifiers. The adopted construction method is modeled as a search problem over the feature sets of the problem's databases and over the set of committee members, in order to find the committee of classifiers that produces the most accurate classification. This problem is solved by metaheuristic optimization techniques, in their mono- and multi-objective versions. Analyses are performed to verify whether using or adding the good and bad diversity measures as optimization objectives creates more accurate committees. Thus, the contribution of this study is to determine whether good and bad diversity can be used in mono-objective and multi-objective optimization techniques as objectives for building committees of classifiers that are more accurate than those built by the same process using only classification accuracy as the optimization objective.
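Assuming the two measures are the good/bad decomposition of majority-vote error (in which majority-vote error equals average individual error minus good diversity plus bad diversity: dissent on objects the committee gets right is "good", wasted correct minority votes on objects it gets wrong are "bad"), a direct computation can be sketched as follows; the simulated votes are illustrative.

import numpy as np

def good_bad_diversity(votes, y):
    """Decompose majority-vote error as
        E_maj = E_avg - good_diversity + bad_diversity,
    where `votes` is an (L, N) array of 0/1 member predictions.
    (Assumed to match the good/bad diversity measures the text cites.)"""
    L, N = votes.shape
    wrong = (votes != y)                          # (L, N) member errors
    maj_wrong = wrong.mean(axis=0) > 0.5          # majority-vote errors
    e_avg = wrong.mean()
    good = wrong[:, ~maj_wrong].sum() / (L * N)   # dissent where vote is right
    bad = (~wrong)[:, maj_wrong].sum() / (L * N)  # wasted correct minority votes
    return e_avg, good, bad, maj_wrong.mean()

# Toy committee of 7 members, each correct with probability 0.7.
rng = np.random.default_rng(2)
y = rng.integers(0, 2, 1000)
votes = np.where(rng.random((7, 1000)) < 0.7, y, 1 - y)
e_avg, good, bad, e_maj = good_bad_diversity(votes, y)
print(f"E_maj={e_maj:.3f}  E_avg-good+bad={e_avg - good + bad:.3f}")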
Abstract:
In this paper we present a novel approach for multispectral image contextual classification by combining iterative combinatorial optimization algorithms. The pixel-wise decision rule is defined using a Bayesian approach to combine two MRF models: a Gaussian Markov Random Field (GMRF) for the observations (likelihood) and a Potts model for the a priori knowledge, to regularize the solution in the presence of noisy data. Hence, the classification problem is stated according to a Maximum a Posteriori (MAP) framework. In order to approximate the MAP solution, we apply several combinatorial optimization methods using multiple simultaneous initializations, making the solution less sensitive to the initial conditions and reducing both computational cost and time in comparison to Simulated Annealing, which is often unfeasible in many real image processing applications. The Markov Random Field model parameters are estimated by a Maximum Pseudo-Likelihood (MPL) approach, avoiding manual adjustment of the regularization parameters. Asymptotic evaluations assess the accuracy of the proposed parameter estimation procedure. To test and evaluate the proposed classification method, we adopt metrics for quantitative performance assessment (Cohen's Kappa coefficient), allowing a robust and accurate statistical analysis. The obtained results clearly show that combining sub-optimal contextual algorithms significantly improves the classification performance, indicating the effectiveness of the proposed methodology.
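The full pipeline (GMRF likelihood, MPL parameter estimation, several optimizers run from multiple initializations) is beyond a short snippet, but the core MAP iteration can be illustrated with an ICM-style sweep, one of the sub-optimal combinatorial algorithms such work typically combines. In this sketch an isotropic Gaussian stands in for the paper's GMRF, and beta is fixed by hand rather than estimated by MPL.

import numpy as np

def icm_sweep(obs, labels, means, sigma, beta, n_iter=5):
    """Iterated Conditional Modes for MAP labeling with a Gaussian
    likelihood and a Potts prior on the 4-neighborhood.
    (Isotropic Gaussian stands in for the paper's GMRF; beta is set
    by hand here instead of by maximum pseudo-likelihood.)"""
    H, W = obs.shape
    K = len(means)
    for _ in range(n_iter):
        for i in range(H):
            for j in range(W):
                nbrs = [labels[a, b]
                        for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                        if 0 <= a < H and 0 <= b < W]
                # Energy = negative log-likelihood + Potts penalty.
                energies = [(obs[i, j] - means[k]) ** 2 / (2 * sigma ** 2)
                            + beta * sum(n != k for n in nbrs)
                            for k in range(K)]
                labels[i, j] = int(np.argmin(energies))
    return labels

# Toy two-class image: noisy left/right halves with means 0 and 1.
rng = np.random.default_rng(3)
truth = np.zeros((32, 32), int); truth[:, 16:] = 1
obs = truth + rng.normal(0, 0.8, truth.shape)
init = (obs > 0.5).astype(int)                # pixel-wise ML initialization
out = icm_sweep(obs, init.copy(), means=[0.0, 1.0], sigma=0.8, beta=1.0)
print("error rate:", (out != truth).mean())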
Abstract:
Binary image classification is a problem that has received much attention in recent years. In this paper we evaluate a selection of popular techniques in an effort to find a feature-set/classifier combination that generalizes well to full-resolution image data. We then apply that system to images at one-half through one-sixteenth resolution and consider the corresponding error rates. In addition, we observe how generalization performance depends on the number of training images, and lastly we compare the system's best error rates to those of a human performing an identical classification task on the same set of test images.
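The abstract does not name the feature sets or classifiers evaluated; the resolution experiment itself amounts to measuring error as images are progressively downsampled, roughly as below. Scikit-learn's 8x8 digits and an SVM stand in for the paper's data and system, and the classifier is retrained at each resolution, which may differ from the paper's protocol.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def downsample(imgs, factor):
    """Block-average each image by `factor` along both axes."""
    n, h, w = imgs.shape
    return imgs[:, :h - h % factor, :w - w % factor] \
        .reshape(n, h // factor, factor, w // factor, factor).mean((2, 4))

# The 8x8 digits only allow factor-2 and factor-4 reductions, standing in
# for the paper's one-half through one-sixteenth resolutions.
X, y = load_digits(return_X_y=True)
imgs = X.reshape(-1, 8, 8)
tr, te = train_test_split(np.arange(len(y)), random_state=0)

for f in (1, 2, 4):
    lo = downsample(imgs, f).reshape(len(imgs), -1)
    clf = SVC().fit(lo[tr], y[tr])
    print(f"1/{f} resolution -> error {(clf.predict(lo[te]) != y[te]).mean():.3f}")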
Abstract:
Biometrics deals with the physiological and behavioral characteristics of an individual to establish identity. Fingerprint-based authentication is the most advanced biometric authentication technology. The minutiae-based fingerprint identification method offers a reasonable identification rate. The minutiae feature map consists of about 70-100 minutia points, and matching accuracy drops as the database grows. Hence it is essential to make the fingerprint feature code as small as possible, so that identification becomes easier. In this research, a novel global-singularity-based fingerprint representation is proposed. The fingerprint baseline, the line between the distal and intermediate phalanges in the fingerprint, is taken as the reference line. A polygon is formed from the singularities and the fingerprint baseline. The feature vector comprises the polygon's angles, sides, area, and type, and the ridge counts between the singularities. A 100% recognition rate is achieved with this method. The method is compared with the conventional minutiae-based recognition method in terms of computation time, receiver operating characteristic (ROC), and feature vector length. Speech is a behavioral biometric modality and can be used to identify a speaker. In this work, MFCCs of text-dependent speech are computed and clustered using the k-means algorithm. A backpropagation-based artificial neural network is trained to identify the clustered speech code. The performance of the neural network classifier is compared with that of a VQ-based Euclidean-minimum classifier. Biometric systems that use a single modality are usually affected by problems such as noisy sensor data, non-universality and/or lack of distinctiveness of the biometric trait, unacceptable error rates, and spoof attacks. A multi-finger, feature-level fusion based fingerprint recognition system is developed, and its performance is measured in terms of the ROC curve. Score-level fusion of the fingerprint- and speech-based recognition systems is performed, and 100% accuracy is achieved over a considerable range of matching thresholds.
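The thesis's matchers cannot be reproduced from the abstract, but score-level fusion itself is typically a weighted sum of normalized match scores. A sketch with synthetic genuine/impostor scores follows; the modality weight, thresholds, and score distributions are illustrative, not the thesis's values.

import numpy as np

def minmax(s):
    """Min-max normalize match scores to [0, 1]."""
    return (s - s.min()) / (s.max() - s.min())

rng = np.random.default_rng(4)
# Toy match scores: first 500 genuine, last 500 impostor, per modality.
finger = np.concatenate([rng.normal(0.8, 0.10, 500), rng.normal(0.30, 0.10, 500)])
speech = np.concatenate([rng.normal(0.7, 0.15, 500), rng.normal(0.35, 0.15, 500)])
labels = np.r_[np.ones(500, bool), np.zeros(500, bool)]

w = 0.6                                  # illustrative modality weight
fused = w * minmax(finger) + (1 - w) * minmax(speech)

for t in (0.4, 0.5, 0.6):
    far = (fused[~labels] >= t).mean()   # false accept rate
    frr = (fused[labels] < t).mean()     # false reject rate
    print(f"threshold {t:.1f}: FAR={far:.3f}  FRR={frr:.3f}")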
Abstract:
The Optimum-Path Forest (OPF) classifier is a recent and promising method for pattern recognition, with a fast training algorithm and good accuracy results. Therefore, the investigation of a combining method for this kind of classifier can be important for many applications. In this paper we report a fast method to combine OPF-based classifiers trained with disjoint training subsets. Given a fixed number of subsets, the algorithm chooses random samples, without replacement, from the original training set. Each subset's accuracy is improved by a learning procedure. The final decision is given by majority vote. Experiments with simulated and real data sets showed that, under certain conditions, the proposed combining method is more efficient and effective than the naive approach. It was also shown that the OPF training step runs faster on a series of small subsets than on the whole training set. The combining scheme was also designed to support parallel or distributed processing, speeding up the procedure even more.
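OPF has no implementation in the common Python libraries, so k-nearest neighbors stands in for the base learner in the sketch below; the combination scheme itself follows the abstract: split the training set into disjoint random subsets without replacement, train one model per subset, and take a majority vote. The per-subset learning procedure that improves accuracy is omitted, and each subset could be trained in parallel.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# k-NN stands in for the OPF classifier; the combination follows the
# abstract: disjoint random subsets, one model each, majority vote.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(5)
n_subsets = 5
perm = rng.permutation(len(y_tr))
folds = np.array_split(perm, n_subsets)       # disjoint, no replacement

models = [KNeighborsClassifier().fit(X_tr[f], y_tr[f]) for f in folds]
votes = np.stack([m.predict(X_te) for m in models])   # (n_subsets, n_test)

# Majority vote across the members.
maj = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("combined accuracy:", (maj == y_te).mean())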
Abstract:
Classifier selection is a problem encountered by multi-biometric systems that aim to improve performance through fusion of decisions. A particular decision fusion architecture that combines multiple instances (n classifiers) and multiple samples (m attempts at each classifier) has been proposed in previous work to achieve a controlled trade-off between false alarms and false rejects. Although analysis on text-dependent speaker verification has demonstrated better performance for fusion of decisions with favourable dependence compared to statistically independent decisions, the performance is not always optimal. Given a pool of instances, best performance with this architecture is obtained for a certain combination of instances. Heuristic rules and diversity measures have been commonly used for classifier selection, but it is shown that optimal performance is achieved with the 'best combination performance' rule. As the search complexity of this rule increases exponentially with the addition of classifiers, a measure - the sequential error ratio (SER) - is proposed in this work that is specifically adapted to the characteristics of the sequential fusion architecture. The proposed measure can be used to select the classifier that is most likely to produce a correct decision at each stage. Error rates for fusion of text-dependent HMM-based speaker models using SER are compared with other classifier selection methodologies. SER is shown to achieve near-optimal performance for sequential fusion of multiple instances with or without the use of multiple samples. The methodology applies to multiple speech utterances for telephone- or internet-based access control, and to other systems such as multiple-fingerprint and multiple-handwriting-sample based identity verification systems.
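The exact operating rules of the n-by-m architecture are not given in the abstract. The sketch below assumes a simplified version: an instance accepts if any of its m attempts clears a threshold, the claim is accepted only if every instance accepts, and instances are consulted in a fixed order standing in for an SER-based ranking. Thresholds and score models are illustrative.

import numpy as np

def sequential_fusion(scores, inst_order, t=0.5):
    """Sequential decision fusion over n instances x m samples.
    Accept if every instance accepts; an instance accepts as soon as
    one of its m attempts clears the threshold. (Simplified placeholder
    for the paper's architecture; `inst_order` would come from ranking
    instances by the proposed sequential error ratio.)"""
    for i in inst_order:                    # consult instances in SER order
        if not np.any(scores[i] >= t):      # all m attempts rejected
            return False                    # early reject ends the sequence
    return True

rng = np.random.default_rng(6)
n, m = 3, 2
genuine = rng.normal(0.7, 0.2, (1000, n, m))
impostor = rng.normal(0.3, 0.2, (1000, n, m))

order = range(n)                            # stand-in for SER-based ranking
frr = np.mean([not sequential_fusion(s, order) for s in genuine])
far = np.mean([sequential_fusion(s, order) for s in impostor])
print(f"FRR={frr:.3f}  FAR={far:.3f}")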
Abstract:
The use of electroacoustic analogies suggests that a source of acoustical energy (such as an engine, compressor, blower, turbine, loudspeaker, etc.) can be characterized by an acoustic source pressure p_s and internal source impedance Z_s, analogous to the open-circuit voltage and internal impedance of an electrical source. The present paper shows analytically that the source characteristics evaluated by means of the indirect methods are independent of the loads selected; that is, the evaluated values of p_s and Z_s are unique, and the results of the different methods (including the direct method) are identical. In addition, general relations have been derived here for the transfer of source characteristics from one station to another across one or more acoustical elements, and also for combining several sources into a single equivalent source. Finally, all the conclusions are extended to the case of a uniformly moving medium, incorporating the convective as well as dissipative effects of the mean flow.
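The general transfer relations themselves are not reproduced in the abstract. By the electrical analogy they presumably take the familiar Thevenin form: for an acoustical element with transfer-matrix entries A, B, C, D (so that p_1 = A p_2 + B v_2 and v_1 = C p_2 + D v_2), a source (p_s, Z_s) at station 1 appears at station 2 as

\[
  p_s' = \frac{p_s}{A + C\,Z_s}, \qquad
  Z_s' = \frac{B + D\,Z_s}{A + C\,Z_s},
\]

while two sources combined into a single equivalent source (assuming a parallel connection, by the Norton analogy) give

\[
  Z_{eq} = \left(\frac{1}{Z_{s1}} + \frac{1}{Z_{s2}}\right)^{-1}, \qquad
  p_{eq} = Z_{eq}\left(\frac{p_{s1}}{Z_{s1}} + \frac{p_{s2}}{Z_{s2}}\right).
\]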