18 resultados para Multiclass classification problems
em CentAUR: Central Archive University of Reading - UK
Resumo:
We extend extreme learning machine (ELM) classifiers to complex Reproducing Kernel Hilbert Spaces (RKHS) where the input/output variables as well as the optimization variables are complex-valued. A new family of classifiers, called complex-valued ELM (CELM) suitable for complex-valued multiple-input–multiple-output processing is introduced. In the proposed method, the associated Lagrangian is computed using induced RKHS kernels, adopting a Wirtinger calculus approach formulated as a constrained optimization problem similarly to the conventional ELM classifier formulation. When training the CELM, the Karush–Khun–Tuker (KKT) theorem is used to solve the dual optimization problem that consists of satisfying simultaneously smallest training error as well as smallest norm of output weights criteria. The proposed formulation also addresses aspects of quaternary classification within a Clifford algebra context. For 2D complex-valued inputs, user-defined complex-coupled hyper-planes divide the classifier input space into four partitions. For 3D complex-valued inputs, the formulation generates three pairs of complex-coupled hyper-planes through orthogonal projections. The six hyper-planes then divide the 3D space into eight partitions. It is shown that the CELM problem formulation is equivalent to solving six real-valued ELM tasks, which are induced by projecting the chosen complex kernel across the different user-defined coordinate planes. A classification example of powdered samples on the basis of their terahertz spectral signatures is used to demonstrate the advantages of the CELM classifiers compared to their SVM counterparts. The proposed classifiers retain the advantages of their ELM counterparts, in that they can perform multiclass classification with lower computational complexity than SVM classifiers. Furthermore, because of their ability to perform classification tasks fast, the proposed formulations are of interest to real-time applications.
Resumo:
We consider a fully complex-valued radial basis function (RBF) network for regression and classification applications. For regression problems, the locally regularised orthogonal least squares (LROLS) algorithm aided with the D-optimality experimental design, originally derived for constructing parsimonious real-valued RBF models, is extended to the fully complex-valued RBF (CVRBF) network. Like its real-valued counterpart, the proposed algorithm aims to achieve maximised model robustness and sparsity by combining two effective and complementary approaches. The LROLS algorithm alone is capable of producing a very parsimonious model with excellent generalisation performance while the D-optimality design criterion further enhances the model efficiency and robustness. By specifying an appropriate weighting for the D-optimality cost in the combined model selecting criterion, the entire model construction procedure becomes automatic. An example of identifying a complex-valued nonlinear channel is used to illustrate the regression application of the proposed fully CVRBF network. The proposed fully CVRBF network is also applied to four-class classification problems that are typically encountered in communication systems. A complex-valued orthogonal forward selection algorithm based on the multi-class Fisher ratio of class separability measure is derived for constructing sparse CVRBF classifiers that generalise well. The effectiveness of the proposed algorithm is demonstrated using the example of nonlinear beamforming for multiple-antenna aided communication systems that employ complex-valued quadrature phase shift keying modulation scheme. (C) 2007 Elsevier B.V. All rights reserved.
Resumo:
A fundamental principle in practical nonlinear data modeling is the parsimonious principle of constructing the minimal model that explains the training data well. Leave-one-out (LOO) cross validation is often used to estimate generalization errors by choosing amongst different network architectures (M. Stone, "Cross validatory choice and assessment of statistical predictions", J. R. Stast. Soc., Ser. B, 36, pp. 117-147, 1974). Based upon the minimization of LOO criteria of either the mean squares of LOO errors or the LOO misclassification rate respectively, we present two backward elimination algorithms as model post-processing procedures for regression and classification problems. The proposed backward elimination procedures exploit an orthogonalization procedure to enable the orthogonality between the subspace as spanned by the pruned model and the deleted regressor. Subsequently, it is shown that the LOO criteria used in both algorithms can be calculated via some analytic recursive formula, as derived in this contribution, without actually splitting the estimation data set so as to reduce computational expense. Compared to most other model construction methods, the proposed algorithms are advantageous in several aspects; (i) There are no tuning parameters to be optimized through an extra validation data set; (ii) The procedure is fully automatic without an additional stopping criteria; and (iii) The model structure selection is directly based on model generalization performance. The illustrative examples on regression and classification are used to demonstrate that the proposed algorithms are viable post-processing methods to prune a model to gain extra sparsity and improved generalization.
Resumo:
This contribution proposes a powerful technique for two-class imbalanced classification problems by combining the synthetic minority over-sampling technique (SMOTE) and the particle swarm optimisation (PSO) aided radial basis function (RBF) classifier. In order to enhance the significance of the small and specific region belonging to the positive class in the decision region, the SMOTE is applied to generate synthetic instances for the positive class to balance the training data set. Based on the over-sampled training data, the RBF classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier's structure and the parameters of RBF kernels are determined using a PSO algorithm based on the criterion of minimising the leave-one-out misclassification rate. The experimental results obtained on a simulated imbalanced data set and three real imbalanced data sets are presented to demonstrate the effectiveness of our proposed algorithm.
Resumo:
A two-stage linear-in-the-parameter model construction algorithm is proposed aimed at noisy two-class classification problems. The purpose of the first stage is to produce a prefiltered signal that is used as the desired output for the second stage which constructs a sparse linear-in-the-parameter classifier. The prefiltering stage is a two-level process aimed at maximizing a model's generalization capability, in which a new elastic-net model identification algorithm using singular value decomposition is employed at the lower level, and then, two regularization parameters are optimized using a particle-swarm-optimization algorithm at the upper level by minimizing the leave-one-out (LOO) misclassification rate. It is shown that the LOO misclassification rate based on the resultant prefiltered signal can be analytically computed without splitting the data set, and the associated computational cost is minimal due to orthogonality. The second stage of sparse classifier construction is based on orthogonal forward regression with the D-optimality algorithm. Extensive simulations of this approach for noisy data sets illustrate the competitiveness of this approach to classification of noisy data problems.
Resumo:
We propose a new class of neurofuzzy construction algorithms with the aim of maximizing generalization capability specifically for imbalanced data classification problems based on leave-one-out (LOO) cross validation. The algorithms are in two stages, first an initial rule base is constructed based on estimating the Gaussian mixture model with analysis of variance decomposition from input data; the second stage carries out the joint weighted least squares parameter estimation and rule selection using orthogonal forward subspace selection (OFSS)procedure. We show how different LOO based rule selection criteria can be incorporated with OFSS, and advocate either maximizing the leave-one-out area under curve of the receiver operating characteristics, or maximizing the leave-one-out Fmeasure if the data sets exhibit imbalanced class distribution. Extensive comparative simulations illustrate the effectiveness of the proposed algorithms.
Resumo:
This contribution proposes a novel probability density function (PDF) estimation based over-sampling (PDFOS) approach for two-class imbalanced classification problems. The classical Parzen-window kernel function is adopted to estimate the PDF of the positive class. Then according to the estimated PDF, synthetic instances are generated as the additional training data. The essential concept is to re-balance the class distribution of the original imbalanced data set under the principle that synthetic data sample follows the same statistical properties. Based on the over-sampled training data, the radial basis function (RBF) classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier’s structure and the parameters of RBF kernels are determined using a particle swarm optimisation algorithm based on the criterion of minimising the leave-one-out misclassification rate. The effectiveness of the proposed PDFOS approach is demonstrated by the empirical study on several imbalanced data sets.
Resumo:
Self-Organizing Map (SOM) algorithm has been extensively used for analysis and classification problems. For this kind of problems, datasets become more and more large and it is necessary to speed up the SOM learning. In this paper we present an application of the Simulated Annealing (SA) procedure to the SOM learning algorithm. The goal of the algorithm is to obtain fast learning and better performance in terms of matching of input data and regularity of the obtained map. An advantage of the proposed technique is that it preserves the simplicity of the basic algorithm. Several tests, carried out on different large datasets, demonstrate the effectiveness of the proposed algorithm in comparison with the original SOM and with some of its modification introduced to speed-up the learning.
Resumo:
Hidden Markov Models (HMMs) have been successfully applied to different modelling and classification problems from different areas over the recent years. An important step in using HMMs is the initialisation of the parameters of the model as the subsequent learning of HMM’s parameters will be dependent on these values. This initialisation should take into account the knowledge about the addressed problem and also optimisation techniques to estimate the best initial parameters given a cost function, and consequently, to estimate the best log-likelihood. This paper proposes the initialisation of Hidden Markov Models parameters using the optimisation algorithm Differential Evolution with the aim to obtain the best log-likelihood.
Resumo:
In this paper a modified algorithm is suggested for developing polynomial neural network (PNN) models. Optimal partial description (PD) modeling is introduced at each layer of the PNN expansion, a task accomplished using the orthogonal least squares (OLS) method. Based on the initial PD models determined by the polynomial order and the number of PD inputs, OLS selects the most significant regressor terms reducing the output error variance. The method produces PNN models exhibiting a high level of accuracy and superior generalization capabilities. Additionally, parsimonious models are obtained comprising a considerably smaller number of parameters compared to the ones generated by means of the conventional PNN algorithm. Three benchmark examples are elaborated, including modeling of the gas furnace process as well as the iris and wine classification problems. Extensive simulation results and comparison with other methods in the literature, demonstrate the effectiveness of the suggested modeling approach.
Resumo:
In The Conduct of Inquiry in International Relations, Patrick Jackson situates methodologies in International Relations in relation to their underlying philosophical assumptions. One of his aims is to map International Relations debates in a way that ‘capture[s] current controversies’ (p. 40). This ambition is overstated: whilst Jackson’s typology is useful as a clarificatory tool, (re)classifying existing scholarship in International Relations is more problematic. One problem with Jackson’s approach is that he tends to run together the philosophical assumptions which decisively differentiate his methodologies (by stipulating a distinctive warrant for knowledge claims) and the explanatory strategies that are employed to generate such knowledge claims, suggesting that the latter are entailed by the former. In fact, the explanatory strategies which Jackson associates with each methodology reflect conventional practice in International Relations just as much as they reflect philosophical assumptions. This makes it more difficult to identify each methodology at work than Jackson implies. I illustrate this point through a critical analysis of Jackson’s controversial reclassification of Waltz as an analyticist, showing that whilst Jackson’s typology helps to expose inconsistencies in Waltz’s approach, it does not fully support the proposed reclassification. The conventional aspect of methodologies in International Relations also raises questions about the limits of Jackson’s ‘engaged pluralism’.
Resumo:
This work analyzes the use of linear discriminant models, multi-layer perceptron neural networks and wavelet networks for corporate financial distress prediction. Although simple and easy to interpret, linear models require statistical assumptions that may be unrealistic. Neural networks are able to discriminate patterns that are not linearly separable, but the large number of parameters involved in a neural model often causes generalization problems. Wavelet networks are classification models that implement nonlinear discriminant surfaces as the superposition of dilated and translated versions of a single "mother wavelet" function. In this paper, an algorithm is proposed to select dilation and translation parameters that yield a wavelet network classifier with good parsimony characteristics. The models are compared in a case study involving failed and continuing British firms in the period 1997-2000. Problems associated with over-parameterized neural networks are illustrated and the Optimal Brain Damage pruning technique is employed to obtain a parsimonious neural model. The results, supported by a re-sampling study, show that both neural and wavelet networks may be a valid alternative to classical linear discriminant models.
Resumo:
Real-world text classification tasks often suffer from poor class structure with many overlapping classes and blurred boundaries. Training data pooled from multiple sources tend to be inconsistent and contain erroneous labelling, leading to poor performance of standard text classifiers. The classification of health service products to specialized procurement classes is used to examine and quantify the extent of these problems. A novel method is presented to analyze the labelled data by selectively merging classes where there is not enough information for the classifier to distinguish them. Initial results show the method can identify the most problematic classes, which can be used either as a focus to improve the training data or to merge classes to increase confidence in the predicted results of the classifier.
Resumo:
The combination of the synthetic minority oversampling technique (SMOTE) and the radial basis function (RBF) classifier is proposed to deal with classification for imbalanced two-class data. In order to enhance the significance of the small and specific region belonging to the positive class in the decision region, the SMOTE is applied to generate synthetic instances for the positive class to balance the training data set. Based on the over-sampled training data, the RBF classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier structure and the parameters of RBF kernels are determined using a particle swarm optimization algorithm based on the criterion of minimizing the leave-one-out misclassification rate. The experimental results on both simulated and real imbalanced data sets are presented to demonstrate the effectiveness of our proposed algorithm.
Resumo:
In this paper we explore classification techniques for ill-posed problems. Two classes are linearly separable in some Hilbert space X if they can be separated by a hyperplane. We investigate stable separability, i.e. the case where we have a positive distance between two separating hyperplanes. When the data in the space Y is generated by a compact operator A applied to the system states ∈ X, we will show that in general we do not obtain stable separability in Y even if the problem in X is stably separable. In particular, we show this for the case where a nonlinear classification is generated from a non-convergent family of linear classes in X. We apply our results to the problem of quality control of fuel cells where we classify fuel cells according to their efficiency. We can potentially classify a fuel cell using either some external measured magnetic field or some internal current. However we cannot measure the current directly since we cannot access the fuel cell in operation. The first possibility is to apply discrimination techniques directly to the measured magnetic fields. The second approach first reconstructs currents and then carries out the classification on the current distributions. We show that both approaches need regularization and that the regularized classifications are not equivalent in general. Finally, we investigate a widely used linear classification algorithm Fisher's linear discriminant with respect to its ill-posedness when applied to data generated via a compact integral operator. We show that the method cannot stay stable when the number of measurement points becomes large.