840 resultados para binary to multi-class classifiers
Resumo:
Membrane proteins play important roles in many biochemical processes and are also attractive targets of drug discovery for various diseases. The elucidation of membrane protein types provides clues for understanding the structure and function of proteins. Recently we developed a novel system for predicting protein subnuclear localizations. In this paper, we propose a simplified version of our system for predicting membrane protein types directly from primary protein structures, which incorporates amino acid classifications and physicochemical properties into a general form of pseudo-amino acid composition. In this simplified system, we will design a two-stage multi-class support vector machine combined with a two-step optimal feature selection process, which proves very effective in our experiments. The performance of the present method is evaluated on two benchmark datasets consisting of five types of membrane proteins. The overall accuracies of prediction for five types are 93.25% and 96.61% via the jackknife test and independent dataset test, respectively. These results indicate that our method is effective and valuable for predicting membrane protein types. A web server for the proposed method is available at http://www.juemengt.com/jcc/memty_page.php
Resumo:
Age estimation from facial images is increasingly receiving attention to solve age-based access control, age-adaptive targeted marketing, amongst other applications. Since even humans can be induced in error due to the complex biological processes involved, finding a robust method remains a research challenge today. In this paper, we propose a new framework for the integration of Active Appearance Models (AAM), Local Binary Patterns (LBP), Gabor wavelets (GW) and Local Phase Quantization (LPQ) in order to obtain a highly discriminative feature representation which is able to model shape, appearance, wrinkles and skin spots. In addition, this paper proposes a novel flexible hierarchical age estimation approach consisting of a multi-class Support Vector Machine (SVM) to classify a subject into an age group followed by a Support Vector Regression (SVR) to estimate a specific age. The errors that may happen in the classification step, caused by the hard boundaries between age classes, are compensated in the specific age estimation by a flexible overlapping of the age ranges. The performance of the proposed approach was evaluated on FG-NET Aging and MORPH Album 2 datasets and a mean absolute error (MAE) of 4.50 and 5.86 years was achieved respectively. The robustness of the proposed approach was also evaluated on a merge of both datasets and a MAE of 5.20 years was achieved. Furthermore, we have also compared the age estimation made by humans with the proposed approach and it has shown that the machine outperforms humans. The proposed approach is competitive with current state-of-the-art and it provides an additional robustness to blur, lighting and expression variance brought about by the local phase features.
Resumo:
In competitive combat sporting environments like boxing, the statistics on a boxer's performance, including the amount and type of punches thrown, provide a valuable source of data and feedback which is routinely used for coaching and performance improvement purposes. This paper presents a robust framework for the automatic classification of a boxer's punches. Overhead depth imagery is employed to alleviate challenges associated with occlusions, and robust body-part tracking is developed for the noisy time-of-flight sensors. Punch recognition is addressed through both a multi-class SVM and Random Forest classifiers. A coarse-to-fine hierarchical SVM classifier is presented based on prior knowledge of boxing punches. This framework has been applied to shadow boxing image sequences taken at the Australian Institute of Sport with 8 elite boxers. Results demonstrate the effectiveness of the proposed approach, with the hierarchical SVM classifier yielding a 96% accuracy, signifying its suitability for analysing athletes punches in boxing bouts.
Resumo:
This paper addresses the problem of maximum margin classification given the moments of class conditional densities and the false positive and false negative error rates. Using Chebyshev inequalities, the problem can be posed as a second order cone programming problem. The dual of the formulation leads to a geometric optimization problem, that of computing the distance between two ellipsoids, which is solved by an iterative algorithm. The formulation is extended to non-linear classifiers using kernel methods. The resultant classifiers are applied to the case of classification of unbalanced datasets with asymmetric costs for misclassification. Experimental results on benchmark datasets show the efficacy of the proposed method.
Resumo:
Maximum entropy approach to classification is very well studied in applied statistics and machine learning and almost all the methods that exists in literature are discriminative in nature. In this paper, we introduce a maximum entropy classification method with feature selection for large dimensional data such as text datasets that is generative in nature. To tackle the curse of dimensionality of large data sets, we employ conditional independence assumption (Naive Bayes) and we perform feature selection simultaneously, by enforcing a `maximum discrimination' between estimated class conditional densities. For two class problems, in the proposed method, we use Jeffreys (J) divergence to discriminate the class conditional densities. To extend our method to the multi-class case, we propose a completely new approach by considering a multi-distribution divergence: we replace Jeffreys divergence by Jensen-Shannon (JS) divergence to discriminate conditional densities of multiple classes. In order to reduce computational complexity, we employ a modified Jensen-Shannon divergence (JS(GM)), based on AM-GM inequality. We show that the resulting divergence is a natural generalization of Jeffreys divergence to a multiple distributions case. As far as the theoretical justifications are concerned we show that when one intends to select the best features in a generative maximum entropy approach, maximum discrimination using J-divergence emerges naturally in binary classification. Performance and comparative study of the proposed algorithms have been demonstrated on large dimensional text and gene expression datasets that show our methods scale up very well with large dimensional datasets.
Resumo:
In this paper we present some extensions to the k-means algorithm for vector quantization that permit its efficient use in image segmentation and pattern classification tasks. It is shown that by introducing state variables that correspond to certain statistics of the dynamic behavior of the algorithm, it is possible to find the representative centers fo the lower dimensional maniforlds that define the boundaries between classes, for clouds of multi-dimensional, mult-class data; this permits one, for example, to find class boundaries directly from sparse data (e.g., in image segmentation tasks) or to efficiently place centers for pattern classification (e.g., with local Gaussian classifiers). The same state variables can be used to define algorithms for determining adaptively the optimal number of centers for clouds of data with space-varying density. Some examples of the applicatin of these extensions are also given.
Resumo:
We consider the problem of detecting a large number of different classes of objects in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, at multiple locations and scales. This can be slow and can require a lot of training data, since each classifier requires the computation of many different image features. In particular, for independently trained detectors, the (run-time) computational complexity, and the (training-time) sample complexity, scales linearly with the number of classes to be detected. It seems unlikely that such an approach will scale up to allow recognition of hundreds or thousands of objects. We present a multi-class boosting procedure (joint boosting) that reduces the computational and sample complexity, by finding common features that can be shared across the classes (and/or views). The detectors for each class are trained jointly, rather than independently. For a given performance level, the total number of features required, and therefore the computational cost, is observed to scale approximately logarithmically with the number of classes. The features selected jointly are closer to edges and generic features typical of many natural structures instead of finding specific object parts. Those generic features generalize better and reduce considerably the computational cost of an algorithm for multi-class object detection.
Resumo:
Clare, A. and King R.D. (2002) Machine learning of functional class from phenotype data. Bioinformatics 18(1) 160-166
Resumo:
To support the diverse Quality of Service (QoS) requirements of real-time (e.g. audio/video) applications in integrated services networks, several routing algorithms that allow for the reservation of the needed bandwidth over a Virtual Circuit (VC) established on one of several candidate routes have been proposed. Traditionally, such routing is done using the least-loaded concept, and thus results in balancing the load across the set of candidate routes. In a recent study, we have established the inadequacy of this load balancing practice and proposed the use of load profiling as an alternative. Load profiling techniques allow the distribution of "available" bandwidth across a set of candidate routes to match the characteristics of incoming VC QoS requests. In this paper we thoroughly characterize the performance of VC routing using load profiling and contrast it to routing using load balancing and load packing. We do so both analytically and via extensive simulations of multi-class traffic routing in Virtual Path (VP) based networks. Our findings confirm that for routing guaranteed bandwidth flows in VP networks, load balancing is not desirable as it results in VP bandwidth fragmentation, which adversely affects the likelihood of accepting new VC requests. This fragmentation is more pronounced when the granularity of VC requests is large. Typically, this occurs when a common VC is established to carry the aggregate traffic flow of many high-bandwidth real-time sources. For VP-based networks, our simulation results show that our load-profiling VC routing scheme performs better or as well as the traditional load-balancing VC routing in terms of revenue under both skewed and uniform workloads. Furthermore, load-profiling routing improves routing fairness by proactively increasing the chances of admitting high-bandwidth connections.
Resumo:
This paper proposes a new hierarchical learning structure, namely the holistic triple learning (HTL), for extending the binary support vector machine (SVM) to multi-classification problems. For an N-class problem, a HTL constructs a decision tree up to a depth of A leaf node of the decision tree is allowed to be placed with a holistic triple learning unit whose generalisation abilities are assessed and approved. Meanwhile, the remaining nodes in the decision tree each accommodate a standard binary SVM classifier. The holistic triple classifier is a regression model trained on three classes, whose training algorithm is originated from a recently proposed implementation technique, namely the least-squares support vector machine (LS-SVM). A major novelty with the holistic triple classifier is the reduced number of support vectors in the solution. For the resultant HTL-SVM, an upper bound of the generalisation error can be obtained. The time complexity of training the HTL-SVM is analysed, and is shown to be comparable to that of training the one-versus-one (1-vs.-1) SVM, particularly on small-scale datasets. Empirical studies show that the proposed HTL-SVM achieves competitive classification accuracy with a reduced number of support vectors compared to the popular 1-vs-1 alternative.
Resumo:
‘Housing in Hard Times’ was the theme of the Housing Studies Association annual conference in April 2011. The papers featured in this special issue are drawn from that conference. They examine the uneven impact of economic change on housing policy and related areas, with reference to conceptual ideas pertaining to urban marginality, inequality and class. Whilst the empirical focus of the papers is the UK, their intellectual contribution represents an attempt to ‘bring class back in’ to the housing studies literature and encourage more critical, theoretically informed scholarship.
Resumo:
The troubling concept of class: reflecting on our ‘failure’ to encourage sociology students to re-cognise their classed locations using autobiographical methods Abstract This paper provides a narrative of the four authors‟ commitment to auto/biographical methods as teachers and researchers in „new‟ universities. As they went about their work, they observed that, whereas students engage with the gendered, sexualised and racialised processes when negotiating their identities, they are reluctant or unable to conceptualise „class-ifying‟ processes as key determinants of their life chances. This general inability puzzled the authors, given the students‟ predominantly working-class backgrounds. Through application of their own stories, the authors explore the sociological significance of this pedagogical „failure‟ to account for the troubling concept of class not only in the classroom but also in contemporary society.
Resumo:
Symmetry group methods are applied to obtain all explicit group-invariant radial solutions to a class of semilinear Schr¨odinger equations in dimensions n = 1. Both focusing and defocusing cases of a power nonlinearity are considered, including the special case of the pseudo-conformal power p = 4/n relevant for critical dynamics. The methods involve, first, reduction of the Schr¨odinger equations to group-invariant semilinear complex 2nd order ordinary differential equations (ODEs) with respect to an optimal set of one-dimensional point symmetry groups, and second, use of inherited symmetries, hidden symmetries, and conditional symmetries to solve each ODE by quadratures. Through Noether’s theorem, all conservation laws arising from these point symmetry groups are listed. Some group-invariant solutions are found to exist for values of n other than just positive integers, and in such cases an alternative two-dimensional form of the Schr¨odinger equations involving an extra modulation term with a parameter m = 2−n = 0 is discussed.
Resumo:
De nombreux problèmes pratiques qui se posent dans dans le domaine de la logistique, peuvent être modélisés comme des problèmes de tournées de véhicules. De façon générale, cette famille de problèmes implique la conception de routes, débutant et se terminant à un dépôt, qui sont utilisées pour distribuer des biens à un nombre de clients géographiquement dispersé dans un contexte où les coûts associés aux routes sont minimisés. Selon le type de problème, un ou plusieurs dépôts peuvent-être présents. Les problèmes de tournées de véhicules sont parmi les problèmes combinatoires les plus difficiles à résoudre. Dans cette thèse, nous étudions un problème d’optimisation combinatoire, appartenant aux classes des problèmes de tournées de véhicules, qui est liée au contexte des réseaux de transport. Nous introduisons un nouveau problème qui est principalement inspiré des activités de collecte de lait des fermes de production, et de la redistribution du produit collecté aux usines de transformation, pour la province de Québec. Deux variantes de ce problème sont considérées. La première, vise la conception d’un plan tactique de routage pour le problème de la collecte-redistribution de lait sur un horizon donné, en supposant que le niveau de la production au cours de l’horizon est fixé. La deuxième variante, vise à fournir un plan plus précis en tenant compte de la variation potentielle de niveau de production pouvant survenir au cours de l’horizon considéré. Dans la première partie de cette thèse, nous décrivons un algorithme exact pour la première variante du problème qui se caractérise par la présence de fenêtres de temps, plusieurs dépôts, et une flotte hétérogène de véhicules, et dont l’objectif est de minimiser le coût de routage. À cette fin, le problème est modélisé comme un problème multi-attributs de tournées de véhicules. L’algorithme exact est basé sur la génération de colonnes impliquant un algorithme de plus court chemin élémentaire avec contraintes de ressources. Dans la deuxième partie, nous concevons un algorithme exact pour résoudre la deuxième variante du problème. À cette fin, le problème est modélisé comme un problème de tournées de véhicules multi-périodes prenant en compte explicitement les variations potentielles du niveau de production sur un horizon donné. De nouvelles stratégies sont proposées pour résoudre le problème de plus court chemin élémentaire avec contraintes de ressources, impliquant dans ce cas une structure particulière étant donné la caractéristique multi-périodes du problème général. Pour résoudre des instances de taille réaliste dans des temps de calcul raisonnables, une approche de résolution de nature heuristique est requise. La troisième partie propose un algorithme de recherche adaptative à grands voisinages où de nombreuses nouvelles stratégies d’exploration et d’exploitation sont proposées pour améliorer la performances de l’algorithme proposé en termes de la qualité de la solution obtenue et du temps de calcul nécessaire.
Resumo:
Reinforcement Learning (RL) refers to a class of learning algorithms in which learning system learns which action to take in different situations by using a scalar evaluation received from the environment on performing an action. RL has been successfully applied to many multi stage decision making problem (MDP) where in each stage the learning systems decides which action has to be taken. Economic Dispatch (ED) problem is an important scheduling problem in power systems, which decides the amount of generation to be allocated to each generating unit so that the total cost of generation is minimized without violating system constraints. In this paper we formulate economic dispatch problem as a multi stage decision making problem. In this paper, we also develop RL based algorithm to solve the ED problem. The performance of our algorithm is compared with other recent methods. The main advantage of our method is it can learn the schedule for all possible demands simultaneously.