34 resultados para Supervised classifiers
Resumo:
The applications of Automatic Vowel Recognition (AVR), which is a sub-part of fundamental importance in most of the speech processing systems, vary from automatic interpretation of spoken language to biometrics. State-of-the-art systems for AVR are based on traditional machine learning models such as Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs), however, such classifiers can not deal with efficiency and effectiveness at the same time, existing a gap to be explored when real-time processing is required. In this work, we present an algorithm for AVR based on the Optimum-Path Forest (OPF), which is an emergent pattern recognition technique recently introduced in literature. Adopting a supervised training procedure and using speech tags from two public datasets, we observed that OPF has outperformed ANNs, SVMs, plus other classifiers, in terms of training time and accuracy. ©2010 IEEE.
Resumo:
The presence of precipitates in metallic materials affects its durability, resistance and mechanical properties. Hence, its automatic identification by image processing and machine learning techniques may lead to reliable and efficient assessments on the materials. In this paper, we introduce four widely used supervised pattern recognition techniques to accomplish metallic precipitates segmentation in scanning electron microscope images from dissimilar welding on a Hastelloy C-276 alloy: Support Vector Machines, Optimum-Path Forest, Self Organizing Maps and a Bayesian classifier. Experimental results demonstrated that all classifiers achieved similar recognition rates with good results validated by an expert in metallographic image analysis. © 2011 Springer-Verlag Berlin Heidelberg.
Resumo:
The research on multiple classifiers systems includes the creation of an ensemble of classifiers and the proper combination of the decisions. In order to combine the decisions given by classifiers, methods related to fixed rules and decision templates are often used. Therefore, the influence and relationship between classifier decisions are often not considered in the combination schemes. In this paper we propose a framework to combine classifiers using a decision graph under a random field model and a game strategy approach to obtain the final decision. The results of combining Optimum-Path Forest (OPF) classifiers using the proposed model are reported, obtaining good performance in experiments using simulated and real data sets. The results encourage the combination of OPF ensembles and the framework to design multiple classifier systems. © 2011 Springer-Verlag.
Resumo:
Pattern recognition in large amount of data has been paramount in the last decade, since that is not straightforward to design interactive and real time classification systems. Very recently, the Optimum-Path Forest classifier was proposed to overcome such limitations, together with its training set pruning algorithm, which requires a parameter that has been empirically set up to date. In this paper, we propose a Harmony Search-based algorithm that can find near optimal values for that. The experimental results have showed that our algorithm is able to find proper values for the OPF pruning algorithm parameter. © 2011 IEEE.
Resumo:
The correct classification of sugar according to its physico-chemical characteristics directly influences the value of the product and its acceptance by the market. This study shows that using an electronic tongue system along with established techniques of supervised learning leads to the correct classification of sugar samples according to their qualities. In this paper, we offer two new real, public and non-encoded sugar datasets whose attributes were automatically collected using an electronic tongue, with and without pH controlling. Moreover, we compare the performance achieved by several established machine learning methods. Our experiments were diligently designed to ensure statistically sound results and they indicate that k-nearest neighbors method outperforms other evaluated classifiers and, hence, it can be used as a good baseline for further comparison. © 2012 IEEE.
Resumo:
In this paper we shed light over the problem of landslide automatic recognition using supervised classification, and we also introduced the OPF classifier in this context. We employed two images acquired from Geoeye-MS satellite at March-2010 in the northwest (high steep areas) and north sides (pipeline area) covering the area of Duque de Caxias city, Rio de Janeiro State, Brazil. The landslide recognition rate has been assessed through a cross-validation with 10 runnings. In regard to the classifiers, we have used OPF against SVM with Radial Basis Function for kernel mapping and a Bayesian classifier. We can conclude that OPF, Bayes and SVM achieved high recognition rates, being OPF the fastest approach. © 2012 IEEE.
Resumo:
Semi-supervised learning is applied to classification problems where only a small portion of the data items is labeled. In these cases, the reliability of the labels is a crucial factor, because mislabeled items may propagate wrong labels to a large portion or even the entire data set. This paper aims to address this problem by presenting a graph-based (network-based) semi-supervised learning method, specifically designed to handle data sets with mislabeled samples. The method uses teams of walking particles, with competitive and cooperative behavior, for label propagation in the network constructed from the input data set. The proposed model is nature-inspired and it incorporates some features to make it robust to a considerable amount of mislabeled data items. Computer simulations show the performance of the method in the presence of different percentage of mislabeled data, in networks of different sizes and average node degree. Importantly, these simulations reveals the existence of the critical points of the mislabeled subset size, below which the network is free of wrong label contamination, but above which the mislabeled samples start to propagate their labels to the rest of the network. Moreover, numerical comparisons have been made among the proposed method and other representative graph-based semi-supervised learning methods using both artificial and real-world data sets. Interestingly, the proposed method has increasing better performance than the others as the percentage of mislabeled samples is getting larger. © 2012 IEEE.
Resumo:
An important tool for the heart disease diagnosis is the analysis of electrocardiogram (ECG) signals, since the non-invasive nature and simplicity of the ECG exam. According to the application, ECG data analysis consists of steps such as preprocessing, segmentation, feature extraction and classification aiming to detect cardiac arrhythmias (i.e.; cardiac rhythm abnormalities). Aiming to made a fast and accurate cardiac arrhythmia signal classification process, we apply and analyze a recent and robust supervised graph-based pattern recognition technique, the optimum-path forest (OPF) classifier. To the best of our knowledge, it is the first time that OPF classifier is used to the ECG heartbeat signal classification task. We then compare the performance (in terms of training and testing time, accuracy, specificity, and sensitivity) of the OPF classifier to the ones of other three well-known expert system classifiers, i.e.; support vector machine (SVM), Bayesian and multilayer artificial neural network (MLP), using features extracted from six main approaches considered in literature for ECG arrhythmia analysis. In our experiments, we use the MIT-BIH Arrhythmia Database and the evaluation protocol recommended by The Association for the Advancement of Medical Instrumentation. A discussion on the obtained results shows that OPF classifier presents a robust performance, i.e.; there is no need for parameter setup, as well as a high accuracy at an extremely low computational cost. Moreover, in average, the OPF classifier yielded greater performance than the MLP and SVM classifiers in terms of classification time and accuracy, and to produce quite similar performance to the Bayesian classifier, showing to be a promising technique for ECG signal analysis. © 2012 Elsevier Ltd. All rights reserved.
Resumo:
Plant phenology is one of the most reliable indicators of species responses to global climate change, motivating the development of new technologies for phenological monitoring. Digital cameras or near remote systems have been efficiently applied as multi-channel imaging sensors, where leaf color information is extracted from the RGB (Red, Green, and Blue) color channels, and the changes in green levels are used to infer leafing patterns of plant species. In this scenario, texture information is a great ally for image analysis that has been little used in phenology studies. We monitored leaf-changing patterns of Cerrado savanna vegetation by taking daily digital images. We extract RGB channels from the digital images and correlate them with phenological changes. Additionally, we benefit from the inclusion of textural metrics for quantifying spatial heterogeneity. Our first goals are: (1) to test if color change information is able to characterize the phenological pattern of a group of species; (2) to test if the temporal variation in image texture is useful to distinguish plant species; and (3) to test if individuals from the same species may be automatically identified using digital images. In this paper, we present a machine learning approach based on multiscale classifiers to detect phenological patterns in the digital images. Our results indicate that: (1) extreme hours (morning and afternoon) are the best for identifying plant species; (2) different plant species present a different behavior with respect to the color change information; and (3) texture variation along temporal images is promising information for capturing phenological patterns. Based on those results, we suggest that individuals from the same species and functional group might be identified using digital images, and introduce a new tool to help phenology experts in the identification of new individuals from the same species in the image and their location on the ground. © 2013 Elsevier B.V. All rights reserved.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Both Semi-Supervised Leaning and Active Learning are techniques used when unlabeled data is abundant, but the process of labeling them is expensive and/or time consuming. In this paper, those two machine learning techniques are combined into a single nature-inspired method. It features particles walking on a network built from the data set, using a unique random-greedy rule to select neighbors to visit. The particles, which have both competitive and cooperative behavior, are created on the network as the result of label queries. They may be created as the algorithm executes and only nodes affected by the new particles have to be updated. Therefore, it saves execution time compared to traditional active learning frameworks, in which the learning algorithm has to be executed several times. The data items to be queried are select based on information extracted from the nodes and particles temporal dynamics. Two different rules for queries are explored in this paper, one of them is based on querying by uncertainty approaches and the other is based on data and labeled nodes distribution. Each of them may perform better than the other according to some data sets peculiarities. Experimental results on some real-world data sets are provided, and the proposed method outperforms the semi-supervised learning method, from which it is derived, in all of them.
Resumo:
Concept drift, which refers to non stationary learning problems over time, has increasing importance in machine learning and data mining. Many concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But the process of data labeling is usually expensive and/or time consuming when compared to acquisition of unlabeled data, thus usually only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are based on assumptions that the data is static. Therefore, semi-supervised learning with concept drifts is still an open challenging task in machine learning. Recently, a particle competition and cooperation approach has been developed to realize graph-based semi-supervised learning from static data. We have extend that approach to handle data streams and concept drift. The result is a passive algorithm which uses a single classifier approach, naturally adapted to concept changes without any explicit drift detection mechanism. It has built-in mechanisms that provide a natural way of learning from new data, gradually "forgetting" older knowledge as older data items are no longer useful for the classification of newer data items. The proposed algorithm is applied to the KDD Cup 1999 Data of network intrusion, showing its effectiveness.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)