972 resultados para LS-SVM


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Features analysis is an important task which can significantly affect the performance of automatic bacteria colony picking. Unstructured environments also affect the automatic colony screening. This paper presents a novel approach for adaptive colony segmentation in unstructured environments by treating the detected peaks of intensity histograms as a morphological feature of images. In order to avoid disturbing peaks, an entropy based mean shift filter is introduced to smooth images as a preprocessing step. The relevance and importance of these features can be determined in an improved support vector machine classifier using unascertained least square estimation. Experimental results show that the proposed unascertained least square support vector machine (ULSSVM) has better recognition accuracy than the other state-of-the-art techniques, and its training process takes less time than most of the traditional approaches presented in this paper.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes an efficient learning mechanism to build fuzzy rule-based systems through the construction of sparse least-squares support vector machines (LS-SVMs). In addition to the significantly reduced computational complexity in model training, the resultant LS-SVM-based fuzzy system is sparser while offers satisfactory generalization capability over unseen data. It is well known that the LS-SVMs have their computational advantage over conventional SVMs in the model training process; however, the model sparseness is lost, which is the main drawback of LS-SVMs. This is an open problem for the LS-SVMs. To tackle the nonsparseness issue, a new regression alternative to the Lagrangian solution for the LS-SVM is first presented. A novel efficient learning mechanism is then proposed in this paper to extract a sparse set of support vectors for generating fuzzy IF-THEN rules. This novel mechanism works in a stepwise subset selection manner, including a forward expansion phase and a backward exclusion phase in each selection step. The implementation of the algorithm is computationally very efficient due to the introduction of a few key techniques to avoid the matrix inverse operations to accelerate the training process. The computational efficiency is also confirmed by detailed computational complexity analysis. As a result, the proposed approach is not only able to achieve the sparseness of the resultant LS-SVM-based fuzzy systems but significantly reduces the amount of computational effort in model training as well. Three experimental examples are presented to demonstrate the effectiveness and efficiency of the proposed learning mechanism and the sparseness of the obtained LS-SVM-based fuzzy systems, in comparison with other SVM-based learning techniques.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The pattern classification is one of the machine learning subareas that has the most outstanding. Among the various approaches to solve pattern classification problems, the Support Vector Machines (SVM) receive great emphasis, due to its ease of use and good generalization performance. The Least Squares formulation of SVM (LS-SVM) finds the solution by solving a set of linear equations instead of quadratic programming implemented in SVM. The LS-SVMs provide some free parameters that have to be correctly chosen to achieve satisfactory results in a given task. Despite the LS-SVMs having high performance, lots of tools have been developed to improve them, mainly the development of new classifying methods and the employment of ensembles, in other words, a combination of several classifiers. In this work, our proposal is to use an ensemble and a Genetic Algorithm (GA), search algorithm based on the evolution of species, to enhance the LSSVM classification. In the construction of this ensemble, we use a random selection of attributes of the original problem, which it splits the original problem into smaller ones where each classifier will act. So, we apply a genetic algorithm to find effective values of the LS-SVM parameters and also to find a weight vector, measuring the importance of each machine in the final classification. Finally, the final classification is obtained by a linear combination of the decision values of the LS-SVMs with the weight vector. We used several classification problems, taken as benchmarks to evaluate the performance of the algorithm and compared the results with other classifiers

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Samples of Forsythia suspensa from raw (Laoqiao) and ripe (Qingqiao) fruit were analyzed with the use of HPLC-DAD and the EIS-MS techniques. Seventeen peaks were detected, and of these, twelve were identified. Most were related to the glucopyranoside molecular fragment. Samples collected from three geographical areas (Shanxi, Henan and Shandong Provinces), were discriminated with the use of hierarchical clustering analysis (HCA), discriminant analysis (DA), and principal component analysis (PCA) models, but only PCA was able to provide further information about the relationships between objects and loadings; eight peaks were related to the provinces of sample origin. The supervised classification models-K-nearest neighbor (KNN), least squares support vector machines (LS-SVM), and counter propagation artificial neural network (CP-ANN) methods, indicated successful classification but KNN produced 100% classification rate. Thus, the fruit were discriminated on the basis of their places of origin.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A combined data matrix consisting of high performance liquid chromatography–diode array detector (HPLC–DAD) and inductively coupled plasma-mass spectrometry (ICP-MS) measurements of samples from the plant roots of the Cortex moutan (CM), produced much better classification and prediction results in comparison with those obtained from either of the individual data sets. The HPLC peaks (organic components) of the CM samples, and the ICP-MS measurements (trace metal elements) were investigated with the use of principal component analysis (PCA) and the linear discriminant analysis (LDA) methods of data analysis; essentially, qualitative results suggested that discrimination of the CM samples from three different provinces was possible with the combined matrix producing best results. Another three methods, K-nearest neighbor (KNN), back-propagation artificial neural network (BP-ANN) and least squares support vector machines (LS-SVM) were applied for the classification and prediction of the samples. Again, the combined data matrix analyzed by the KNN method produced best results (100% correct; prediction set data). Additionally, multiple linear regression (MLR) was utilized to explore any relationship between the organic constituents and the metal elements of the CM samples; the extracted linear regression equations showed that the essential metals as well as some metallic pollutants were related to the organic compounds on the basis of their concentrations

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Climate change impact assessment studies involve downscaling large-scale atmospheric predictor variables (LSAPVs) simulated by general circulation models (GCMs) to site-scale meteorological variables. This article presents a least-square support vector machine (LS-SVM)-based methodology for multi-site downscaling of maximum and minimum daily temperature series. The methodology involves (1) delineation of sites in the study area into clusters based on correlation structure of predictands, (2) downscaling LSAPVs to monthly time series of predictands at a representative site identified in each of the clusters, (3) translation of the downscaled information in each cluster from the representative site to that at other sites using LS-SVM inter-site regression relationships, and (4) disaggregation of the information at each site from monthly to daily time scale using k-nearest neighbour disaggregation methodology. Effectiveness of the methodology is demonstrated by application to data pertaining to four sites in the catchment of Beas river basin, India. Simulations of Canadian coupled global climate model (CGCM3.1/T63) for four IPCC SRES scenarios namely A1B, A2, B1 and COMMIT were downscaled to future projections of the predictands in the study area. Comparison of results with those based on recently proposed multivariate multiple linear regression (MMLR) based downscaling method and multi-site multivariate statistical downscaling (MMSD) method indicate that the proposed method is promising and it can be considered as a feasible choice in statistical downscaling studies. The performance of the method in downscaling daily minimum temperature was found to be better when compared with that in downscaling daily maximum temperature. Results indicate an increase in annual average maximum and minimum temperatures at all the sites for A1B, A2 and B1 scenarios. The projected increment is high for A2 scenario, and it is followed by that for A1B, B1 and COMMIT scenarios. Projections, in general, indicated an increase in mean monthly maximum and minimum temperatures during January to February and October to December.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Este trabalho de pesquisa descreve três estudos de utilização de métodos quimiométricos para a classificação e caracterização de óleos comestíveis vegetais e seus parâmetros de qualidade através das técnicas de espectrometria de absorção molecular no infravermelho médio com transformada de Fourier e de espectrometria no infravermelho próximo, e o monitoramento da qualidade e estabilidade oxidativa do iogurte usando espectrometria de fluorescência molecular. O primeiro e segundo estudos visam à classificação e caracterização de parâmetros de qualidade de óleos comestíveis vegetais utilizando espectrometria no infravermelho médio com transformada de Fourier (FT-MIR) e no infravermelho próximo (NIR). O algoritmo de Kennard-Stone foi usado para a seleção do conjunto de validação após análise de componentes principais (PCA). A discriminação entre os óleos de canola, girassol, milho e soja foi investigada usando SVM-DA, SIMCA e PLS-DA. A predição dos parâmetros de qualidade, índice de refração e densidade relativa dos óleos, foi investigada usando os métodos de calibração multivariada dos mínimos quadrados parciais (PLS), iPLS e SVM para os dados de FT-MIR e NIR. Vários tipos de pré-processamentos, primeira derivada, correção do sinal multiplicativo (MSC), dados centrados na média, correção do sinal ortogonal (OSC) e variação normal padrão (SNV) foram utilizados, usando a raiz quadrada do erro médio quadrático de validação cruzada (RMSECV) e de predição (RMSEP) como parâmetros de avaliação. A metodologia desenvolvida para determinação de índice de refração e densidade relativa e classificação dos óleos vegetais é rápida e direta. O terceiro estudo visa à avaliação da estabilidade oxidativa e qualidade do iogurte armazenado a 4C submetido à luz direta e mantido no escuro, usando a análise dos fatores paralelos (PARAFAC) na luminescência exibida por três fluoróforos presentes no iogurte, onde pelo menos um deles está fortemente relacionado com as condições de armazenamento. O sinal fluorescente foi identificado pelo espectro de emissão e excitação das substâncias fluorescentes puras, que foram sugeridas serem vitamina A, triptofano e riboflavina. Modelos de regressão baseados nos escores do PARAFAC para a riboflavina foram desenvolvidos usando os escores obtidos no primeiro dia como variável dependente e os escores obtidos durante o armazenamento como variável independente. Foi visível o decaimento da curva analítica com o decurso do tempo da experimentação. Portanto, o teor de riboflavina pode ser considerado um bom indicador para a estabilidade do iogurte. Assim, é possível concluir que a espectroscopia de fluorescência combinada com métodos quimiométricos é um método rápido para monitorar a estabilidade oxidativa e a qualidade do iogurte

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper proposes a new hierarchical learning structure, namely the holistic triple learning (HTL), for extending the binary support vector machine (SVM) to multi-classification problems. For an N-class problem, a HTL constructs a decision tree up to a depth of A leaf node of the decision tree is allowed to be placed with a holistic triple learning unit whose generalisation abilities are assessed and approved. Meanwhile, the remaining nodes in the decision tree each accommodate a standard binary SVM classifier. The holistic triple classifier is a regression model trained on three classes, whose training algorithm is originated from a recently proposed implementation technique, namely the least-squares support vector machine (LS-SVM). A major novelty with the holistic triple classifier is the reduced number of support vectors in the solution. For the resultant HTL-SVM, an upper bound of the generalisation error can be obtained. The time complexity of training the HTL-SVM is analysed, and is shown to be comparable to that of training the one-versus-one (1-vs.-1) SVM, particularly on small-scale datasets. Empirical studies show that the proposed HTL-SVM achieves competitive classification accuracy with a reduced number of support vectors compared to the popular 1-vs-1 alternative.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

As a promising method for pattern recognition and function estimation, least squares support vector machines (LS-SVM) express the training in terms of solving a linear system instead of a quadratic programming problem as for conventional support vector machines (SVM). In this paper, by using the information provided by the equality constraint, we transform the minimization problem with a single equality constraint in LS-SVM into an unconstrained minimization problem, then propose reduced formulations for LS-SVM. By introducing this transformation, the times of using conjugate gradient (CG) method, which is a greatly time-consuming step in obtaining the numerical solution, are reduced to one instead of two as proposed by Suykens et al. (1999). The comparison on computational speed of our method with the CG method proposed by Suykens et al. and the first order and second order SMO methods on several benchmark data sets shows a reduction of training time by up to 44%. (C) 2011 Elsevier B.V. All rights reserved.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Adolescent idiopathic scoliosis (AIS) is a deformity of the spine manifested by asymmetry and deformities of the external surface of the trunk. Classification of scoliosis deformities according to curve type is used to plan management of scoliosis patients. Currently, scoliosis curve type is determined based on X-ray exam. However, cumulative exposure to X-rays radiation significantly increases the risk for certain cancer. In this paper, we propose a robust system that can classify the scoliosis curve type from non invasive acquisition of 3D trunk surface of the patients. The 3D image of the trunk is divided into patches and local geometric descriptors characterizing the surface of the back are computed from each patch and forming the features. We perform the reduction of the dimensionality by using Principal Component Analysis and 53 components were retained. In this work a multi-class classifier is built with Least-squares support vector machine (LS-SVM) which is a kernel classifier. For this study, a new kernel was designed in order to achieve a robust classifier in comparison with polynomial and Gaussian kernel. The proposed system was validated using data of 103 patients with different scoliosis curve types diagnosed and classified by an orthopedic surgeon from the X-ray images. The average rate of successful classification was 93.3% with a better rate of prediction for the major thoracic and lumbar/thoracolumbar types.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Objective To determine scoliosis curve types using non invasive surface acquisition, without prior knowledge from X-ray data. Methods Classification of scoliosis deformities according to curve type is used in the clinical management of scoliotic patients. In this work, we propose a robust system that can determine the scoliosis curve type from non invasive acquisition of the 3D back surface of the patients. The 3D image of the surface of the trunk is divided into patches and local geometric descriptors characterizing the back surface are computed from each patch and constitute the features. We reduce the dimensionality by using principal component analysis and retain 53 components using an overlap criterion combined with the total variance in the observed variables. In this work, a multi-class classifier is built with least-squares support vector machines (LS-SVM). The original LS-SVM formulation was modified by weighting the positive and negative samples differently and a new kernel was designed in order to achieve a robust classifier. The proposed system is validated using data from 165 patients with different scoliosis curve types. The results of our non invasive classification were compared with those obtained by an expert using X-ray images. Results The average rate of successful classification was computed using a leave-one-out cross-validation procedure. The overall accuracy of the system was 95%. As for the correct classification rates per class, we obtained 96%, 84% and 97% for the thoracic, double major and lumbar/thoracolumbar curve types, respectively. Conclusion This study shows that it is possible to find a relationship between the internal deformity and the back surface deformity in scoliosis with machine learning methods. The proposed system uses non invasive surface acquisition, which is safe for the patient as it involves no radiation. Also, the design of a specific kernel improved classification performance.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The problem of impostor dataset selection for GMM-based speaker verification is addressed through the recently proposed data-driven background dataset refinement technique. The SVM-based refinement technique selects from a candidate impostor dataset those examples that are most frequently selected as support vectors when training a set of SVMs on a development corpus. This study demonstrates the versatility of dataset refinement in the task of selecting suitable impostor datasets for use in GMM-based speaker verification. The use of refined Z- and T-norm datasets provided performance gains of 15% in EER in the NIST 2006 SRE over the use of heuristically selected datasets. The refined datasets were shown to generalise well to the unseen data of the NIST 2008 SRE.