873 resultados para Support Vector Machines and Naive Bayes Classifier
Resumo:
Objective To determine scoliosis curve types using non invasive surface acquisition, without prior knowledge from X-ray data. Methods Classification of scoliosis deformities according to curve type is used in the clinical management of scoliotic patients. In this work, we propose a robust system that can determine the scoliosis curve type from non invasive acquisition of the 3D back surface of the patients. The 3D image of the surface of the trunk is divided into patches and local geometric descriptors characterizing the back surface are computed from each patch and constitute the features. We reduce the dimensionality by using principal component analysis and retain 53 components using an overlap criterion combined with the total variance in the observed variables. In this work, a multi-class classifier is built with least-squares support vector machines (LS-SVM). The original LS-SVM formulation was modified by weighting the positive and negative samples differently and a new kernel was designed in order to achieve a robust classifier. The proposed system is validated using data from 165 patients with different scoliosis curve types. The results of our non invasive classification were compared with those obtained by an expert using X-ray images. Results The average rate of successful classification was computed using a leave-one-out cross-validation procedure. The overall accuracy of the system was 95%. As for the correct classification rates per class, we obtained 96%, 84% and 97% for the thoracic, double major and lumbar/thoracolumbar curve types, respectively. Conclusion This study shows that it is possible to find a relationship between the internal deformity and the back surface deformity in scoliosis with machine learning methods. The proposed system uses non invasive surface acquisition, which is safe for the patient as it involves no radiation. Also, the design of a specific kernel improved classification performance.
Resumo:
Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples -- in particular the regression problem of approximating a multivariate function from sparse data. We present both formulations in a unified framework, namely in the context of Vapnik's theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics.
Resumo:
In the first part of this paper we show a similarity between the principle of Structural Risk Minimization Principle (SRM) (Vapnik, 1982) and the idea of Sparse Approximation, as defined in (Chen, Donoho and Saunders, 1995) and Olshausen and Field (1996). Then we focus on two specific (approximate) implementations of SRM and Sparse Approximation, which have been used to solve the problem of function approximation. For SRM we consider the Support Vector Machine technique proposed by V. Vapnik and his team at AT&T Bell Labs, and for Sparse Approximation we consider a modification of the Basis Pursuit De-Noising algorithm proposed by Chen, Donoho and Saunders (1995). We show that, under certain conditions, these two techniques are equivalent: they give the same solution and they require the solution of the same quadratic programming problem.
Resumo:
The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs. This new learning algorithm can be seen as an alternative training technique for Polynomial, Radial Basis Function and Multi-Layer Perceptron classifiers. An interesting property of this approach is that it is an approximate implementation of the Structural Risk Minimization (SRM) induction principle. The derivation of Support Vector Machines, its relationship with SRM, and its geometrical insight, are discussed in this paper. Training a SVM is equivalent to solve a quadratic programming problem with linear and box constraints in a number of variables equal to the number of data points. When the number of data points exceeds few thousands the problem is very challenging, because the quadratic form is completely dense, so the memory needed to store the problem grows with the square of the number of data points. Therefore, training problems arising in some real applications with large data sets are impossible to load into memory, and cannot be solved using standard non-linear constrained optimization algorithms. We present a decomposition algorithm that can be used to train SVM's over large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of, and also establish the stopping criteria for the algorithm. We present previous approaches, as well as results and important details of our implementation of the algorithm using a second-order variant of the Reduced Gradient Method as the solver of the sub-problems. As an application of SVM's, we present preliminary results we obtained applying SVM to the problem of detecting frontal human faces in real images.
Resumo:
Objective: This paper presents a detailed study of fractal-based methods for texture characterization of mammographic mass lesions and architectural distortion. The purpose of this study is to explore the use of fractal and lacunarity analysis for the characterization and classification of both tumor lesions and normal breast parenchyma in mammography. Materials and methods: We conducted comparative evaluations of five popular fractal dimension estimation methods for the characterization of the texture of mass lesions and architectural distortion. We applied the concept of lacunarity to the description of the spatial distribution of the pixel intensities in mammographic images. These methods were tested with a set of 57 breast masses and 60 normal breast parenchyma (dataset1), and with another set of 19 architectural distortions and 41 normal breast parenchyma (dataset2). Support vector machines (SVM) were used as a pattern classification method for tumor classification. Results: Experimental results showed that the fractal dimension of region of interest (ROIs) depicting mass lesions and architectural distortion was statistically significantly lower than that of normal breast parenchyma for all five methods. Receiver operating characteristic (ROC) analysis showed that fractional Brownian motion (FBM) method generated the highest area under ROC curve (A z = 0.839 for dataset1, 0.828 for dataset2, respectively) among five methods for both datasets. Lacunarity analysis showed that the ROIs depicting mass lesions and architectural distortion had higher lacunarities than those of ROIs depicting normal breast parenchyma. The combination of FBM fractal dimension and lacunarity yielded the highest A z value (0.903 and 0.875, respectively) than those based on single feature alone for both given datasets. The application of the SVM improved the performance of the fractal-based features in differentiating tumor lesions from normal breast parenchyma by generating higher A z value. Conclusion: FBM texture model is the most appropriate model for characterizing mammographic images due to self-affinity assumption of the method being a better approximation. Lacunarity is an effective counterpart measure of the fractal dimension in texture feature extraction in mammographic images. The classification results obtained in this work suggest that the SVM is an effective method with great potential for classification in mammographic image analysis.
Resumo:
Motivation: A new method that uses support vector machines (SVMs) to predict protein secondary structure is described and evaluated. The study is designed to develop a reliable prediction method using an alternative technique and to investigate the applicability of SVMs to this type of bioinformatics problem. Methods: Binary SVMs are trained to discriminate between two structural classes. The binary classifiers are combined in several ways to predict multi-class secondary structure. Results: The average three-state prediction accuracy per protein (Q3) is estimated by cross-validation to be 77.07 ± 0.26% with a segment overlap (Sov) score of 73.32 ± 0.39%. The SVM performs similarly to the 'state-of-the-art' PSIPRED prediction method on a non-homologous test set of 121 proteins despite being trained on substantially fewer examples. A simple consensus of the SVM, PSIPRED and PROFsec achieves significantly higher prediction accuracy than the individual methods. Availability: The SVM classifier is available from the authors. Work is in progress to make the method available on-line and to integrate the SVM predictions into the PSIPRED server.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Since the beginning, some pattern recognition techniques have faced the problem of high computational burden for dataset learning. Among the most widely used techniques, we may highlight Support Vector Machines (SVM), which have obtained very promising results for data classification. However, this classifier requires an expensive training phase, which is dominated by a parameter optimization that aims to make SVM less prone to errors over the training set. In this paper, we model the problem of finding such parameters as a metaheuristic-based optimization task, which is performed through Harmony Search (HS) and some of its variants. The experimental results have showen the robustness of HS-based approaches for such task in comparison against with an exhaustive (grid) search, and also a Particle Swarm Optimization-based implementation.
Resumo:
Hundreds of Terabytes of CMS (Compact Muon Solenoid) data are being accumulated for storage day by day at the University of Nebraska-Lincoln, which is one of the eight US CMS Tier-2 sites. Managing this data includes retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This is an important task that is currently being done manually and it requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when there is a requirement for storage space. CMS data is stored using HDFS (Hadoop Distributed File System). HDFS logs give information regarding file access operations. Hadoop MapReduce was used to feed information in these logs to Support Vector Machines (SVMs), a machine learning algorithm applicable to classification and regression which is used in this Thesis to develop a classifier. Time elapsed in data set classification by this method is dependent on the size of the input HDFS log file since the algorithmic complexities of Hadoop MapReduce algorithms here are O(n). The SVM methodology produces a list of data sets for deletion along with their respective sizes. This methodology was also compared with a heuristic called Retention Cost which was calculated using size of the data set and the time since its last access to help decide how useful a data set is. Accuracies of both were compared by calculating the percentage of data sets predicted for deletion which were accessed at a later instance of time. Our methodology using SVMs proved to be more accurate than using the Retention Cost heuristic. This methodology could be used to solve similar problems involving other large data sets.
Resumo:
Support Vector Machines (SVMs) have achieved very good performance on different learning problems. However, the success of SVMs depends on the adequate choice of the values of a number of parameters (e.g., the kernel and regularization parameters). In the current work, we propose the combination of meta-learning and search algorithms to deal with the problem of SVM parameter selection. In this combination, given a new problem to be solved, meta-learning is employed to recommend SVM parameter values based on parameter configurations that have been successfully adopted in previous similar problems. The parameter values returned by meta-learning are then used as initial search points by a search technique, which will further explore the parameter space. In this proposal, we envisioned that the initial solutions provided by meta-learning are located in good regions of the search space (i.e. they are closer to optimum solutions). Hence, the search algorithm would need to evaluate a lower number of candidate solutions when looking for an adequate solution. In this work, we investigate the combination of meta-learning with two search algorithms: Particle Swarm Optimization and Tabu Search. The implemented hybrid algorithms were used to select the values of two SVM parameters in the regression domain. These combinations were compared with the use of the search algorithms without meta-learning. The experimental results on a set of 40 regression problems showed that, on average, the proposed hybrid methods obtained lower error rates when compared to their components applied in isolation.
Resumo:
Support Vector Machines (SVMs) are widely used classifiers for detecting physiological patterns in Human-Computer Interaction (HCI). Their success is due to their versatility, robustness and large availability of free dedicated toolboxes. Frequently in the literature, insufficient details about the SVM implementation and/or parameters selection are reported, making it impossible to reproduce study analysis and results. In order to perform an optimized classification and report a proper description of the results, it is necessary to have a comprehensive critical overview of the application of SVM. The aim of this paper is to provide a review of the usage of SVM in the determination of brain and muscle patterns for HCI, by focusing on electroencephalography (EEG) and electromyography (EMG) techniques. In particular, an overview of the basic principles of SVM theory is outlined, together with a description of several relevant literature implementations. Furthermore, details concerning reviewed papers are listed in tables, and statistics of SVM use in the literature are presented. Suitability of SVM for HCI is discussed and critical comparisons with other classifiers are reported.
Resumo:
The application of functional magnetic resonance imaging (fMRI) in neuroscience studies has increased enormously in the last decade. Although primarily used to map brain regions activated by specific stimuli, many studies have shown that fMRI can also be useful in identifying interactions between brain regions (functional and effective connectivity). Despite the widespread use of fMRI as a research tool, clinical applications of brain connectivity as studied by fMRI are not well established. One possible explanation is the lack of normal pattern, and intersubject variability-two variables that are still largely uncharacterized in most patient populations of interest. In the current study, we combine the identification of functional connectivity networks extracted by using Spearman partial correlation with the use of a one-class support vector machine in order construct a normative database. An application of this approach is illustrated using an fMRI dataset of 43 healthy Subjects performing a visual working memory task. In addition, the relationships between the results obtained and behavioral data are explored. Hum Brain Mapp 30:1068-1076, 2009. (C) 2008 Wiley-Liss. Inc.
Resumo:
Functional magnetic resonance imaging (fMRI) is currently one of the most widely used methods for studying human brain function in vivo. Although many different approaches to fMRI analysis are available, the most widely used methods employ so called ""mass-univariate"" modeling of responses in a voxel-by-voxel fashion to construct activation maps. However, it is well known that many brain processes involve networks of interacting regions and for this reason multivariate analyses might seem to be attractive alternatives to univariate approaches. The current paper focuses on one multivariate application of statistical learning theory: the statistical discrimination maps (SDM) based on support vector machine, and seeks to establish some possible interpretations when the results differ from univariate `approaches. In fact, when there are changes not only on the activation level of two conditions but also on functional connectivity, SDM seems more informative. We addressed this question using both simulations and applications to real data. We have shown that the combined use of univariate approaches and SDM yields significant new insights into brain activations not available using univariate methods alone. In the application to a visual working memory fMRI data, we demonstrated that the interaction among brain regions play a role in SDM`s power to detect discriminative voxels. (C) 2008 Elsevier B.V. All rights reserved.