873 resultados para Support Vector Machines and Naive Bayes Classifier
Resumo:
Peer-reviewed
Resumo:
In this research, the effectiveness of Naive Bayes and Gaussian Mixture Models classifiers on segmenting exudates in retinal images is studied and the results are evaluated with metrics commonly used in medical imaging. Also, a color variation analysis of retinal images is carried out to find how effectively can retinal images be segmented using only the color information of the pixels.
Resumo:
Scoliosis treatment strategy is generally chosen according to the severity and type of the spinal curve. Currently, the curve type is determined from X-rays whose acquisition can be harmful for the patient. We propose in this paper a system that can predict the scoliosis curve type based on the analysis of the surface of the trunk. The latter is acquired and reconstructed in 3D using a non invasive multi-head digitizing system. The deformity is described by the back surface rotation, measured on several cross-sections of the trunk. A classifier composed of three support vector machines was trained and tested using the data of 97 patients with scoliosis. A prediction rate of 72.2% was obtained, showing that the use of the trunk surface for a high-level scoliosis classification is feasible and promising.
Resumo:
This is a Named Entity Based Question Answering System for Malayalam Language. Although a vast amount of information is available today in digital form, no effective information access mechanism exists to provide humans with convenient information access. Information Retrieval and Question Answering systems are the two mechanisms available now for information access. Information systems typically return a long list of documents in response to a user’s query which are to be skimmed by the user to determine whether they contain an answer. But a Question Answering System allows the user to state his/her information need as a natural language question and receives most appropriate answer in a word or a sentence or a paragraph. This system is based on Named Entity Tagging and Question Classification. Document tagging extracts useful information from the documents which will be used in finding the answer to the question. Question Classification extracts useful information from the question to determine the type of the question and the way in which the question is to be answered. Various Machine Learning methods are used to tag the documents. Rule-Based Approach is used for Question Classification. Malayalam belongs to the Dravidian family of languages and is one of the four major languages of this family. It is one of the 22 Scheduled Languages of India with official language status in the state of Kerala. It is spoken by 40 million people. Malayalam is a morphologically rich agglutinative language and relatively of free word order. Also Malayalam has a productive morphology that allows the creation of complex words which are often highly ambiguous. Document tagging tools such as Parts-of-Speech Tagger, Phrase Chunker, Named Entity Tagger, and Compound Word Splitter are developed as a part of this research work. No such tools were available for Malayalam language. Finite State Transducer, High Order Conditional Random Field, Artificial Immunity System Principles, and Support Vector Machines are the techniques used for the design of these document preprocessing tools. This research work describes how the Named Entity is used to represent the documents. Single sentence questions are used to test the system. Overall Precision and Recall obtained are 88.5% and 85.9% respectively. This work can be extended in several directions. The coverage of non-factoid questions can be increased and also it can be extended to include open domain applications. Reference Resolution and Word Sense Disambiguation techniques are suggested as the future enhancements
Resumo:
There are numerous text documents available in electronic form. More and more are becoming available every day. Such documents represent a massive amount of information that is easily accessible. Seeking value in this huge collection requires organization; much of the work of organizing documents can be automated through text classification. The accuracy and our understanding of such systems greatly influences their usefulness. In this paper, we seek 1) to advance the understanding of commonly used text classification techniques, and 2) through that understanding, improve the tools that are available for text classification. We begin by clarifying the assumptions made in the derivation of Naive Bayes, noting basic properties and proposing ways for its extension and improvement. Next, we investigate the quality of Naive Bayes parameter estimates and their impact on classification. Our analysis leads to a theorem which gives an explanation for the improvements that can be found in multiclass classification with Naive Bayes using Error-Correcting Output Codes. We use experimental evidence on two commonly-used data sets to exhibit an application of the theorem. Finally, we show fundamental flaws in a commonly-used feature selection algorithm and develop a statistics-based framework for text feature selection. Greater understanding of Naive Bayes and the properties of text allows us to make better use of it in text classification.
Resumo:
This paper describes a trainable system capable of tracking faces and facialsfeatures like eyes and nostrils and estimating basic mouth features such as sdegrees of openness and smile in real time. In developing this system, we have addressed the twin issues of image representation and algorithms for learning. We have used the invariance properties of image representations based on Haar wavelets to robustly capture various facial features. Similarly, unlike previous approaches this system is entirely trained using examples and does not rely on a priori (hand-crafted) models of facial features based on optical flow or facial musculature. The system works in several stages that begin with face detection, followed by localization of facial features and estimation of mouth parameters. Each of these stages is formulated as a problem in supervised learning from examples. We apply the new and robust technique of support vector machines (SVM) for classification in the stage of skin segmentation, face detection and eye detection. Estimation of mouth parameters is modeled as a regression from a sparse subset of coefficients (basis functions) of an overcomplete dictionary of Haar wavelets.
Resumo:
This paper investigates detection of architectural distortion in mammographic images using support vector machine. Hausdorff dimension is used to characterise the texture feature of mammographic images. Support vector machine, a learning machine based on statistical learning theory, is trained through supervised learning to detect architectural distortion. Compared to the Radial Basis Function neural networks, SVM produced more accurate classification results in distinguishing architectural distortion abnormality from normal breast parenchyma.
Resumo:
In this paper a support vector machine (SVM) approach for characterizing the feasible parameter set (FPS) in non-linear set-membership estimation problems is presented. It iteratively solves a regression problem from which an approximation of the boundary of the FPS can be determined. To guarantee convergence to the boundary the procedure includes a no-derivative line search and for an appropriate coverage of points on the FPS boundary it is suggested to start with a sequential box pavement procedure. The SVM approach is illustrated on a simple sine and exponential model with two parameters and an agro-forestry simulation model.
Resumo:
The personalised conditioning system (PCS) is widely studied. Potentially, it is able to reduce energy consumption while securing occupants’ thermal comfort requirements. It has been suggested that automatic optimised operation schemes for PCS should be introduced to avoid energy wastage and discomfort caused by inappropriate operation. In certain automatic operation schemes, personalised thermal sensation models are applied as key components to help in setting targets for PCS operation. In this research, a novel personal thermal sensation modelling method based on the C-Support Vector Classification (C-SVC) algorithm has been developed for PCS control. The personal thermal sensation modelling has been regarded as a classification problem. During the modelling process, the method ‘learns’ an occupant’s thermal preferences from his/her feedback, environmental parameters and personal physiological and behavioural factors. The modelling method has been verified by comparing the actual thermal sensation vote (TSV) with the modelled one based on 20 individual cases. Furthermore, the accuracy of each individual thermal sensation model has been compared with the outcomes of the PMV model. The results indicate that the modelling method presented in this paper is an effective tool to model personal thermal sensations and could be integrated within the PCS for optimised system operation and control.
Resumo:
Various popular machine learning techniques, like support vector machines, are originally conceived for the solution of two-class (binary) classification problems. However, a large number of real problems present more than two classes. A common approach to generalize binary learning techniques to solve problems with more than two classes, also known as multiclass classification problems, consists of hierarchically decomposing the multiclass problem into multiple binary sub-problems, whose outputs are combined to define the predicted class. This strategy results in a tree of binary classifiers, where each internal node corresponds to a binary classifier distinguishing two groups of classes and the leaf nodes correspond to the problem classes. This paper investigates how measures of the separability between classes can be employed in the construction of binary-tree-based multiclass classifiers, adapting the decompositions performed to each particular multiclass problem. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
Most studies involving statistical time series analysis rely on assumptions of linearity, which by its simplicity facilitates parameter interpretation and estimation. However, the linearity assumption may be too restrictive for many practical applications. The implementation of nonlinear models in time series analysis involves the estimation of a large set of parameters, frequently leading to overfitting problems. In this article, a predictability coefficient is estimated using a combination of nonlinear autoregressive models and the use of support vector regression in this model is explored. We illustrate the usefulness and interpretability of results by using electroencephalographic records of an epileptic patient.
Resumo:
The Support Vector Machines (SVM) has attracted increasing attention in machine learning area, particularly on classification and patterns recognition. However, in some cases it is not easy to determinate accurately the class which given pattern belongs. This thesis involves the construction of a intervalar pattern classifier using SVM in association with intervalar theory, in order to model the separation of a pattern set between distinct classes with precision, aiming to obtain an optimized separation capable to treat imprecisions contained in the initial data and generated during the computational processing. The SVM is a linear machine. In order to allow it to solve real-world problems (usually nonlinear problems), it is necessary to treat the pattern set, know as input set, transforming from nonlinear nature to linear problem. The kernel machines are responsible to do this mapping. To create the intervalar extension of SVM, both for linear and nonlinear problems, it was necessary define intervalar kernel and the Mercer s theorem (which caracterize a kernel function) to intervalar function
Resumo:
The pattern classification is one of the machine learning subareas that has the most outstanding. Among the various approaches to solve pattern classification problems, the Support Vector Machines (SVM) receive great emphasis, due to its ease of use and good generalization performance. The Least Squares formulation of SVM (LS-SVM) finds the solution by solving a set of linear equations instead of quadratic programming implemented in SVM. The LS-SVMs provide some free parameters that have to be correctly chosen to achieve satisfactory results in a given task. Despite the LS-SVMs having high performance, lots of tools have been developed to improve them, mainly the development of new classifying methods and the employment of ensembles, in other words, a combination of several classifiers. In this work, our proposal is to use an ensemble and a Genetic Algorithm (GA), search algorithm based on the evolution of species, to enhance the LSSVM classification. In the construction of this ensemble, we use a random selection of attributes of the original problem, which it splits the original problem into smaller ones where each classifier will act. So, we apply a genetic algorithm to find effective values of the LS-SVM parameters and also to find a weight vector, measuring the importance of each machine in the final classification. Finally, the final classification is obtained by a linear combination of the decision values of the LS-SVMs with the weight vector. We used several classification problems, taken as benchmarks to evaluate the performance of the algorithm and compared the results with other classifiers