802 results for Support Vector Machine
Abstract:
The Gaussian process latent variable model (GP-LVM) has been identified as an effective probabilistic approach for dimensionality reduction because it can obtain a low-dimensional manifold of a data set in an unsupervised fashion. However, the GP-LVM is insufficient for supervised learning tasks (e.g., classification and regression) because it ignores the class label information during dimensionality reduction. In this paper, a supervised GP-LVM is developed for supervised learning tasks, and the maximum a posteriori algorithm is introduced to estimate the positions of all samples in the latent variable space. We present experimental evidence suggesting that the supervised GP-LVM is able to use the class label information effectively, and thus consistently outperforms the GP-LVM and the discriminative extension of the GP-LVM. A comparison with some supervised classification methods, such as Gaussian process classification and support vector machines, is also given to illustrate the advantage of the proposed method.
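Abstract:
To address the low accuracy of EEG pattern recognition in brain-computer interface systems for service robots, which cannot meet the robots' multitasking requirements, a pattern recognition method for EEG signals of multiple classes of complex hand operations is proposed, based on a C-support vector multi-class classifier. The method is applied in an EEG pattern recognition experiment on complex hand operations, where it recognizes four classes of complex hand operations. The experimental results show that, compared with earlier recognition using a BP neural network, the recognition rate rises from 85% to 90%.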
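Abstract:
A design for a vehicle type classifier based on support vector machine theory is proposed. The required sample data are obtained by acquiring, processing, and analysing images of real vehicles. Three support vector machine recognizers are trained using supervised training, and their performance is tested on test samples. Finally, the three recognizers are combined with a voter to form the vehicle type classifier.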
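The combination of three recognizers with a voter can be sketched as follows. The abstract does not say how the voter works, so this sketch assumes three pairwise recognizers over three hypothetical vehicle classes ("car", "bus", "truck") and a simple majority vote; the function name `vote` and the fallback label are illustrative only.

```python
from collections import Counter

def vote(predictions, fallback="undecided"):
    """Return the label with the most votes, or the fallback when the top count is tied."""
    counts = Counter(predictions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return fallback
    return counts[0][0]

# Hypothetical outputs of three pairwise recognizers for one vehicle image:
# (car vs. bus) -> "car", (car vs. truck) -> "car", (bus vs. truck) -> "truck"
decision = vote(["car", "car", "truck"])
tied = vote(["car", "bus", "truck"])
```

A tie is reported explicitly rather than broken arbitrarily, so a downstream step can decide how to handle ambiguous images.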
Abstract:
Many real world image analysis problems, such as face recognition and hand pose estimation, involve recognizing a large number of classes of objects or shapes. Large margin methods, such as AdaBoost and Support Vector Machines (SVMs), often provide competitive accuracy rates, but at the cost of evaluating a large number of binary classifiers, thus making it difficult to apply such methods when thousands or millions of classes need to be recognized. This thesis proposes a filter-and-refine framework, whereby, given a test pattern, a small number of candidate classes can be identified efficiently at the filter step, and computationally expensive large margin classifiers are used to evaluate these candidates at the refine step. Two different filtering methods are proposed, ClassMap and OVA-VS (One-vs.-All classification using Vector Search). ClassMap is an embedding-based method, works for both boosted classifiers and SVMs, and tends to map the patterns and their associated classes close to each other in a vector space. OVA-VS maps OVA classifiers and test patterns to vectors based on the weights and outputs of weak classifiers of the boosting scheme. At runtime, finding the strongest-responding OVA classifier becomes a classical vector search problem, where well-known methods can be used to gain efficiency. In our experiments, the proposed methods achieve significant speed-ups, in some cases up to two orders of magnitude, compared to exhaustive evaluation of all OVA classifiers. This was achieved in hand pose recognition and face recognition systems where the number of classes ranges from 535 to 48,600.
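The filter-and-refine idea described above can be illustrated with a small sketch. It is not ClassMap or OVA-VS: the filter step here is a plain dot product against one hypothetical vector per class, and `expensive_score` is a stand-in for a large margin classifier. All names and data are assumptions made for illustration.

```python
def filter_and_refine(x, class_vectors, expensive_score, k=2):
    """Filter: rank classes by a cheap dot product. Refine: rescore only the top k."""
    cheap = {c: sum(a * b for a, b in zip(x, v)) for c, v in class_vectors.items()}
    candidates = sorted(cheap, key=cheap.get, reverse=True)[:k]
    best = max(candidates, key=lambda c: expensive_score(c, x))
    return best, candidates

# Hypothetical class vectors in a 2-D embedding space.
class_vectors = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0], "D": [-1.0, 0.0]}

calls = []  # records which classes the expensive scorer was asked about

def expensive_score(c, x):
    """Stand-in for an expensive classifier: negative squared distance to the class vector."""
    calls.append(c)
    return -sum((a - b) ** 2 for a, b in zip(x, class_vectors[c]))

best, candidates = filter_and_refine([0.8, 0.2], class_vectors, expensive_score, k=2)
```

Only the `k` candidates reach the expensive scorer, which is where the speed-up comes from when the number of classes is large. A real system would replace the linear scan in the filter step with an efficient vector search.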
Abstract:
Paper presented at the Cloud Forward Conference 2015, October 6th-8th, Pisa
Abstract:
Noise is one of the main factors degrading the quality of original multichannel remote sensing data, and its presence influences classification efficiency, object detection, etc. Thus, pre-filtering is often used to remove noise and improve the solving of final tasks of multichannel remote sensing. Recent studies indicate that a classical model of additive noise is not adequate for images formed by modern multichannel sensors operating in visible and infrared bands. However, this fact is often ignored by researchers designing noise removal methods and algorithms. Because of this, we focus on the classification of multichannel remote sensing images in the case of signal-dependent noise present in component images. Three approaches to filtering of multichannel images for the considered noise model are analysed, all based on the discrete cosine transform (DCT) in blocks. The study is carried out not only in terms of conventional efficiency metrics used in filtering (MSE) but also in terms of multichannel data classification accuracy (probability of correct classification, confusion matrix). The proposed classification system combines the pre-processing stage, where a DCT-based filter processes the blocks of the multichannel remote sensing image, and the classification stage. Two modern classifiers are employed: a radial basis function neural network and support vector machines. Simulations are carried out for a three-channel image from the Landsat TM sensor. Different cases of learning are considered: using noise-free samples of the test multichannel image, the noisy multichannel image, and the pre-filtered one. It is shown that using the pre-filtered image for training produces better classification than learning from the noisy image. It is demonstrated that the best results for both groups of quantitative criteria are provided if a proposed 3D discrete cosine transform filter equipped with a variance stabilizing transform is applied.
The classification results obtained for data pre-filtered in different ways are in agreement for both considered classifiers. A comparison of classifier performance is carried out as well. The radial basis function neural network classifier is less sensitive to noise in the original images, but after pre-filtering the performance of both classifiers is approximately the same.
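As a rough illustration of block-based DCT filtering, the sketch below hard-thresholds the 2-D DCT coefficients of a single 8 x 8 block. It is a simplification under stated assumptions: the block size, the hard-thresholding rule, and the fixed threshold are not taken from the abstract, and neither the 3D (multichannel) transform nor the variance stabilizing transform is implemented.

```python
import math

N = 8  # block size (an assumption; the abstract does not state it)

# Orthonormal DCT-II matrix: D[k][n] = c_k * cos(pi * (2n + 1) * k / (2N))
D = [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def denoise_block(block, threshold):
    """Zero every DCT coefficient below the threshold (the DC term is always kept)."""
    coeffs = matmul(matmul(D, block), transpose(D))          # forward 2-D DCT
    kept = [[c if (abs(c) >= threshold or (i == 0 and j == 0)) else 0.0
             for j, c in enumerate(row)] for i, row in enumerate(coeffs)]
    return matmul(matmul(transpose(D), kept), D)             # inverse 2-D DCT

# Made-up block: a flat patch of value 100 with one slightly perturbed pixel.
block = [[100.0] * N for _ in range(N)]
block[3][4] += 2.0
smoothed = denoise_block(block, threshold=5.0)
unchanged = denoise_block(block, threshold=0.0)
```

With a fixed threshold this only makes sense when the noise variance is roughly constant, which is the role a variance stabilizing transform would play for signal-dependent noise.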
Abstract:
A study was performed to determine if targeted metabolic profiling of cattle sera could be used to establish a predictive tool for identifying hormone misuse in cattle. Metabolites were assayed in heifers (n = 5) treated with nortestosterone decanoate (0.85 mg/kg body weight), untreated heifers (n = 5), steers (n = 5) treated with oestradiol benzoate (0.15 mg/kg body weight) and untreated steers (n = 5). Treatments were administered on days 0, 14, and 28 throughout a 42 day study period. Two support vector machines (SVMs) were trained, respectively, from heifer and steer data to identify hormone-treated animals. The performance of both SVM classifiers was evaluated by the sensitivity and specificity of treatment prediction. The SVM trained on steer data achieved 97.33% sensitivity and 93.85% specificity, while the one trained on heifer data achieved 94.67% sensitivity and 87.69% specificity. Solutions of the SVM classifiers were further exploited to determine those days when the classification accuracy of the SVM was most reliable. For heifers and steers, days 17-35 were determined to be the most selective. In summary, bioinformatics applied to targeted metabolic profiles generated from standard clinical chemistry analyses has yielded an accurate, inexpensive, high-throughput test for predicting steroid abuse in cattle.
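For reference, sensitivity and specificity can be computed from true and predicted labels as in the sketch below. The labels are made up for illustration (1 = treated, 0 = untreated) and are not data from the study; the function assumes both classes occur in `y_true`.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up example: 4 treated and 4 untreated samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```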
Abstract:
This paper introduces an automated computer-assisted system for the diagnosis of cervical intraepithelial neoplasia (CIN) using ultra-large cervical histological digital slides. The system contains two parts: the segmentation of squamous epithelium and the diagnosis of CIN. For the segmentation, to reduce processing time, a multiresolution method is developed. The squamous epithelium layer is first segmented at a low (2X) resolution. The boundaries are further fine-tuned at a higher (20X) resolution. The block-based segmentation method uses robust texture feature vectors in combination with support vector machines (SVMs) to perform classification. Medical rules are finally applied. In testing, segmentation using 31 digital slides achieves 94.25% accuracy. For the diagnosis of CIN, changes in nuclei structure and morphology along lines perpendicular to the main axis of the squamous epithelium are quantified and classified. Using a multi-category SVM, perpendicular lines are classified into Normal, CIN I, CIN II, and CIN III. The robustness of the system in terms of regional diagnosis is measured against pathologists' diagnoses, and inter-observer variability between two pathologists is considered. Initial results suggest that the system has potential as a tool both to assist in pathologists' diagnoses and in training.
Abstract:
The global increase in the penetration of renewable energy is pushing electrical power systems into uncharted territory, especially in terms of transient and dynamic stability. In particular, the greater penetration of wind generation in European power networks is, at times, displacing a significant capacity of conventional synchronous generation with fixed-speed induction generation and now more commonly, doubly fed induction generators. The impact of such changes in the generation mix requires careful monitoring to assess the impact on transient and dynamic stability. This study presents a measurement-based method for the early detection of power system oscillations, with consideration of mode damping, in order to raise alarms and develop strategies to actively improve power system dynamic stability and security. A method is developed based on wavelet-based support vector data description (SVDD) to detect oscillation modes in wind farm output power, which may excite dynamic instabilities in the wider system. The wavelet transform is used as a filter to identify oscillations in frequency bands, whereas the SVDD method is used to extract dominant features from different scales and generate an assessment boundary according to the extracted features. Poorly damped oscillations of a large magnitude, or that are resonant, can be alarmed to the system operator, to reduce the risk of system instability. The proposed method is exemplified using measured data from a chosen wind farm site.
Abstract:
The monitoring of multivariate systems that exhibit non-Gaussian behavior is addressed. Existing work advocates the use of independent component analysis (ICA) to extract the underlying non-Gaussian data structure. Since some of the source signals may be Gaussian, the use of principal component analysis (PCA) is proposed to capture the Gaussian and non-Gaussian source signals. A subsequent application of ICA then allows the extraction of non-Gaussian components from the retained principal components (PCs). A further contribution is the utilization of a support vector data description to determine a confidence limit for the non-Gaussian components. Finally, a statistical test is developed for determining how many non-Gaussian components are encapsulated within the retained PCs, and associated monitoring statistics are defined. The utility of the proposed scheme is demonstrated by a simulation example, and the analysis of recorded data from an industrial melter.
Abstract:
This paper presents a feature selection method for data classification, which combines a model-based variable selection technique and a fast two-stage subset selection algorithm. The relationship between a specified (and complete) set of candidate features and the class label is modelled using a non-linear full regression model which is linear-in-the-parameters. The performance of a sub-model, measured by the sum of the squared errors (SSE), is used to score the informativeness of the subset of features involved in the sub-model. The two-stage subset selection algorithm approaches a solution sub-model with the SSE being locally minimized. The features involved in the solution sub-model are selected as inputs to support vector machines (SVMs) for classification. The memory requirement of this algorithm is independent of the number of training patterns. This property makes the method suitable for applications executed on mobile devices where physical RAM is very limited. An application was developed for activity recognition, which implements the proposed feature selection algorithm and an SVM training procedure. Experiments are carried out with the application running on a PDA for human activity recognition using accelerometer data. A comparison with an information gain based feature selection method demonstrates the effectiveness and efficiency of the proposed algorithm.
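To make the SSE-based scoring concrete, the sketch below runs a plain greedy forward selection on a linear-in-the-parameters model, using Gram-Schmidt deflation so that each step picks the candidate column that most reduces the SSE. This is a swapped-in, simpler technique, not the paper's two-stage algorithm, and the feature names and data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def forward_select(columns, y, n_select):
    """Greedy forward selection: at each step add the column that most reduces the SSE."""
    cols = {name: list(col) for name, col in columns.items()}
    residual = list(y)
    selected = []
    for _ in range(n_select):
        best_name, best_gain = None, 0.0
        for name, col in cols.items():
            energy = dot(col, col)
            if energy < 1e-12:  # column already explained by the selected ones
                continue
            gain = dot(col, residual) ** 2 / energy  # SSE reduction if this column is added
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break
        w = cols.pop(best_name)
        ww = dot(w, w)
        # Remove the component along w from the target and from the remaining columns.
        cy = dot(w, residual) / ww
        residual = [r - cy * a for r, a in zip(residual, w)]
        cols = {name: [b - (dot(w, col) / ww) * a for b, a in zip(col, w)]
                for name, col in cols.items()}
        selected.append(best_name)
    return selected, dot(residual, residual)

# Hypothetical data in which the target is exactly 2 * f1 - f3.
columns = {"f1": [1.0, 2.0, 3.0, 4.0, 5.0],
           "f2": [1.0, 0.0, 1.0, 0.0, 1.0],
           "f3": [0.0, 1.0, 0.0, 2.0, 1.0]}
y = [2.0, 3.0, 6.0, 6.0, 9.0]
selected, sse = forward_select(columns, y, n_select=2)
```

In the setting of the abstract, the selected feature names would then be used as the inputs to an SVM classifier.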
Abstract:
To improve the performance of classification using Support Vector Machines (SVMs) while reducing the model selection time, this paper introduces Differential Evolution, a heuristic method for model selection in two-class SVMs with an RBF kernel. The model selection method and related tuning algorithm are both presented. Experimental results from application to a selection of benchmark datasets for SVMs show that this method can produce an optimized classification in less time and with higher accuracy than a classical grid search. A comparison with a Particle Swarm Optimization (PSO) based alternative is also included.
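A minimal sketch of differential evolution over the two RBF-SVM hyperparameters is given below. It assumes the common DE/rand/1/bin variant and a search over (log2 C, log2 gamma); the population size, F, CR, and bounds are illustrative choices, not values from the paper. So that the block runs without a machine learning library, the objective is a smooth stand-in; in practice it would be the cross-validated error of an SVM trained with C = 2^x and gamma = 2^y.

```python
import random

def differential_evolution(objective, bounds, pop_size=20, generations=100,
                           F=0.7, CR=0.9, seed=0):
    """DE/rand/1/bin: mutate with a scaled difference vector, then binomial crossover."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated coordinate
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if rng.random() < CR or d == j_rand:
                    v = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # clip to the search bounds
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:  # greedy selection: keep the better vector
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

def surrogate_error(params):
    """Stand-in for cross-validated SVM error, minimal at log2 C = 3, log2 gamma = -5."""
    log2_c, log2_gamma = params
    return (log2_c - 3.0) ** 2 + (log2_gamma + 5.0) ** 2

bounds = [(-5.0, 15.0), (-15.0, 3.0)]  # assumed ranges for log2 C and log2 gamma
best_params, best_cost = differential_evolution(surrogate_error, bounds)
```

Unlike a grid search, the number of objective evaluations is set by the population size and generation count rather than by the grid resolution, which is the source of the time savings the abstract reports.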
Abstract:
Background
G protein-coupled receptors (GPCRs) constitute one of the largest groupings of eukaryotic proteins, and represent a particularly lucrative set of pharmaceutical targets. They play an important role in eukaryotic signal transduction and physiology, mediating cellular responses to a diverse range of extracellular stimuli. The phylum Platyhelminthes is of considerable medical and biological importance, housing major pathogens as well as established model organisms. The recent availability of genomic data for the human blood fluke Schistosoma mansoni and the model planarian Schmidtea mediterranea paves the way for the first comprehensive effort to identify and analyze GPCRs in this important phylum.
Results
Application of a novel transmembrane-oriented approach to receptor mining led to the discovery of 117 S. mansoni GPCRs, representing all of the major families; 105 Rhodopsin, 2 Glutamate, 3 Adhesion, 2 Secretin and 5 Frizzled. Similarly, 418 Rhodopsin, 9 Glutamate, 21 Adhesion, 1 Secretin and 11 Frizzled S. mediterranea receptors were identified. Among these, we report the identification of novel receptor groupings, including a large and highly-diverged Platyhelminth-specific Rhodopsin subfamily, a planarian-specific Adhesion-like family, and atypical Glutamate-like receptors. Phylogenetic analysis was carried out following extensive gene curation. Support vector machines (SVMs) were trained and used for ligand-based classification of full-length Rhodopsin GPCRs, complementing phylogenetic and homology-based classification.
Conclusions
Genome-wide investigation of GPCRs in two platyhelminth genomes reveals an extensive and complex receptor signaling repertoire with many unique features. This work provides important sequence and functional leads for understanding basic flatworm receptor biology, and sheds light on a lucrative set of anthelmintic drug targets.
Abstract:
The global increase in the penetration of renewable energy is pushing electrical power systems into uncharted territory, especially in terms of transient and dynamic stability. In particular, the greater penetration of wind generation in European power networks is, at times, displacing a significant capacity of conventional synchronous generation with fixed-speed induction generation and now more commonly, doubly-fed induction generators. The impact of such changes in the generation mix requires careful monitoring to assess the impact on transient and dynamic stability. This paper presents a measurement-based method for the early detection of power system oscillations, with attention to mode damping, in order to raise alarms and develop strategies to actively improve power system dynamic stability and security. A method is developed based on the wavelet transform and support vector data description (SVDD) to detect oscillation modes in wind farm output power, which may excite dynamic instabilities in the wider system. The wavelet transform is used as a filter to identify oscillations in different frequency bands, while SVDD is used to extract dominant features from different scales and generate an assessment boundary according to the extracted features. Poorly damped oscillations of a large magnitude or that are resonant can be alarmed to the system operator, to reduce the risk of system instability. The method is evaluated using real data from a chosen wind farm.
Abstract:
Classification methods with embedded feature selection capability are very appealing for the analysis of complex processes since they allow the analysis of root causes even when the number of input variables is high. In this work, we investigate the performance of three techniques for classification within a Monte Carlo strategy with the aim of root cause analysis. We consider the naive Bayes classifier and the logistic regression model with two different implementations for controlling model complexity, namely, a LASSO-like implementation with L1-norm regularization and a fully Bayesian implementation of the logistic model, the so-called relevance vector machine. Several challenges can arise when estimating such models, mainly linked to the characteristics of the data: a large number of input variables, high correlation among subsets of variables, the situation where the number of variables is higher than the number of available data points, and the case of unbalanced datasets. Using an ecological and a semiconductor manufacturing dataset, we show the advantages and drawbacks of each method, highlighting the superior performance in terms of classification accuracy of the relevance vector machine with respect to the other classifiers. Moreover, we show how the combination of the proposed techniques and the Monte Carlo approach can be used to get more robust insights into the problem under analysis when faced with challenging modelling conditions.