951 resultados para one class SVM


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Compared with conventional two-class learning schemes, one-class classification simply uses a single class for training purposes. Applying one-class classification to the minorities in an imbalanced data has been shown to achieve better performance than the two-class one. In this paper, in order to make the best use of all the available information during the learning procedure, we propose a general framework which first uses the minority class for training in the one-class classification stage; and then uses both minority and majority class for estimating the generalization performance of the constructed classifier. Based upon this generalization performance measurement, parameter search algorithm selects the best parameter settings for this classifier. Experiments on UCI and Reuters text data show that one-class SVM embedded in this framework achieves much better performance than the standard one-class SVM alone and other learning schemes, such as one-class Naive Bayes, one-class nearest neighbour and neural network.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Due to the huge growth of the World Wide Web, medical images are now available in large numbers in online repositories, and there exists the need to retrieval the images through automatically extracting visual information of the medical images, which is commonly known as content-based image retrieval (CBIR). Since each feature extracted from images just characterizes certain aspect of image content, multiple features are necessarily employed to improve the retrieval performance. Meanwhile, experiments demonstrate that a special feature is not equally important for different image queries. Most of existed feature fusion methods for image retrieval only utilize query independent feature fusion or rely on explicit user weighting. In this paper, we present a novel query dependent feature fusion method for medical image retrieval based on one class support vector machine. Having considered that a special feature is not equally important for different image queries, the proposed query dependent feature fusion method can learn different feature fusion models for different image queries only based on multiply image samples provided by the user, and the learned feature fusion models can reflect the different importances of a special feature for different image queries. The experimental results on the IRMA medical image collection demonstrate that the proposed method can improve the retrieval performance effectively and can outperform existed feature fusion methods for image retrieval.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the development of the internet, medical images are now available in large numbers in online repositories, and there exists the need to retrieval the medical images in the content-based ways through automatically extracting visual information of the medical images. Since a single feature extracted from images just characterizes certain aspect of image content, multiple features are necessarily employed to improve the retrieval performance. Furthermore, a special feature is not equally important for different image queries since a special feature has different importance in reflecting the content of different images. However, most existed feature fusion methods for image retrieval only utilize query independent feature fusion or rely on explicit user weighting. In this paper, based on multiply query samples provided by the user, we present a novel query dependent feature fusion method for medical image retrieval based on one class support vector machine. The proposed query dependent feature fusion method for medical image retrieval can learn different feature fusion models for different image queries, and the learned feature fusion models can reflect the different importance of a special feature for different image queries. The experimental results on the IRMA medical image collection demonstrate that the proposed method can improve the retrieval performance effectively and can outperform existed feature fusion methods for image retrieval.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The social media classification problems draw more and more attention in the past few years. With the rapid development of Internet and the popularity of computers, there is astronomical amount of information in the social network (social media platforms). The datasets are generally large scale and are often corrupted by noise. The presence of noise in training set has strong impact on the performance of supervised learning (classification) techniques. A budget-driven One-class SVM approach is presented in this thesis that is suitable for large scale social media data classification. Our approach is based on an existing online One-class SVM learning algorithm, referred as STOCS (Self-Tuning One-Class SVM) algorithm. To justify our choice, we first analyze the noise-resilient ability of STOCS using synthetic data. The experiments suggest that STOCS is more robust against label noise than several other existing approaches. Next, to handle big data classification problem for social media data, we introduce several budget driven features, which allow the algorithm to be trained within limited time and under limited memory requirement. Besides, the resulting algorithm can be easily adapted to changes in dynamic data with minimal computational cost. Compared with two state-of-the-art approaches, Lib-Linear and kNN, our approach is shown to be competitive with lower requirements of memory and time.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

High-dimensional problem domains pose significant challenges for anomaly detection. The presence of irrelevant features can conceal the presence of anomalies. This problem, known as the '. curse of dimensionality', is an obstacle for many anomaly detection techniques. Building a robust anomaly detection model for use in high-dimensional spaces requires the combination of an unsupervised feature extractor and an anomaly detector. While one-class support vector machines are effective at producing decision surfaces from well-behaved feature vectors, they can be inefficient at modelling the variation in large, high-dimensional datasets. Architectures such as deep belief networks (DBNs) are a promising technique for learning robust features. We present a hybrid model where an unsupervised DBN is trained to extract generic underlying features, and a one-class SVM is trained from the features learned by the DBN. Since a linear kernel can be substituted for nonlinear ones in our hybrid model without loss of accuracy, our model is scalable and computationally efficient. The experimental results show that our proposed model yields comparable anomaly detection performance with a deep autoencoder, while reducing its training and testing time by a factor of 3 and 1000, respectively.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Novelty detection arises as an important learning task in several applications. Kernel-based approach to novelty detection has been widely used due to its theoretical rigor and elegance of geometric interpretation. However, computational complexity is a major obstacle in this approach. In this paper, leveraging on the cutting-plane framework with the well-known One-Class Support Vector Machine, we present a new solution that can scale up seamlessly with data. The first solution is exact and linear when viewed through the cutting-plane; the second employed a sampling strategy that remarkably has a constant computational complexity defined relatively to the probability of approximation accuracy. Several datasets are benchmarked to demonstrate the credibility of our framework.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Identifying unusual or anomalous patterns in an underlying dataset is an important but challenging task in many applications. The focus of the unsupervised anomaly detection literature has mostly been on vectorised data. However, many applications are more naturally described using higher-order tensor representations. Approaches that vectorise tensorial data can destroy the structural information encoded in the high-dimensional space, and lead to the problem of the curse of dimensionality. In this paper we present the first unsupervised tensorial anomaly detection method, along with a randomised version of our method. Our anomaly detection method, the One-class Support Tensor Machine (1STM), is a generalisation of conventional one-class Support Vector Machines to higher-order spaces. 1STM preserves the multiway structure of tensor data, while achieving significant improvement in accuracy and efficiency over conventional vectorised methods. We then leverage the theory of nonlinear random projections to propose the Randomised 1STM (R1STM). Our empirical analysis on several real and synthetic datasets shows that our R1STM algorithm delivers comparable or better accuracy to a state-of-the-art deep learning method and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the problem of one-class classification (OCC) one of the classes, the target class, has to be distinguished from all other possible objects, considered as nontargets. In many biomedical problems this situation arises, for example, in diagnosis, image based tumor recognition or analysis of electrocardiogram data. In this paper an approach to OCC based on a typicality test is experimentally compared with reference state-of-the-art OCC techniques-Gaussian, mixture of Gaussians, naive Parzen, Parzen, and support vector data description-using biomedical data sets. We evaluate the ability of the procedures using twelve experimental data sets with not necessarily continuous data. As there are few benchmark data sets for one-class classification, all data sets considered in the evaluation have multiple classes. Each class in turn is considered as the target class and the units in the other classes are considered as new units to be classified. The results of the comparison show the good performance of the typicality approach, which is available for high dimensional data; it is worth mentioning that it can be used for any kind of data (continuous, discrete, or nominal), whereas state-of-the-art approaches application is not straightforward when nominal variables are present.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In content-based image retrieval, learning from users’ feedback can be considered as an one-class classification problem. However, the OCIB method proposed in [1] suffers from the problem that it is only a one-mode method which cannot deal with multiple interest regions. In addition, it requires a pre-specified radius which is usually unavailable in real world applications. This paper overcomes these two problems by introducing ensemble learning into the OCIB method: by Bagging, we can construct a group of one-class classifiers which emphasize various parts of the data set; this is followed by a rank aggregating with which results from different parameter settings are incorporated into a single final ranking list. The experimental results show that the proposed I-OCIB method outperforms the OCIB for image retrieval applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Compared with conventional two-class learning schemes, one-class classification simply uses a single class in the classifier training phase. Applying one-class classification to learn from unbalanced data set is regarded as the recognition based learning and has shown to have the potential of achieving better performance. Similar to twoclass learning, parameter selection is a significant issue, especially when the classifier is sensitive to the parameters. For one-class learning scheme with the kernel function, such as one-class Support Vector Machine and Support Vector Data Description, besides the parameters involved in the kernel, there is another one-class specific parameter: the rejection rate v. In this paper, we proposed a general framework to involve the majority class in solving the parameter selection problem. In this framework, we first use the minority target class for training in the one-class classification stage; then we use both minority and majority class for estimating the generalization performance of the constructed classifier. This generalization performance is set as the optimization criteria. We employed the Grid search and Experiment Design search to attain various parameter settings. Experiments on UCI and Reuters text data show that the parameter optimized one-class classifiers outperform all the standard one-class learning schemes we examined.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The thesis investigates various machine learning approaches to reducing data dimensionality, and studies the impact of asymmetric data on learning in image retrieval. Efficient algorithms are proposed to reduce the data dimensionality. Integration strategies for one-class classification are designed to address asymmetric data issue and improve retrieval effectiveness.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Rapid growth of technical developments has created huge challenges for microphone forensics - a subcategory of audio forensic science, because of the availability of numerous digital recording devices and massive amount of recording data. Demand for fast and efficient methods to assure integrity and authenticity of information is becoming more and more important in criminal investigation nowadays. Machine learning has emerged as an important technique to support audio analysis processes of microphone forensic practitioners. However, its application to real life situations using supervised learning is still facing great challenges due to expensiveness in collecting data and updating system. In this paper, we introduce a new machine learning approach which is called One-class Classification (OCC) to be applied to microphone forensics; we demonstrate its capability on a corpus of audio samples collected from several microphones. Research results and analysis indicate that OCC has the potential to benefit microphone forensic practitioners in developing new tools and techniques for effective and efficient analysis.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Rapid growth of technical developments has created huge challenges for microphone forensics - a sub-category of audio forensic science, because of the availability of numerous digital recording devices and massive amount of recording data. Demand for fast and efficient methods to assure integrity and authenticity of information is becoming more and more important in criminal investigation nowadays. Machine learning has emerged as an important technique to support audio analysis processes of microphone forensic practitioners. However, its application to real life situations using supervised learning is still facing great challenges due to expensiveness in collecting data and updating system. In this paper, we introduce a new machine learning approach which is called One-class Classification (OCC) to be applied to microphone forensics; we demonstrate its capability on a corpus of audio samples collected from several microphones. In addition, we propose a representative instance classification framework (RICF) that can effectively improve performance of OCC algorithms for recording signal with noise. Experiment results and analysis indicate that OCC has the potential to benefit microphone forensic practitioners in developing new tools and techniques for effective and efficient analysis.