131 results for Typological Classification


Relevance: 20.00%

Abstract:

Because traditional port-based methods are increasingly unreliable, Internet traffic classification has attracted considerable research effort in recent years. Many previous papers have focused on using statistical characteristics as discriminators and applying machine learning techniques to classify traffic flows. In this paper, we propose a novel machine-learning-based approach in which the features are extracted from the packet payload instead of flow statistics. Specifically, every flow is represented by a feature vector in which each item indicates the occurrence of a particular token, i.e., a common substring, in the payload. We have applied various machine learning algorithms to evaluate the idea and used different feature selection schemes to identify the critical tokens. Experimental results on a real-world traffic data set show that the approach can achieve high accuracy with low overhead.
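The payload-based representation described above can be sketched as follows. This is only an illustration: the token vocabulary and the sample payloads are hypothetical, and the abstract does not say how the common substrings are actually mined or selected.

```python
from typing import Dict, List, Sequence


def token_feature_vector(payload: bytes, tokens: Sequence[bytes]) -> List[int]:
    """Return a binary vector with 1 where the token occurs in the payload."""
    return [1 if tok in payload else 0 for tok in tokens]


# Hypothetical token vocabulary (assumed for illustration only).
TOKENS: List[bytes] = [b"GET ", b"HTTP/1.1", b"SSH-", b"220 "]

# Hypothetical flow payloads keyed by a made-up flow name.
flows: Dict[str, bytes] = {
    "web": b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n",
    "ssh": b"SSH-2.0-OpenSSH_8.9\r\n",
}

# One feature vector per flow; these would be the classifier inputs.
vectors = {name: token_feature_vector(p, TOKENS) for name, p in flows.items()}
```

Each vector has one position per token, so a feature selection step would amount to dropping positions whose tokens carry little information about the class.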

Relevance: 20.00%

Abstract:

The reliability of an induced classifier can be affected by several factors, including data-oriented factors and algorithm-oriented factors [3]. In some cases, the reliability can also be affected by knowledge-oriented factors. In this chapter, we analyze three special cases to examine the reliability of the discovered knowledge. Our case study results show that (1) when mining from low-quality data, the rough classification approach, which in general tolerates low-quality data, is more reliable than the exact approach; (2) without a sufficiently large data set, the reliability of the discovered knowledge decreases accordingly; and (3) the point learning approach can easily be misled by noisy data, in most cases generating an unreliable interval and thus reducing the reliability of the discovered knowledge. The study also reveals that the inexact field is a good learning strategy that can model the potentials and improve discovery reliability.

Relevance: 20.00%

Abstract:

This paper presents a human daily activity classification approach based on sensory data collected from a single tri-axial accelerometer worn on a waist belt. The classification algorithm distinguishes 6 different activities (standing, jumping, sitting down, walking, running and falling) through three major steps: wavelet transformation, Principal Component Analysis (PCA)-based dimensionality reduction, and a radial basis function (RBF) kernel Support Vector Machine (SVM) classifier. Two trials were conducted to evaluate different aspects of the classification scheme. In the first trial, the classifier was trained and evaluated on a dataset of 420 samples collected from seven subjects, using a k-fold cross-validation method. The parameters σ and c of the RBF kernel were optimized through automatic searching to yield the highest recognition accuracy and robustness. In the second trial, the generalization capability of the classifier was validated on a dataset collected from six new subjects. Average classification rates of 95% and 93% were obtained in trials 1 and 2, respectively. The results of trial 2 show that the system is also good at classifying activity signals of new subjects. It can be concluded that the combination of single-accelerometer sensing, the chosen accelerometer placement and an efficient classifier makes this wearable sensing system more realistic and more comfortable for long-term human activity monitoring and classification in ambulatory environments, and therefore more acceptable to users.
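Two ingredients of the evaluation described above, the RBF kernel and k-fold cross-validation, can be sketched in plain Python. The kernel parameterization with σ below is a common convention and is assumed here, as is the choice of k = 10; the abstract states neither. The wavelet and PCA steps are omitted.

```python
import math
from typing import List, Sequence, Tuple


def rbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """RBF kernel exp(-||x - y||^2 / (2 * sigma^2)); parameterization assumed."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k (train, test) pairs using interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# 420 samples as in trial 1; k = 10 is an assumption for illustration.
splits = k_fold_indices(420, 10)
```

A parameter search over σ and c would train one classifier per (train, test) pair for each candidate setting and keep the setting with the best average accuracy.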

Relevance: 20.00%

Abstract:

This thesis develops effective and efficient methodologies that can be applied to continuously improve the performance of detection and classification on malware collected over an extended period of time. The robustness of the proposed methodologies has been tested on malware collected from 2003 to 2010.

Relevance: 20.00%

Abstract:

Max-plus algebras and more general semirings have many useful applications and have been actively investigated. On the other hand, structural matrix rings are also well known and have been considered by many authors. The main theorem of this article completely describes all optimal ideals in the more general structural matrix semirings. Originally, our investigation of these ideals was motivated by applications in data mining for the design of centroid-based classification systems, as well as for the design of multiple classification systems combining several individual classifiers.
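For readers unfamiliar with the max-plus semiring mentioned above, the following minimal sketch shows matrix multiplication in which addition is replaced by `max` and multiplication by `+`. It illustrates the semiring only, not the article's theorem on ideals; the matrices are made-up examples.

```python
from typing import List

NEG_INF = float("-inf")  # additive identity ("zero") of the max-plus semiring

Matrix = List[List[float]]


def maxplus_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Matrix product over the max-plus semiring: (max, +) instead of (+, *)."""
    inner = len(b)
    cols = len(b[0])
    return [
        [max(row[k] + b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


# Hypothetical example matrices.
A: Matrix = [[0, 3], [NEG_INF, 1]]
B: Matrix = [[2, NEG_INF], [0, 4]]
C = maxplus_matmul(A, B)
```

Entry `C[i][j]` is the largest value of `A[i][k] + B[k][j]` over all `k`, which is why max-plus products are often read as best-path weights.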

Relevance: 20.00%

Abstract:

While the primary purpose of edge detection schemes is to produce an edge map of a given image, the ability to distinguish between different feature types is also important. In this paper we examine feature classification based on local energy detection and show that local energy measures are intrinsically capable of making this classification because of the use of odd and even filters. The advantage of feature classification is that it allows certain feature types to be eliminated from the edge map, thus simplifying the task of object recognition.

Relevance: 20.00%

Abstract:

The demand for various multimedia applications is rapidly increasing due to recent advances in computing and network infrastructure, together with the widespread use of digital video technology. Among the key elements for the success of these applications is how to effectively and efficiently manage and store a huge amount of audiovisual information, while at the same time providing user-friendly access to the stored data. This has fueled a quickly evolving research area known as video abstraction. As the name implies, video abstraction is a mechanism for generating a short summary of a video, which can be either a sequence of stationary images (keyframes) or moving images (video skims). In terms of browsing and navigation, a good video abstract enables the user to gain maximum information about the target video sequence within a specified time constraint, or sufficient information in the minimum time. Over the past years, various ideas and techniques have been proposed towards the effective abstraction of video contents. The purpose of this article is to provide a systematic classification of these works. We identify and detail, for each approach, the underlying components and how they are addressed in specific works. © 2007 ACM.

Relevance: 20.00%

Abstract:

The majority of multi-class pattern classification techniques are proposed for learning from balanced datasets. However, in several real-world domains, the datasets have an imbalanced data distribution, where some classes may have few training examples compared to other classes. In this paper we present our research on learning from imbalanced multi-class data and propose a new approach, named Multi-IM, to deal with this problem. Multi-IM derives its fundamentals from the probabilistic relational technique (PRMs-IM), designed for learning from imbalanced relational data for the two-class problem. Multi-IM extends PRMs-IM to a generalized framework for multi-class imbalanced learning for both relational and non-relational domains.

Relevance: 20.00%

Abstract:

Learning a robust projection with a small number of training samples is still a challenging problem in face recognition, especially when the unseen faces have extreme variation in pose, illumination, and facial expression. To address this problem, we propose a framework formulated under statistical learning theory that facilitates robust learning of a discriminative projection. Dimensionality reduction using the projection matrix is combined with a linear classifier in the regularized framework of lasso regression. The projection matrix and the classifier parameters are then found by solving an optimization problem over the Stiefel manifold. The experimental results on standard face databases suggest that the proposed method outperforms some recent regularized techniques when the number of training samples is small.

Relevance: 20.00%

Abstract:

Gait classification is a developing research area, particularly with regard to biometrics. It aims to use the distinctive spatial and temporal characteristics of human motion to classify differing activities. As a biometric, this extends to recognising different people by the heterogeneous aspects of their gait. This research aims to use a modified deformable model, the temporal PDM, to distinguish the movements of a walking and miming person. The movement of 2D points on the moving form is used to provide input into the model and classify the type of gait present.

Relevance: 20.00%

Abstract:

In statistical classification work, one method of speeding up the process is to use only a small percentage of the total parameter set available. In this paper, we apply this technique both to the classification of malware and to the identification of malware within a set combined with cleanware. To demonstrate the usefulness of our method, we use the same sets of malware and cleanware as in an earlier paper. Using the statistical technique Information Gain (IG), we reduce the set of features used in the experiment from 7,605 to just over 1,000. The best accuracy obtained in the earlier paper using 7,605 features is 97.3% for malware versus cleanware detection and 97.4% for malware family classification; on the reduced feature set, we obtain a best accuracy of 94.6% on the malware versus cleanware test and 94.5% on the malware classification test. An interesting feature of the new tests presented here is the reduction in false negative rates by a factor of about 1/3 when compared with the results of the earlier paper. In addition, the running time of our tests is reduced by a factor of approximately 3/5 relative to the times reported in the original paper. The small loss in accuracy and the improved false negative rate, along with the significant improvement in speed, indicate that feature reduction should be further pursued as a tool to prevent algorithms from becoming intractable due to too much data.
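Feature reduction by Information Gain, as described above, can be sketched as follows. This is a generic textbook formulation for discrete features, not the authors' implementation; the feature names and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """IG = H(labels) - H(labels | feature) for one discrete feature."""
    n = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(features: Dict[str, Sequence[Hashable]],
                 labels: Sequence[Hashable], k: int) -> List[str]:
    """Return the names of the k features with the highest information gain."""
    scores = {name: information_gain(col, labels) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: four samples, three hypothetical binary features.
labels = ["malware", "malware", "cleanware", "cleanware"]
features = {
    "f_perfect": [1, 1, 0, 0],  # separates the two classes exactly
    "f_noise": [1, 0, 1, 0],    # independent of the class
    "f_partial": [1, 1, 1, 0],  # partially informative
}
kept = select_top_k(features, labels, 2)
```

Ranking all features by this score and keeping only the top-scoring ones is the kind of reduction the abstract reports, there from 7,605 features to just over 1,000.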