178 results for machine learning


Relevance:

60.00%

Publisher:

Abstract:

This paper presents the development of a keystroke dynamics-based user authentication system using the ARTMAP-FD neural network. The effectiveness of ARTMAP-FD in classifying keystroke patterns is analyzed and compared against a number of widely used machine learning systems. The results show that ARTMAP-FD performs well against many of its counterparts in keystroke pattern classification. In addition, instead of relying only on the conventional typing timing characteristics, the applicability of typing pressure to ascertaining a user's identity is investigated. The experimental results show that combining both latency and pressure patterns can improve the Equal Error Rate (EER) of the system.
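The Equal Error Rate is the operating point at which the false acceptance rate and the false rejection rate coincide. The abstract does not say how scores are produced, so the following is only a minimal sketch: it assumes each authentication attempt yields a similarity score where higher means "more likely the genuine user", and the function name and sample scores are hypothetical.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: an attempt is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: genuine attempts scoring below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: impostor attempts scoring at or above it.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Made-up scores for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine, impostor)
```

A lower EER means the genuine and impostor score distributions overlap less, which is the sense in which combining latency and pressure features would "improve" it.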

Relevance:

60.00%

Publisher:

Abstract:

In named entity recognition (NER) for biomedical literature, approaches based on combined classifiers have demonstrated great performance improvement compared to a single (best) classifier. This is mainly owing to a sufficient level of diversity among the classifiers, which is a selective property of the classifier set. Given a large number of classifiers, how to select different classifiers to put into a classifier ensemble is a crucial issue in multiple classifier-ensemble design. With this observation in mind, we propose a generic genetic classifier-ensemble method for classifier selection in biomedical NER. Various diversity measures and majority voting are considered, and disjoint feature subsets are selected to construct individual classifiers. A basic type of individual classifier, the Support Vector Machine (SVM) classifier, is adopted to form an SVM-classifier committee. A multi-objective genetic algorithm (GA) is employed as the classifier selector to help the ensemble classifier improve the overall sample classification accuracy. The proposed approach is tested on the benchmark dataset, the GENIA version 3.02 corpus, and compared with both the individual best SVM classifier and the SVM-classifier ensemble algorithm, as well as other machine learning methods such as CRF, HMM and MEMM. The results show that the proposed approach outperforms other classification algorithms and can be a useful method for the biomedical NER problem.
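Two ingredients named in this abstract, majority voting and a diversity measure, can be sketched briefly. The abstract does not say which diversity measures are used, so the pairwise disagreement measure below is an assumption chosen for illustration, and the genetic selection step is not shown. All function names and labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Combine per-classifier label lists into one label list.

    predictions: one list of labels per classifier, all the same length.
    Ties go to the label seen first among the tied ones.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def disagreement(preds_a, preds_b):
    """Fraction of samples on which two classifiers output different labels."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def mean_pairwise_disagreement(predictions):
    """Average disagreement over all classifier pairs (higher = more diverse)."""
    pairs = list(combinations(predictions, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Toy token labels from three hypothetical classifiers.
p1 = ["B", "O", "O", "B"]
p2 = ["B", "O", "B", "O"]
p3 = ["O", "O", "B", "B"]
ensemble_labels = majority_vote([p1, p2, p3])
diversity = mean_pairwise_disagreement([p1, p2, p3])
```

A selector such as a GA could then score candidate subsets of classifiers by a combination of ensemble accuracy and a diversity value like this one.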

Relevance:

60.00%

Publisher:

Abstract:

Due to the limitations of the traditional port-based and payload-based traffic classification approaches, the past decade has seen extensive work on utilizing machine learning techniques to classify network traffic based on packet-level and flow-level features. In particular, previous studies have shown that the unsupervised clustering approach is both accurate and capable of discovering previously unknown application classes. In this paper, we explore the utility of side information in the process of traffic clustering. Specifically, we focus on the flow correlation information that can be efficiently extracted from packet headers and expressed as instance-level constraints, which indicate that particular sets of flows are using the same application and thus should be put into the same cluster. To incorporate the constraints, we propose a modified constrained K-Means algorithm. A variety of real-world traffic traces are used to show that the constraints are widely available. The experimental results indicate that the constrained approach not only improves the quality of the resulting clusters, but also speeds up the convergence of the clustering process.
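The abstract does not describe how the K-Means algorithm is modified, so the following is only a sketch of one common way to honour "same cluster" constraints: each constrained set of flows is assigned as a unit to the centroid with the smallest total squared distance. The function name, the representation of constraints as index groups, and the sample points are all assumptions.

```python
def constrained_kmeans(points, groups, centroids, iterations=20):
    """K-Means where each group of point indices must share one cluster.

    points:    list of feature tuples (one per flow)
    groups:    list of index lists; every point index appears in exactly one
               group (an unconstrained flow is a group of size one)
    centroids: initial centroids, one tuple per cluster
    """
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centroids = [tuple(c) for c in centroids]
    assignment = [None] * len(points)
    for _ in range(iterations):
        new_assignment = list(assignment)
        for group in groups:
            # Pick the centroid that is closest to the group as a whole.
            best = min(
                range(len(centroids)),
                key=lambda k: sum(sqdist(points[i], centroids[k]) for i in group),
            )
            for i in group:
                new_assignment[i] = best
        if new_assignment == assignment:
            break  # converged: no group changed cluster
        assignment = new_assignment
        # Standard centroid update: mean of the points in each cluster.
        for k in range(len(centroids)):
            members = [points[i] for i, a in enumerate(assignment) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids


# Made-up 2-D flow features; flows 0, 1 and 2 are constrained together.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
groups = [[0, 1, 2], [3], [4]]
assignment, centroids = constrained_kmeans(points, groups, [(0.0, 0.0), (10.0, 10.0)])
```

Because whole groups move together, there are fewer independent assignment decisions per iteration, which is one intuition for why constraints might speed up convergence.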

Relevance:

60.00%

Publisher:

Abstract:

We present a system to detect parked vehicles in a typical parking complex using multiple streams of images captured through IP-connected devices. Compared to traditional object detection techniques and machine learning methods, our approach is significantly faster in detection speed in the presence of multiple image streams. It is also capable of comparable accuracy when put to the test against existing methods. This is achieved without the training that machine learning methods require. Our approach uses a combination of psychological insights obtained from human detection and an algorithm replicating the outcomes of an SVM learner but without the noise that compromises accuracy in the normal learning process. Performance enhancements are made to the algorithm so that it operates well in the context of multiple image streams. The result is faster detection with comparable accuracy. Our experiments on images captured from a local test site show very promising results for an implementation that is not only effective and low cost but also opens doors to new parking applications when combined with other technologies.

Relevance:

60.00%

Publisher:

Abstract:

One of the issues associated with pattern classification using data-based machine learning systems is the “curse of dimensionality”. In this paper, the circle-segments method is proposed as a feature selection method to identify important input features before the entire data set is provided for learning with machine learning systems. Specifically, four machine learning systems are deployed for classification, viz. Multilayer Perceptron (MLP), Support Vector Machine (SVM), Fuzzy ARTMAP (FAM), and k-Nearest Neighbour (kNN). The integration of the circle-segments method and the machine learning systems has been applied to two case studies comprising one benchmark data set and one real data set. Overall, the results after feature selection using the circle-segments method demonstrate improvements in performance even with more than 50% of the input features eliminated from the original data sets.

Relevance:

60.00%

Publisher:

Abstract:

Traffic classification has wide applications in network management, from security monitoring to quality of service measurements. Recent research tends to apply machine learning techniques to classification methods based on flow statistical features. The nearest neighbor (NN)-based method has exhibited superior classification performance. It also has several important advantages, such as no requirement for a training procedure, no risk of overfitting of parameters, and a natural ability to handle a huge number of classes. However, the performance of the NN classifier can be severely affected if the size of the training data is small. In this paper, we propose a novel nonparametric approach for traffic classification, which can improve the classification performance effectively by incorporating correlated information into the classification process. We analyze the new classification approach and its performance benefit from both theoretical and empirical perspectives. A large number of experiments are carried out on two real-world traffic data sets to validate the proposed approach. The results show that the traffic classification performance can be improved significantly even under the extremely difficult circumstance of very few training samples.
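The abstract does not specify how correlated information enters the classifier. As a rough illustration only, the sketch below assumes that flows known to be correlated are collected into a "bag", each flow is labelled by its nearest training sample, and the bag takes the majority label. This is one plausible reading, not necessarily the authors' method; the names and feature values are hypothetical.

```python
from collections import Counter


def nearest_label(flow, training):
    """1-NN: return the label of the training sample closest to `flow`.

    training: list of (feature_tuple, label) pairs.
    """
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return min(training, key=lambda item: sqdist(flow, item[0]))[1]


def classify_bag(bag, training):
    """Label a bag of correlated flows by majority vote over 1-NN labels."""
    votes = Counter(nearest_label(flow, training) for flow in bag)
    return votes.most_common(1)[0][0]


# A deliberately tiny training set, with made-up flow features.
training = [((1.0, 1.0), "web"), ((5.0, 5.0), "p2p")]
bag = [(1.2, 0.9), (2.0, 2.0), (3.4, 3.4)]
bag_label = classify_bag(bag, training)
```

In this toy case the third flow on its own is nearest to the "p2p" sample, but the bag as a whole is labelled "web", which shows how correlation can correct individual errors when training data is scarce.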

Relevance:

60.00%

Publisher:

Abstract:

Computational Intelligence (CI) models comprise robust computing methodologies with a high level of machine learning quotient. CI models, in general, are useful for designing computerized intelligent systems/machines that possess useful characteristics mimicking human behaviors and capabilities in solving complex tasks, e.g., learning, adaptation, and evolution. Examples of popular CI models include fuzzy systems, artificial neural networks, evolutionary algorithms, multi-agent systems, decision trees, rough set theory, knowledge-based systems, and hybrids of these models. This special issue highlights how different computational intelligence models, coupled with other complementary techniques, can be used to handle problems encountered in image processing and information reasoning.

Relevance:

60.00%

Publisher:

Abstract:

Traffic classification is an essential tool for network and system security in complex environments such as cloud computing-based environments. State-of-the-art traffic classification methods aim to take advantage of flow statistical features and machine learning techniques; however, the classification performance is severely affected by limited supervised information and unknown applications. To achieve effective network traffic classification, we propose a new method to tackle the problem of unknown applications in the crucial situation of a small supervised training set. The proposed method is capable of detecting unknown flows generated by unknown applications and of utilizing the correlation information among real-world network traffic to boost the classification performance. A theoretical analysis is provided to confirm the performance benefit of the proposed method. Moreover, a comprehensive performance evaluation conducted on two real-world network traffic datasets shows that the proposed scheme outperforms the existing methods in this critical network environment.

Relevance:

60.00%

Publisher:

Abstract:

A fundamental task in pervasive computing is the reliable acquisition of contexts from sensor data. This is crucial to the operation of smart pervasive systems and services so that they can behave efficiently and appropriately in a given context. Simple forms of context can often be extracted directly from raw data. Equally important, or more so, are the hidden contexts and patterns buried inside the data, which are more challenging to discover. Most existing approaches borrow methods and techniques from machine learning, predominantly employing parametric unsupervised learning and clustering techniques. Being parametric, these methods have a severe drawback: the number of latent patterns must be specified in advance. In this paper, we explore the use of Bayesian nonparametric methods, a recent data modelling framework in machine learning, to infer latent patterns from sensor data acquired in a pervasive setting. Under this formalism, nonparametric prior distributions are used for the data generative process, and thus they allow the number of latent patterns to be learned automatically and to grow with the data: as more data comes in, the model complexity can grow to explain new and unseen patterns. In particular, we make use of hierarchical Dirichlet processes (HDP) to infer atomic activities and interaction patterns from honest signals collected from sociometric badges. We show how data from these sensors can be represented and learned with HDP. We illustrate insights into atomic patterns learned by the model and use them to achieve high-performance clustering. We also demonstrate the framework on the popular Reality Mining dataset, illustrating the ability of the model to automatically infer typical social groups in this dataset. Finally, our framework is generic and applicable to a much wider range of problems in pervasive computing where one needs to infer high-level, latent patterns and contexts from sensor data.
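The claim that the number of latent patterns can grow with the data can be illustrated with the Chinese restaurant process, a standard construction of the Dirichlet process prior. This is a much simpler, single-level relative of the HDP used in the paper, shown here only to convey the idea; the function name, the concentration value and the seed are assumptions.

```python
import random


def crp_table_counts(num_customers, alpha, seed=0):
    """Simulate a Chinese restaurant process.

    Returns a list whose n-th entry is the number of occupied tables
    (clusters) after n + 1 customers (data points) have been seated.
    """
    rng = random.Random(seed)
    tables = []   # tables[k] = number of customers at table k
    history = []
    for n in range(num_customers):
        # Existing table k is chosen with probability tables[k] / (n + alpha),
        # a new table with probability alpha / (n + alpha).
        r = rng.uniform(0, n + alpha)
        cumulative = 0.0
        for k, size in enumerate(tables):
            cumulative += size
            if r < cumulative:
                tables[k] += 1
                break
        else:
            tables.append(1)  # open a new table, i.e. a new latent pattern
        history.append(len(tables))
    return history


history = crp_table_counts(200, alpha=1.0)
```

The number of tables is never fixed in advance: it starts at one and can only increase as more customers arrive, at a rate governed by `alpha`.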

Relevance:

60.00%

Publisher:

Abstract:

This article describes the utilisation of an unsupervised machine learning technique and statistical approaches (e.g., the Kolmogorov-Smirnov test) that assist cycling experts in the crucial decision-making processes for athlete selection, training, and strategic planning in the track cycling Omnium. The Omnium is a multi-event competition that will be included in the summer Olympic Games for the first time in 2012. Presently, selectors and cycling coaches make decisions based on experience and intuition. They rarely have access to objective data. We analysed both the old five-event (first raced internationally in 2007) and new six-event (first raced internationally in 2011) Omniums and found that the addition of the elimination race component to the Omnium has, contrary to expectations, not favoured track endurance riders. We analysed the Omnium data and also determined the inter-relationships between different individual events as well as between those events and the final standings of riders. In further analysis, we found that there is no maximum ranking (poorest performance) in each individual event that riders can afford whilst still winning a medal. We also found the required times for riders to finish the timed components that are necessary for medal winning. The results of this study consider the scoring system of the Omnium and inform decision-making toward successful participation in future major Omnium competitions.
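For readers unfamiliar with the Kolmogorov-Smirnov test mentioned above, its two-sample statistic is the largest gap between the empirical distribution functions of two samples. The sketch below computes only that statistic, not a p-value, and the abstract does not say which data the test was applied to; the function name and the sample values are made up.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the two empirical
    cumulative distribution functions, evaluated at every observed value.
    """
    def ecdf(sample, x):
        # Fraction of the sample that is less than or equal to x.
        return sum(v <= x for v in sample) / len(sample)

    values = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in values)


# Hypothetical event rankings for two groups of riders.
group_a = [1, 2, 3, 4]
group_b = [3, 4, 5, 6]
statistic = ks_statistic(group_a, group_b)
```

A statistic near 0 suggests the two samples come from similar distributions, while a value near 1 indicates that they barely overlap.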

Relevance:

60.00%

Publisher:

Abstract:

Creating a set of neural network (NN) models in an ensemble and accumulating them can achieve better overview capability compared to a single neural network. Neural network ensembles are designed to provide solutions to particular problems. Many researchers and academicians have adopted this NN ensemble technique, especially in machine learning, and it has been applied in various fields of engineering, medicine and information technology. This paper presents a robust aggregation methodology for load demand forecasting based on Bayesian Model Averaging (BMA) of a set of neural network models in an ensemble. It estimates a vector of coefficients for the individual NN models' forecasts using a validation data set. These coefficients, also known as weights, are equal to the posterior probabilities of the models generating the forecasts. The BMA weights are then used to combine the forecasts generated by the NN models on a test data set. A comparison of the Bayesian results with the Simple Averaging method shows that benefits are obtained by using an advanced method like BMA for forecast combination.
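The weighting step can be sketched as follows. The abstract states only that the weights equal the models' posterior probabilities, so the details here are assumptions: equal prior probabilities for all models and a Gaussian error model with a fixed, user-supplied standard deviation `sigma`. Function names and numbers are hypothetical.

```python
import math


def bma_weights(validation_actual, validation_forecasts, sigma=1.0):
    """Posterior model probabilities under equal priors and Gaussian errors.

    validation_forecasts: one list of forecasts per model, each aligned
    with validation_actual.
    """
    log_liks = []
    for forecasts in validation_forecasts:
        sse = sum((f - y) ** 2 for f, y in zip(forecasts, validation_actual))
        # Gaussian log-likelihood up to a constant shared by all models.
        log_liks.append(-sse / (2 * sigma ** 2))
    peak = max(log_liks)  # subtract the maximum for numerical stability
    unnormalised = [math.exp(ll - peak) for ll in log_liks]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def combine(test_forecasts, weights):
    """Weighted sum of the models' forecasts at each test time step."""
    return [sum(w * f for w, f in zip(weights, step)) for step in zip(*test_forecasts)]


# Made-up load values: model A fits the validation data better than model B.
weights = bma_weights([10.0, 12.0], [[10.0, 12.0], [11.0, 13.0]])
combined = combine([[20.0], [22.0]], weights)
```

Simple averaging would give each model a weight of 0.5 regardless of validation performance, whereas the weights above favour the model with the smaller validation error.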

Relevance:

60.00%

Publisher:

Abstract:

Social networks have become a convenient and effective means of communication in recent years. Many people use social networks to communicate, lead and manage activities, and express their opinions in supporting or opposing different causes. This has brought forward the issue of verifying the owners of social accounts, in order to eliminate the effect of fake accounts on people. This study aims to distinguish genuine accounts from fake accounts using writeprint, which is the writing style biometric. We first extract a set of features using text mining techniques. Then, a supervised machine learning algorithm is trained to build the knowledge base. The recognition procedure starts by extracting the relevant features and then measuring the similarity of the feature vector with respect to all feature vectors in the knowledge base. Then, the most similar vector is identified as the verified account.
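The recognition step described above, comparing a feature vector against every vector in the knowledge base and taking the most similar one, can be sketched as below. The abstract does not name the similarity measure or the features, so cosine similarity and the three-element vectors are assumptions, as are the function and account names.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar_account(query, knowledge_base):
    """Return the account whose stored vector is most similar to `query`.

    knowledge_base: dict mapping an account name to its feature vector.
    """
    return max(knowledge_base, key=lambda name: cosine_similarity(query, knowledge_base[name]))


# Hypothetical writing-style features per account.
knowledge_base = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}
match = most_similar_account([0.8, 0.2, 0.3], knowledge_base)
```

In practice a minimum similarity threshold would probably also be needed, so that text from an author absent from the knowledge base is not forced onto the closest known account.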

Relevance:

60.00%

Publisher:

Abstract:

How to learn an overcomplete dictionary for sparse representations of images is an important topic in machine learning, sparse coding, blind source separation, etc. The so-called K-singular value decomposition (K-SVD) method [3] is powerful for this purpose; however, it is too time-consuming to apply. Recently, an adaptive orthogonal sparsifying transform (AOST) method has been developed to learn the dictionary faster. However, the corresponding coefficient matrix may not be as sparse as that of K-SVD. To solve this problem, this paper proposes a non-orthogonal iterative match method to learn the dictionary. By sequentially extracting columns of the stacked image blocks, the non-orthogonal atoms of the dictionary are learned adaptively, and the resultant coefficient matrix is sparser. Experimental results show that the proposed method can yield effective dictionaries and that the resulting image representation is sparser than that of AOST.