178 results for Machine learning

in Deakin Research Online - Australia


Abstract:

At first blush, user modeling appears to be a prime candidate for straightforward application of standard machine learning techniques. Observations of the user's behavior can provide training examples that a machine learning system can use to form a model designed to predict future actions. However, user modeling poses a number of challenges that have hindered the application of machine learning, including the need for large data sets, the need for labeled data, concept drift, and computational complexity. This paper examines each of these issues and reviews approaches to resolving them.

Abstract:

Spam is commonly defined as unsolicited email messages, and the goal of spam filtering is to distinguish between spam and legitimate email messages. Much work has been done to filter spam from legitimate emails using machine learning algorithms, and substantial performance has been achieved at the cost of some false positives (FPs). In spam detection, however, false positives are sometimes unacceptable. In this paper, an adaptive spam filtering model is proposed based on machine learning (ML) algorithms, which aims to improve accuracy by reducing false positives. The model consists of individual and combined filtering approaches built from existing well-known ML algorithms. It considers both the individual and the collective outputs and evaluates them with an analyzer. A dynamic feature selection (DFS) technique is also proposed in this paper to improve accuracy.
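One way to picture how combining several filters can trade detection for fewer false positives is an agreement threshold over the individual outputs. The sketch below is only an illustration under stated assumptions: the abstract does not describe the analyzer's rule, so the `min_agreement` threshold, the `is_spam` function, and the three toy rule-based filters (stand-ins for trained ML classifiers) are all hypothetical.

```python
from typing import Callable, List

# A filter maps a message to True ("spam") or False ("legitimate").
SpamFilter = Callable[[str], bool]


# Toy stand-ins for trained classifiers (hypothetical, for illustration only).
def has_suspicious_word(message: str) -> bool:
    return "free" in message.lower()


def has_many_exclamations(message: str) -> bool:
    return message.count("!") >= 3


def has_plain_http_link(message: str) -> bool:
    return "http://" in message


FILTERS: List[SpamFilter] = [
    has_suspicious_word,
    has_many_exclamations,
    has_plain_http_link,
]


def is_spam(message: str, filters: List[SpamFilter], min_agreement: int) -> bool:
    """Flag a message only if at least `min_agreement` filters say spam.

    A higher threshold is more conservative: fewer legitimate messages
    are flagged, at the risk of letting more spam through.
    """
    votes = [spam_filter(message) for spam_filter in filters]
    return sum(votes) >= min_agreement


spam_message = "FREE prize!!! Click http://example.test now"
legit_message = "Are you free for lunch tomorrow?"

# Requiring all three filters to agree keeps the legitimate message.
strict_spam = is_spam(spam_message, FILTERS, min_agreement=3)
strict_legit = is_spam(legit_message, FILTERS, min_agreement=3)

# A single vote is enough here, so the legitimate message is misflagged.
loose_legit = is_spam(legit_message, FILTERS, min_agreement=1)
```

With the strict threshold the legitimate message passes, while the single-vote threshold flags it because one filter fires on the word "free".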

Abstract:

Spam is commonly known as unsolicited or unwanted email messages on the Internet that pose a potential threat to Internet security. Users spend a valuable amount of time deleting spam emails. More importantly, ever-increasing spam emails occupy server storage space and consume network bandwidth. Keyword-based spam email filtering strategies will eventually become less successful at modeling spammer behavior, as spammers constantly change their tricks to circumvent these filters. The evasive tactics that spammers use are patterns, and these patterns can be modeled to combat spam. This paper investigates the possibilities of modeling spammer behavioral patterns with well-known classification algorithms such as the Naïve Bayesian classifier (Naive Bayes), Decision Tree Induction (DTI) and Support Vector Machines (SVMs). Preliminary experimental results demonstrate a promising detection rate of around 92%, which is a considerable improvement over similar spammer behavior modeling research.

Abstract:

Due to the increasing unreliability of traditional port-based methods, Internet traffic classification has attracted a lot of research effort in recent years. Many previous papers have focused on using statistical characteristics as discriminators and applying machine learning techniques to classify the traffic flows. In this paper, we propose a novel machine learning based approach where the features are extracted from the packet payload instead of flow statistics. Specifically, every flow is represented by a feature vector, in which each item indicates the occurrence of a particular token, i.e., a common substring, in the payload. We have applied various machine learning algorithms to evaluate the idea and used different feature selection schemes to identify the critical tokens. Experimental results based on a real-world traffic data set show that the approach can achieve high accuracy with low overhead.
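The token-occurrence representation described in this abstract can be sketched as a binary vector with one entry per token. This is a minimal illustration, not the paper's implementation: the function name `token_occurrence_vector`, the `TOKENS` vocabulary, and the two sample payloads are assumptions made for the example, since the abstract does not list the actual tokens or how they were selected.

```python
from typing import List, Sequence


def token_occurrence_vector(payload: bytes, tokens: Sequence[bytes]) -> List[int]:
    """Return one entry per token: 1 if it occurs in the payload, else 0."""
    return [1 if token in payload else 0 for token in tokens]


# Hypothetical token vocabulary (common substrings); chosen for illustration.
TOKENS = [b"GET ", b"HTTP/1.1", b"SSH-", b"220 "]

# Hypothetical payloads standing in for the first bytes of two flows.
http_flow = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n"
ssh_flow = b"SSH-2.0-OpenSSH_8.9\r\n"

http_vector = token_occurrence_vector(http_flow, TOKENS)  # [1, 1, 0, 0]
ssh_vector = token_occurrence_vector(ssh_flow, TOKENS)    # [0, 0, 1, 0]
```

Vectors of this form could then be passed to any standard classifier, with feature selection used to keep only the most discriminative tokens.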

Abstract:

This paper presents an application of machine learning to the problem of classifying patients with glaucoma into one of two classes: stable and progressive glaucoma. The novelty of the work is the use of new features for the data analysis combined with machine learning techniques to classify the medical data. The paper describes the new features and the results of using decision trees to separate stable and progressive cases. Furthermore, we show the results of using an incremental learning algorithm for tracking stable and progressive cases over time. In both cases we used a dataset of progressive and stable glaucoma patients obtained from a glaucoma clinic.

Abstract:

Graph matching is an important class of methods in pattern recognition. Typically, a graph representing an unknown pattern is matched with a database of models. If the database of model graphs is large, an additional factor is induced into the overall complexity of the matching process. Various techniques for reducing the influence of this additional factor have been described in the literature. In this paper we propose to extract simple features from a graph and use them to eliminate candidate graphs from the database. The most powerful set of features and a decision tree useful for candidate elimination are found by means of the C4.5 algorithm, which was originally proposed for inductive learning of classification rules. Experimental results are reported demonstrating that efficient candidate elimination can be achieved by the proposed procedure.
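The idea of eliminating candidates with cheap features can be sketched as a pre-filter that runs before any expensive matching. The sketch below is a simplified stand-in, not the paper's procedure: it assumes exact (isomorphism) matching, uses a fixed hand-picked feature set (node count, edge count, sorted degree sequence) and a direct equality test, whereas the abstract describes learning the feature set and a decision tree with C4.5. The names `simple_features` and `surviving_candidates` and the sample graphs are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Undirected graph as adjacency sets: node label -> set of neighbor labels.
Graph = Dict[str, Set[str]]
Features = Tuple[int, int, Tuple[int, ...]]


def simple_features(graph: Graph) -> Features:
    """Compute cheap features that are identical for isomorphic graphs."""
    node_count = len(graph)
    # Each undirected edge appears in two adjacency sets.
    edge_count = sum(len(neighbors) for neighbors in graph.values()) // 2
    degree_sequence = tuple(sorted(len(neighbors) for neighbors in graph.values()))
    return node_count, edge_count, degree_sequence


def surviving_candidates(query: Graph, database: Dict[str, Graph]) -> List[str]:
    """Keep only model graphs whose features equal those of the query.

    Graphs with different features cannot be isomorphic to the query,
    so they can be dropped without running the full matching algorithm.
    """
    target = simple_features(query)
    return [name for name, model in database.items()
            if simple_features(model) == target]


database: Dict[str, Graph] = {
    "triangle": {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}},
    "path": {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}},
    "star": {"hub": {"p", "q", "r"}, "p": {"hub"}, "q": {"hub"}, "r": {"hub"}},
}

# A triangle with different node labels.
query: Graph = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}

remaining = surviving_candidates(query, database)
```

Only the surviving candidates would then need to be passed to the full graph matching step.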

Abstract:

Sleep stage identification is the first step in the modern sleep disorder diagnostic process. The K-complex is an indicator of sleep stage 2. However, due to the ambiguity in translating the medical standards into a computer-based procedure, the reliability of automated K-complex detection from the EEG wave is still far from expectations. More specifically, there are some significant barriers to research on automatic K-complex detection. First, there is no adequate description of the K-complex, which makes it difficult to develop an automatic detection algorithm. Second, human experts only provide a label for whether a whole EEG segment contains a K-complex or not, rather than individual labels for each subsegment. These barriers render most pattern recognition algorithms inapplicable to K-complex detection. In this paper, we attempt to address these two challenges by designing a new feature extraction method that can transform visual features of an EEG wave of any length into a mathematical representation, and by proposing a hybrid-synergic machine learning method to build a K-complex classifier. The tenfold cross-validation results indicate that both the accuracy and the precision of the proposed model are at least as good as those of a human expert in K-complex detection.

Abstract:

This paper presents work on using Machine Learning approaches for predicting performance patterns of medalists in Track Cycling Omnium championships. The omnium is a newly introduced track cycling competition to be included in the London 2012 Olympic Games. It involves six individual events and, therefore, requires strategic planning for riders and coaches to achieve the best overall standing in terms of the ranking, speed, and time in each individual component. We carried out unsupervised, supervised, and statistical analyses on the men’s and women’s historical competition data in the World Championships since 2008 to find winning patterns for each gender in terms of the ranking of riders in each individual event. Our results demonstrate that both sprint and endurance capacities are required for both men and women to win a medal in the omnium. Sprint ability is shown to have slightly more influence in deciding the medalists of the omnium competitions.

Abstract:

This article describes the implementation of machine learning techniques that assist cycling experts in the crucial decision-making processes for athlete selection and strategic planning in the track cycling omnium. The omnium is a multi-event competition that was included in the Olympic Games for the first time in 2012. Presently, selectors and cycling coaches make decisions based on experience and opinion. They rarely have access to knowledge that helps predict athletic performances. The omnium presents a unique and complex decision-making challenge, as it is not clear what type of athlete is best suited to the omnium (e.g., sprint or endurance specialist), and tactical decisions made by the coach and athlete during the event will have significant effects on the overall performance of the athlete. In the present work, a variety of machine learning techniques were used to analyze omnium competition data from the World Championships since 2007. The analysis indicates that sprint events have slightly more influence in determining the medalists than endurance-based events. Using a probabilistic analysis, we created a model of performance prediction that provides an unprecedented level of supporting information that assists coaches with strategic and tactical decisions during the omnium.