957 results for machine learning


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The main purpose of this thesis project is to predict symptom severity and cause from test-battery data of Parkinson's disease patients, using data mining. The data were collected from a test battery on a hand-held computer. We use the Chi-Square method to check which variables are important and which are not. We then normalize the data, apply different data mining techniques to it, and check which technique gives good results. The implementation of this thesis is in WEKA. The methods we used are Naïve Bayes, CART and KNN. We use Bland-Altman plots and Spearman's correlation to check the final results and predictions. The Bland-Altman plot shows what percentage of the predictions fall within our confidence limits, and Spearman's correlation tells us how strong the relationship is. Based on the results and analysis, all three methods give nearly the same results. However, CART (J48 Decision Tree) gives good results, with under-predicted and over-predicted values lying between -2 and +2. The correlation between the actual and predicted values is 0.794 for CART. Cause gives a better classification percentage than disability because it uses two classes.
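The two agreement checks named in this abstract can be sketched in a few lines. This is only an illustration, not the thesis's WEKA workflow: the function names (`average_ranks`, `spearman`, `bland_altman_limits`) and the small `actual`/`predicted` lists are hypothetical, and the 1.96 multiplier assumes the conventional 95% limits of agreement.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bland_altman_limits(
    actual: Sequence[float], predicted: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of agreement."""
    diffs = [p - a for a, p in zip(actual, predicted)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # assumed 95% limits of agreement
    return bias, bias - spread, bias + spread


# Hypothetical severity scores, invented for illustration only.
actual = [0, 1, 2, 3, 4, 2, 1, 3]
predicted = [0, 1, 3, 3, 4, 1, 2, 2]

rho = spearman(actual, predicted)
bias, lower, upper = bland_altman_limits(actual, predicted)
print(f"Spearman rho = {rho:.3f}")
print(f"Bland-Altman bias = {bias:.3f}, limits = [{lower:.3f}, {upper:.3f}]")
```

A rho close to 1 indicates that predicted and actual scores rise together, while the Bland-Altman limits show how far individual predictions typically fall from the actual values.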

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In a global economy, manufacturers compete mainly on the cost efficiency of production, as the prices of raw materials are similar worldwide. Heavy industry has two big issues to deal with. On the one hand, there is a lot of data that needs to be analyzed effectively; on the other hand, making big improvements via investments in corporate structure or new machinery is neither economically nor physically viable. Machine learning offers a promising way for manufacturers to address both problems, as they are in an excellent position to employ learning techniques with their massive resource of historical production data. However, choosing a modelling strategy in this setting is far from trivial, and guiding that choice is the objective of this article. The article investigates characteristics of the most popular classifiers used in industry today. Support Vector Machines, Multilayer Perceptron, Decision Trees, Random Forests, and the meta-algorithms Bagging and Boosting are the main focus of this work. Lessons from real-world implementations of these learners are also provided, together with directions on when different learners are expected to perform well. The importance of feature selection and relevant selection methods in an industrial setting are further investigated. Performance metrics are also discussed for the sake of completeness.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

At first blush, user modeling appears to be a prime candidate for straightforward application of standard machine learning techniques. Observations of the user's behavior can provide training examples that a machine learning system can use to form a model designed to predict future actions. However, user modeling poses a number of challenges for machine learning that have hindered its application in user modeling, including: the need for large data sets; the need for labeled data; concept drift; and computational complexity. This paper examines each of these issues and reviews approaches to resolving them.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Spam is commonly defined as unsolicited email messages, and the goal of spam filtering is to distinguish between spam and legitimate email messages. Much work has been done to filter spam from legitimate emails using machine learning algorithms, and substantial performance has been achieved at the cost of some false positives (FPs). In spam detection, false positives are sometimes unacceptable. In this paper, an adaptive spam filtering model is proposed, based on machine learning (ML) algorithms, which achieves better accuracy by reducing FPs. The model consists of individual and combined filtering approaches built from existing well-known ML algorithms. The proposed model considers both individual and collective outputs and analyzes them with an analyzer. A dynamic feature selection (DFS) technique is also proposed in this paper to improve accuracy.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Developing successful navigation and mapping strategies is an essential part of autonomous robot research. However, hardware limitations often make for inaccurate systems. This project serves to investigate efficient alternatives to mapping an environment, by first creating a mobile robot, and then applying machine learning to the robot and controlling systems to increase the robustness of the robot system. My mapping system consists of a semi-autonomous robot drone in communication with a stationary Linux computer system. There are learning systems running on both the robot and the more powerful Linux system. The first stage of this project was devoted to designing and building an inexpensive robot. Utilizing my prior experience from independent studies in robotics, I designed a small mobile robot that was well suited for simple navigation and mapping research. When the major components of the robot base were designed, I began to implement my design. This involved physically constructing the base of the robot, as well as researching and acquiring components such as sensors. Implementing the more complex sensors became a time-consuming task, involving much research and assistance from a variety of sources. A concurrent stage of the project involved researching and experimenting with different types of machine learning systems. I finally settled on using neural networks as the machine learning system to incorporate into my project. Neural nets can be thought of as a structure of interconnected nodes, through which information filters. The type of neural net that I chose to use is a type that requires a known set of data that serves to train the net to produce the desired output. Neural nets are particularly well suited for use with robotic systems as they can handle cases that lie at the extreme edges of the training set, such as may be produced by "noisy" sensor data. 
Through experimenting with available neural net code, I became familiar with the code and its function, and modified it to be more generic and reusable for multiple applications of neural nets.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Spam is commonly known as unsolicited or unwanted email messages on the Internet, posing a potential threat to Internet security. Users spend a valuable amount of time deleting spam emails. More importantly, ever-increasing spam emails occupy server storage space and consume network bandwidth. Keyword-based spam email filtering strategies will eventually become less successful at modeling spammer behavior, as spammers constantly change their tricks to circumvent these filters. The evasive tactics that spammers use are patterns, and these patterns can be modeled to combat spam. This paper investigates the possibilities of modeling spammer behavioral patterns with well-known classification algorithms such as the Naïve Bayesian classifier (Naive Bayes), Decision Tree Induction (DTI) and Support Vector Machines (SVMs). Preliminary experimental results demonstrate a promising detection rate of around 92%, which is a considerable improvement over similar spammer behavior modeling research.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Due to the increasing unreliability of traditional port-based methods, Internet traffic classification has attracted a lot of research effort in recent years. Many previous papers have focused on using statistical characteristics as discriminators and applying machine learning techniques to classify traffic flows. In this paper, we propose a novel machine-learning-based approach in which the features are extracted from the packet payload instead of flow statistics. Specifically, every flow is represented by a feature vector in which each item indicates the occurrence of a particular token, i.e., a common substring, in the payload. We have applied various machine learning algorithms to evaluate the idea and used different feature selection schemes to identify the critical tokens. Experimental results based on a real-world traffic data set show that the approach can achieve high accuracy with low overhead.
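The token-occurrence representation described in this abstract can be illustrated with a minimal sketch. The token list, the sample payloads and the function name `payload_features` are all hypothetical; the paper's actual token extraction and feature selection schemes are not given in the abstract.

```python
from typing import Dict, List, Sequence


def payload_features(payload: bytes, tokens: Sequence[bytes]) -> List[int]:
    """Return a binary vector: 1 if the token occurs in the payload, else 0."""
    return [1 if token in payload else 0 for token in tokens]


# Hypothetical token vocabulary; a real system would derive common
# substrings from training traffic rather than hard-code them.
TOKENS = [b"HTTP/1.1", b"GET ", b"SSH-", b"220 "]

# Hypothetical flow payloads, invented for illustration only.
flows: Dict[str, bytes] = {
    "web": b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n",
    "ssh": b"SSH-2.0-OpenSSH_8.9\r\n",
}

vectors = {name: payload_features(data, TOKENS) for name, data in flows.items()}
for name, vector in vectors.items():
    print(name, vector)
```

Each resulting vector could then be passed to any standard classifier, with feature selection used to keep only the tokens that discriminate best between traffic classes.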

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents an application of machine learning to the problem of classifying patients with glaucoma into one of two classes: stable and progressive glaucoma. The novelty of the work is the use of new features for the data analysis combined with machine learning techniques to classify the medical data. The paper describes the new features and the results of using decision trees to separate stable and progressive cases. Furthermore, we show the results of using an incremental learning algorithm for tracking stable and progressive cases over time. In both cases we used a dataset of progressive and stable glaucoma patients obtained from a glaucoma clinic.