781 resultados para Machine Learning Algorithm
Resumo:
Les algorithmes d'apprentissage profond forment un nouvel ensemble de méthodes puissantes pour l'apprentissage automatique. L'idée est de combiner des couches de facteurs latents en hierarchies. Cela requiert souvent un coût computationel plus elevé et augmente aussi le nombre de paramètres du modèle. Ainsi, l'utilisation de ces méthodes sur des problèmes à plus grande échelle demande de réduire leur coût et aussi d'améliorer leur régularisation et leur optimization. Cette thèse adresse cette question sur ces trois perspectives. Nous étudions tout d'abord le problème de réduire le coût de certains algorithmes profonds. Nous proposons deux méthodes pour entrainer des machines de Boltzmann restreintes et des auto-encodeurs débruitants sur des distributions sparses à haute dimension. Ceci est important pour l'application de ces algorithmes pour le traitement de langues naturelles. Ces deux méthodes (Dauphin et al., 2011; Dauphin and Bengio, 2013) utilisent l'échantillonage par importance pour échantilloner l'objectif de ces modèles. Nous observons que cela réduit significativement le temps d'entrainement. L'accéleration atteint 2 ordres de magnitude sur plusieurs bancs d'essai. Deuxièmement, nous introduisont un puissant régularisateur pour les méthodes profondes. Les résultats expérimentaux démontrent qu'un bon régularisateur est crucial pour obtenir de bonnes performances avec des gros réseaux (Hinton et al., 2012). Dans Rifai et al. (2011), nous proposons un nouveau régularisateur qui combine l'apprentissage non-supervisé et la propagation de tangente (Simard et al., 1992). Cette méthode exploite des principes géometriques et permit au moment de la publication d'atteindre des résultats à l'état de l'art. Finalement, nous considérons le problème d'optimiser des surfaces non-convexes à haute dimensionalité comme celle des réseaux de neurones. Tradionellement, l'abondance de minimum locaux était considéré comme la principale difficulté dans ces problèmes. Dans Dauphin et al. (2014a) nous argumentons à partir de résultats en statistique physique, de la théorie des matrices aléatoires, de la théorie des réseaux de neurones et à partir de résultats expérimentaux qu'une difficulté plus profonde provient de la prolifération de points-selle. Dans ce papier nous proposons aussi une nouvelle méthode pour l'optimisation non-convexe.
Resumo:
In this paper, a new methodology for the prediction of scoliosis curve types from non invasive acquisitions of the back surface of the trunk is proposed. One hundred and fifty-nine scoliosis patients had their back surface acquired in 3D using an optical digitizer. Each surface is then characterized by 45 local measurements of the back surface rotation. Using a semi-supervised algorithm, the classifier is trained with only 32 labeled and 58 unlabeled data. Tested on 69 new samples, the classifier succeeded in classifying correctly 87.0% of the data. After reducing the number of labeled training samples to 12, the behavior of the resulting classifier tends to be similar to the reference case where the classifier is trained only with the maximum number of available labeled data. Moreover, the addition of unlabeled data guided the classifier towards more generalizable boundaries between the classes. Those results provide a proof of feasibility for using a semi-supervised learning algorithm to train a classifier for the prediction of a scoliosis curve type, when only a few training data are labeled. This constitutes a promising clinical finding since it will allow the diagnosis and the follow-up of scoliotic deformities without exposing the patient to X-ray radiations.
Resumo:
In our study we use a kernel based classification technique, Support Vector Machine Regression for predicting the Melting Point of Drug – like compounds in terms of Topological Descriptors, Topological Charge Indices, Connectivity Indices and 2D Auto Correlations. The Machine Learning model was designed, trained and tested using a dataset of 100 compounds and it was found that an SVMReg model with RBF Kernel could predict the Melting Point with a mean absolute error 15.5854 and Root Mean Squared Error 19.7576
Resumo:
This paper presents a Reinforcement Learning (RL) approach to economic dispatch (ED) using Radial Basis Function neural network. We formulate the ED as an N stage decision making problem. We propose a novel architecture to store Qvalues and present a learning algorithm to learn the weights of the neural network. Even though many stochastic search techniques like simulated annealing, genetic algorithm and evolutionary programming have been applied to ED, they require searching for the optimal solution for each load demand. Also they find limitation in handling stochastic cost functions. In our approach once we learn the Q-values, we can find the dispatch for any load demand. We have recently proposed a RL approach to ED. In that approach, we could find only the optimum dispatch for a set of specified discrete values of power demand. The performance of the proposed algorithm is validated by taking IEEE 6 bus system, considering transmission losses
Resumo:
Knowledge discovery in databases is the non-trivial process of identifying valid, novel potentially useful and ultimately understandable patterns from data. The term Data mining refers to the process which does the exploratory analysis on the data and builds some model on the data. To infer patterns from data, data mining involves different approaches like association rule mining, classification techniques or clustering techniques. Among the many data mining techniques, clustering plays a major role, since it helps to group the related data for assessing properties and drawing conclusions. Most of the clustering algorithms act on a dataset with uniform format, since the similarity or dissimilarity between the data points is a significant factor in finding out the clusters. If a dataset consists of mixed attributes, i.e. a combination of numerical and categorical variables, a preferred approach is to convert different formats into a uniform format. The research study explores the various techniques to convert the mixed data sets to a numerical equivalent, so as to make it equipped for applying the statistical and similar algorithms. The results of clustering mixed category data after conversion to numeric data type have been demonstrated using a crime data set. The thesis also proposes an extension to the well known algorithm for handling mixed data types, to deal with data sets having only categorical data. The proposed conversion has been validated on a data set corresponding to breast cancer. Moreover, another issue with the clustering process is the visualization of output. Different geometric techniques like scatter plot, or projection plots are available, but none of the techniques display the result projecting the whole database but rather demonstrate attribute-pair wise analysis
Resumo:
We describe an adaptive, mid-level approach to the wireless device power management problem. Our approach is based on reinforcement learning, a machine learning framework for autonomous agents. We describe how our framework can be applied to the power management problem in both infrastructure and ad~hoc wireless networks. From this thesis we conclude that mid-level power management policies can outperform low-level policies and are more convenient to implement than high-level policies. We also conclude that power management policies need to adapt to the user and network, and that a mid-level power management framework based on reinforcement learning fulfills these requirements.
Resumo:
If we are to understand how we can build machines capable of broad purpose learning and reasoning, we must first aim to build systems that can represent, acquire, and reason about the kinds of commonsense knowledge that we humans have about the world. This endeavor suggests steps such as identifying the kinds of knowledge people commonly have about the world, constructing suitable knowledge representations, and exploring the mechanisms that people use to make judgments about the everyday world. In this work, I contribute to these goals by proposing an architecture for a system that can learn commonsense knowledge about the properties and behavior of objects in the world. The architecture described here augments previous machine learning systems in four ways: (1) it relies on a seven dimensional notion of context, built from information recently given to the system, to learn and reason about objects' properties; (2) it has multiple methods that it can use to reason about objects, so that when one method fails, it can fall back on others; (3) it illustrates the usefulness of reasoning about objects by thinking about their similarity to other, better known objects, and by inferring properties of objects from the categories that they belong to; and (4) it represents an attempt to build an autonomous learner and reasoner, that sets its own goals for learning about the world and deduces new facts by reflecting on its acquired knowledge. This thesis describes this architecture, as well as a first implementation, that can learn from sentences such as ``A blue bird flew to the tree'' and ``The small bird flew to the cage'' that birds can fly. One of the main contributions of this work lies in suggesting a further set of salient ideas about how we can build broader purpose commonsense artificial learners and reasoners.
Resumo:
The Support Vector (SV) machine is a novel type of learning machine, based on statistical learning theory, which contains polynomial classifiers, neural networks, and radial basis function (RBF) networks as special cases. In the RBF case, the SV algorithm automatically determines centers, weights and threshold such as to minimize an upper bound on the expected test error. The present study is devoted to an experimental comparison of these machines with a classical approach, where the centers are determined by $k$--means clustering and the weights are found using error backpropagation. We consider three machines, namely a classical RBF machine, an SV machine with Gaussian kernel, and a hybrid system with the centers determined by the SV method and the weights trained by error backpropagation. Our results show that on the US postal service database of handwritten digits, the SV machine achieves the highest test accuracy, followed by the hybrid approach. The SV approach is thus not only theoretically well--founded, but also superior in a practical application.
Resumo:
We present a framework for learning in hidden Markov models with distributed state representations. Within this framework, we derive a learning algorithm based on the Expectation--Maximization (EM) procedure for maximum likelihood estimation. Analogous to the standard Baum-Welch update rules, the M-step of our algorithm is exact and can be solved analytically. However, due to the combinatorial nature of the hidden state representation, the exact E-step is intractable. A simple and tractable mean field approximation is derived. Empirical results on a set of problems suggest that both the mean field approximation and Gibbs sampling are viable alternatives to the computationally expensive exact algorithm.
Resumo:
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM's). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
Resumo:
Stock markets employ specialized traders, market-makers, designed to provide liquidity and volume to the market by constantly supplying both supply and demand. In this paper, we demonstrate a novel method for modeling the market as a dynamic system and a reinforcement learning algorithm that learns profitable market-making strategies when run on this model. The sequence of buys and sells for a particular stock, the order flow, we model as an Input-Output Hidden Markov Model fit to historical data. When combined with the dynamics of the order book, this creates a highly non-linear and difficult dynamic system. Our reinforcement learning algorithm, based on likelihood ratios, is run on this partially-observable environment. We demonstrate learning results for two separate real stocks.
Resumo:
The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs. This new learning algorithm can be seen as an alternative training technique for Polynomial, Radial Basis Function and Multi-Layer Perceptron classifiers. An interesting property of this approach is that it is an approximate implementation of the Structural Risk Minimization (SRM) induction principle. The derivation of Support Vector Machines, its relationship with SRM, and its geometrical insight, are discussed in this paper. Training a SVM is equivalent to solve a quadratic programming problem with linear and box constraints in a number of variables equal to the number of data points. When the number of data points exceeds few thousands the problem is very challenging, because the quadratic form is completely dense, so the memory needed to store the problem grows with the square of the number of data points. Therefore, training problems arising in some real applications with large data sets are impossible to load into memory, and cannot be solved using standard non-linear constrained optimization algorithms. We present a decomposition algorithm that can be used to train SVM's over large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of, and also establish the stopping criteria for the algorithm. We present previous approaches, as well as results and important details of our implementation of the algorithm using a second-order variant of the Reduced Gradient Method as the solver of the sub-problems. As an application of SVM's, we present preliminary results we obtained applying SVM to the problem of detecting frontal human faces in real images.
Resumo:
Our work is focused on alleviating the workload for designers of adaptive courses on the complexity task of authoring adaptive learning designs adjusted to specific user characteristics and the user context. We propose an adaptation platform that consists in a set of intelligent agents where each agent carries out an independent adaptation task. The agents apply machine learning techniques to support the user modelling for the adaptation process
Resumo:
Emotions are crucial for user's decision making in recommendation processes. We first introduce ambient recommender systems, which arise from the analysis of new trends on the exploitation of the emotional context in the next generation of recommender systems. We then explain some results of these new trends in real-world applications through the smart prediction assistant (SPA) platform in an intelligent learning guide with more than three million users. While most approaches to recommending have focused on algorithm performance. SPA makes recommendations to users on the basis of emotional information acquired in an incremental way. This article provides a cross-disciplinary perspective to achieve this goal in such recommender systems through a SPA platform. The methodology applied in SPA is the result of a bunch of technology transfer projects for large real-world rccommender systems
Resumo:
Self-Organizing Map (SOM) algorithm has been extensively used for analysis and classification problems. For this kind of problems, datasets become more and more large and it is necessary to speed up the SOM learning. In this paper we present an application of the Simulated Annealing (SA) procedure to the SOM learning algorithm. The goal of the algorithm is to obtain fast learning and better performance in terms of matching of input data and regularity of the obtained map. An advantage of the proposed technique is that it preserves the simplicity of the basic algorithm. Several tests, carried out on different large datasets, demonstrate the effectiveness of the proposed algorithm in comparison with the original SOM and with some of its modification introduced to speed-up the learning.