781 resultados para Machine Learning Algorithm
Resumo:
Machine learning techniques have been recognized as powerful tools for learning from data. One of the most popular learning techniques, the Back-Propagation (BP) Artificial Neural Networks, can be used as a computer model to predict peptides binding to the Human Leukocyte Antigens (HLA). The major advantage of computational screening is that it reduces the number of wet-lab experiments that need to be performed, significantly reducing the cost and time. A recently developed method, Extreme Learning Machine (ELM), which has superior properties over BP has been investigated to accomplish such tasks. In our work, we found that the ELM is as good as, if not better than, the BP in term of time complexity, accuracy deviations across experiments, and most importantly - prevention from over-fitting for prediction of peptide binding to HLA.
Resumo:
The Vapnik-Chervonenkis (VC) dimension is a combinatorial measure of a certain class of machine learning problems, which may be used to obtain upper and lower bounds on the number of training examples needed to learn to prescribed levels of accuracy. Most of the known bounds apply to the Probably Approximately Correct (PAC) framework, which is the framework within which we work in this paper. For a learning problem with some known VC dimension, much is known about the order of growth of the sample-size requirement of the problem, as a function of the PAC parameters. The exact value of sample-size requirement is however less well-known, and depends heavily on the particular learning algorithm being used. This is a major obstacle to the practical application of the VC dimension. Hence it is important to know exactly how the sample-size requirement depends on VC dimension, and with that in mind, we describe a general algorithm for learning problems having VC dimension 1. Its sample-size requirement is minimal (as a function of the PAC parameters), and turns out to be the same for all non-trivial learning problems having VC dimension 1. While the method used cannot be naively generalised to higher VC dimension, it suggests that optimal algorithm-dependent bounds may improve substantially on current upper bounds.
Resumo:
A theoretical model is presented which describes selection in a genetic algorithm (GA) under a stochastic fitness measure and correctly accounts for finite population effects. Although this model describes a number of selection schemes, we only consider Boltzmann selection in detail here as results for this form of selection are particularly transparent when fitness is corrupted by additive Gaussian noise. Finite population effects are shown to be of fundamental importance in this case, as the noise has no effect in the infinite population limit. In the limit of weak selection we show how the effects of any Gaussian noise can be removed by increasing the population size appropriately. The theory is tested on two closely related problems: the one-max problem corrupted by Gaussian noise and generalization in a perceptron with binary weights. The averaged dynamics can be accurately modelled for both problems using a formalism which describes the dynamics of the GA using methods from statistical mechanics. The second problem is a simple example of a learning problem and by considering this problem we show how the accurate characterization of noise in the fitness evaluation may be relevant in machine learning. The training error (negative fitness) is the number of misclassified training examples in a batch and can be considered as a noisy version of the generalization error if an independent batch is used for each evaluation. The noise is due to the finite batch size and in the limit of large problem size and weak selection we show how the effect of this noise can be removed by increasing the population size. This allows the optimal batch size to be determined, which minimizes computation time as well as the total number of training examples required.
Resumo:
We analyse the dynamics of a number of second order on-line learning algorithms training multi-layer neural networks, using the methods of statistical mechanics. We first consider on-line Newton's method, which is known to provide optimal asymptotic performance. We determine the asymptotic generalization error decay for a soft committee machine, which is shown to compare favourably with the result for standard gradient descent. Matrix momentum provides a practical approximation to this method by allowing an efficient inversion of the Hessian. We consider an idealized matrix momentum algorithm which requires access to the Hessian and find close correspondence with the dynamics of on-line Newton's method. In practice, the Hessian will not be known on-line and we therefore consider matrix momentum using a single example approximation to the Hessian. In this case good asymptotic performance may still be achieved, but the algorithm is now sensitive to parameter choice because of noise in the Hessian estimate. On-line Newton's method is not appropriate during the transient learning phase, since a suboptimal unstable fixed point of the gradient descent dynamics becomes stable for this algorithm. A principled alternative is to use Amari's natural gradient learning algorithm and we show how this method provides a significant reduction in learning time when compared to gradient descent, while retaining the asymptotic performance of on-line Newton's method.
Resumo:
An approach is proposed for inferring implicative logical rules from examples. The concept of a good diagnostic test for a given set of positive examples lies in the basis of this approach. The process of inferring good diagnostic tests is considered as a process of inductive common sense reasoning. The incremental approach to learning algorithms is implemented in an algorithm DIAGaRa for inferring implicative rules from examples.
Resumo:
Report published in the Proceedings of the National Conference on "Education in the Information Society", Plovdiv, May, 2013
Resumo:
Lifelong surveillance is not cost-effective after endovascular aneurysm repair (EVAR), but is required to detect aortic complications which are fatal if untreated (type 1/3 endoleak, sac expansion, device migration). Aneurysm morphology determines the probability of aortic complications and therefore the need for surveillance, but existing analyses have proven incapable of identifying patients at sufficiently low risk to justify abandoning surveillance. This study aimed to improve the prediction of aortic complications, through the application of machine-learning techniques. Patients undergoing EVAR at 2 centres were studied from 2004–2010. Aneurysm morphology had previously been studied to derive the SGVI Score for predicting aortic complications. Bayesian Neural Networks were designed using the same data, to dichotomise patients into groups at low- or high-risk of aortic complications. Network training was performed only on patients treated at centre 1. External validation was performed by assessing network performance independently of network training, on patients treated at centre 2. Discrimination was assessed by Kaplan-Meier analysis to compare aortic complications in predicted low-risk versus predicted high-risk patients. 761 patients aged 75 +/− 7 years underwent EVAR in 2 centres. Mean follow-up was 36+/− 20 months. Neural networks were created incorporating neck angu- lation/length/diameter/volume; AAA diameter/area/volume/length/tortuosity; and common iliac tortuosity/diameter. A 19-feature network predicted aor- tic complications with excellent discrimination and external validation (5-year freedom from aortic complications in predicted low-risk vs predicted high-risk patients: 97.9% vs. 63%; p < 0.0001). A Bayesian Neural-Network algorithm can identify patients in whom it may be safe to abandon surveillance after EVAR. This proposal requires prospective study.
Resumo:
This dissertation investigates the connection between spectral analysis and frame theory. When considering the spectral properties of a frame, we present a few novel results relating to the spectral decomposition. We first show that scalable frames have the property that the inner product of the scaling coefficients and the eigenvectors must equal the inverse eigenvalues. From this, we prove a similar result when an approximate scaling is obtained. We then focus on the optimization problems inherent to the scalable frames by first showing that there is an equivalence between scaling a frame and optimization problems with a non-restrictive objective function. Various objective functions are considered, and an analysis of the solution type is presented. For linear objectives, we can encourage sparse scalings, and with barrier objective functions, we force dense solutions. We further consider frames in high dimensions, and derive various solution techniques. From here, we restrict ourselves to various frame classes, to add more specificity to the results. Using frames generated from distributions allows for the placement of probabilistic bounds on scalability. For discrete distributions (Bernoulli and Rademacher), we bound the probability of encountering an ONB, and for continuous symmetric distributions (Uniform and Gaussian), we show that symmetry is retained in the transformed domain. We also prove several hyperplane-separation results. With the theory developed, we discuss graph applications of the scalability framework. We make a connection with graph conditioning, and show the in-feasibility of the problem in the general case. After a modification, we show that any complete graph can be conditioned. We then present a modification of standard PCA (robust PCA) developed by Cand\`es, and give some background into Electron Energy-Loss Spectroscopy (EELS). We design a novel scheme for the processing of EELS through robust PCA and least-squares regression, and test this scheme on biological samples. Finally, we take the idea of robust PCA and apply the technique of kernel PCA to perform robust manifold learning. We derive the problem and present an algorithm for its solution. There is also discussion of the differences with RPCA that make theoretical guarantees difficult.
Resumo:
Les métaheuristiques sont très utilisées dans le domaine de l'optimisation discrète. Elles permettent d’obtenir une solution de bonne qualité en un temps raisonnable, pour des problèmes qui sont de grande taille, complexes, et difficiles à résoudre. Souvent, les métaheuristiques ont beaucoup de paramètres que l’utilisateur doit ajuster manuellement pour un problème donné. L'objectif d'une métaheuristique adaptative est de permettre l'ajustement automatique de certains paramètres par la méthode, en se basant sur l’instance à résoudre. La métaheuristique adaptative, en utilisant les connaissances préalables dans la compréhension du problème, des notions de l'apprentissage machine et des domaines associés, crée une méthode plus générale et automatique pour résoudre des problèmes. L’optimisation globale des complexes miniers vise à établir les mouvements des matériaux dans les mines et les flux de traitement afin de maximiser la valeur économique du système. Souvent, en raison du grand nombre de variables entières dans le modèle, de la présence de contraintes complexes et de contraintes non-linéaires, il devient prohibitif de résoudre ces modèles en utilisant les optimiseurs disponibles dans l’industrie. Par conséquent, les métaheuristiques sont souvent utilisées pour l’optimisation de complexes miniers. Ce mémoire améliore un procédé de recuit simulé développé par Goodfellow & Dimitrakopoulos (2016) pour l’optimisation stochastique des complexes miniers stochastiques. La méthode développée par les auteurs nécessite beaucoup de paramètres pour fonctionner. Un de ceux-ci est de savoir comment la méthode de recuit simulé cherche dans le voisinage local de solutions. Ce mémoire implémente une méthode adaptative de recherche dans le voisinage pour améliorer la qualité d'une solution. Les résultats numériques montrent une augmentation jusqu'à 10% de la valeur de la fonction économique.
Resumo:
Les métaheuristiques sont très utilisées dans le domaine de l'optimisation discrète. Elles permettent d’obtenir une solution de bonne qualité en un temps raisonnable, pour des problèmes qui sont de grande taille, complexes, et difficiles à résoudre. Souvent, les métaheuristiques ont beaucoup de paramètres que l’utilisateur doit ajuster manuellement pour un problème donné. L'objectif d'une métaheuristique adaptative est de permettre l'ajustement automatique de certains paramètres par la méthode, en se basant sur l’instance à résoudre. La métaheuristique adaptative, en utilisant les connaissances préalables dans la compréhension du problème, des notions de l'apprentissage machine et des domaines associés, crée une méthode plus générale et automatique pour résoudre des problèmes. L’optimisation globale des complexes miniers vise à établir les mouvements des matériaux dans les mines et les flux de traitement afin de maximiser la valeur économique du système. Souvent, en raison du grand nombre de variables entières dans le modèle, de la présence de contraintes complexes et de contraintes non-linéaires, il devient prohibitif de résoudre ces modèles en utilisant les optimiseurs disponibles dans l’industrie. Par conséquent, les métaheuristiques sont souvent utilisées pour l’optimisation de complexes miniers. Ce mémoire améliore un procédé de recuit simulé développé par Goodfellow & Dimitrakopoulos (2016) pour l’optimisation stochastique des complexes miniers stochastiques. La méthode développée par les auteurs nécessite beaucoup de paramètres pour fonctionner. Un de ceux-ci est de savoir comment la méthode de recuit simulé cherche dans le voisinage local de solutions. Ce mémoire implémente une méthode adaptative de recherche dans le voisinage pour améliorer la qualité d'une solution. Les résultats numériques montrent une augmentation jusqu'à 10% de la valeur de la fonction économique.
Resumo:
Acoustic Emission (AE) monitoring can be used to detect the presence of damage as well as determine its location in Structural Health Monitoring (SHM) applications. Information on the time difference of the signal generated by the damage event arriving at different sensors is essential in performing localization. This makes the time of arrival (ToA) an important piece of information to retrieve from the AE signal. Generally, this is determined using statistical methods such as the Akaike Information Criterion (AIC) which is particularly prone to errors in the presence of noise. And given that the structures of interest are surrounded with harsh environments, a way to accurately estimate the arrival time in such noisy scenarios is of particular interest. In this work, two new methods are presented to estimate the arrival times of AE signals which are based on Machine Learning. Inspired by great results in the field, two models are presented which are Deep Learning models - a subset of machine learning. They are based on Convolutional Neural Network (CNN) and Capsule Neural Network (CapsNet). The primary advantage of such models is that they do not require the user to pre-define selected features but only require raw data to be given and the models establish non-linear relationships between the inputs and outputs. The performance of the models is evaluated using AE signals generated by a custom ray-tracing algorithm by propagating them on an aluminium plate and compared to AIC. It was found that the relative error in estimation on the test set was < 5% for the models compared to around 45% of AIC. The testing process was further continued by preparing an experimental setup and acquiring real AE signals to test on. Similar performances were observed where the two models not only outperform AIC by more than a magnitude in their average errors but also they were shown to be a lot more robust as compared to AIC which fails in the presence of noise.
Resumo:
Although the debate of what data science is has a long history and has not reached a complete consensus yet, Data Science can be summarized as the process of learning from data. Guided by the above vision, this thesis presents two independent data science projects developed in the scope of multidisciplinary applied research. The first part analyzes fluorescence microscopy images typically produced in life science experiments, where the objective is to count how many marked neuronal cells are present in each image. Aiming to automate the task for supporting research in the area, we propose a neural network architecture tuned specifically for this use case, cell ResUnet (c-ResUnet), and discuss the impact of alternative training strategies in overcoming particular challenges of our data. The approach provides good results in terms of both detection and counting, showing performance comparable to the interpretation of human operators. As a meaningful addition, we release the pre-trained model and the Fluorescent Neuronal Cells dataset collecting pixel-level annotations of where neuronal cells are located. In this way, we hope to help future research in the area and foster innovative methodologies for tackling similar problems. The second part deals with the problem of distributed data management in the context of LHC experiments, with a focus on supporting ATLAS operations concerning data transfer failures. In particular, we analyze error messages produced by failed transfers and propose a Machine Learning pipeline that leverages the word2vec language model and K-means clustering. This provides groups of similar errors that are presented to human operators as suggestions of potential issues to investigate. The approach is demonstrated on one full day of data, showing promising ability in understanding the message content and providing meaningful groupings, in line with previously reported incidents by human operators.
Resumo:
Machine Learning makes computers capable of performing tasks typically requiring human intelligence. A domain where it is having a considerable impact is the life sciences, allowing to devise new biological analysis protocols, develop patients’ treatments efficiently and faster, and reduce healthcare costs. This Thesis work presents new Machine Learning methods and pipelines for the life sciences focusing on the unsupervised field. At a methodological level, two methods are presented. The first is an “Ab Initio Local Principal Path” and it is a revised and improved version of a pre-existing algorithm in the manifold learning realm. The second contribution is an improvement over the Import Vector Domain Description (one-class learning) through the Kullback-Leibler divergence. It hybridizes kernel methods to Deep Learning obtaining a scalable solution, an improved probabilistic model, and state-of-the-art performances. Both methods are tested through several experiments, with a central focus on their relevance in life sciences. Results show that they improve the performances achieved by their previous versions. At the applicative level, two pipelines are presented. The first one is for the analysis of RNA-Seq datasets, both transcriptomic and single-cell data, and is aimed at identifying genes that may be involved in biological processes (e.g., the transition of tissues from normal to cancer). In this project, an R package is released on CRAN to make the pipeline accessible to the bioinformatic Community through high-level APIs. The second pipeline is in the drug discovery domain and is useful for identifying druggable pockets, namely regions of a protein with a high probability of accepting a small molecule (a drug). Both these pipelines achieve remarkable results. Lastly, a detour application is developed to identify the strengths/limitations of the “Principal Path” algorithm by analyzing Convolutional Neural Networks induced vector spaces. This application is conducted in the music and visual arts domains.
Resumo:
This paper investigates how to make improved action selection for online policy learning in robotic scenarios using reinforcement learning (RL) algorithms. Since finding control policies using any RL algorithm can be very time consuming, we propose to combine RL algorithms with heuristic functions for selecting promising actions during the learning process. With this aim, we investigate the use of heuristics for increasing the rate of convergence of RL algorithms and contribute with a new learning algorithm, Heuristically Accelerated Q-learning (HAQL), which incorporates heuristics for action selection to the Q-Learning algorithm. Experimental results on robot navigation show that the use of even very simple heuristic functions results in significant performance enhancement of the learning rate.
Resumo:
Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.