986 results for Training algorithms


Relevance:

30.00%

Abstract:

This thesis studies a class of learning algorithms called deep architectures. Existing results indicate that shallow and local representations are not sufficient for modelling functions with many factors of variation. We are particularly interested in this kind of data because we hope that an intelligent agent will be able to learn to model it automatically; the hypothesis is that deep architectures are better suited to modelling it. The work of Hinton (2006) was a true breakthrough: the idea of using an unsupervised learning algorithm, restricted Boltzmann machines, to initialize the weights of a supervised neural network proved crucial for training the most popular deep architecture, namely artificial neural networks with fully connected weights. This idea has been taken up and reproduced successfully in several contexts and with a variety of models. In this thesis, we consider deep architectures as inductive biases. These biases are represented not only by the models themselves, but also by the training methods that are often used in conjunction with them. We want to identify the reasons why this class of functions generalizes well, the situations in which these functions can be applied, and qualitative descriptions of such functions. The objective of this thesis is to obtain a better understanding of the success of deep architectures. In the first article, we test the agreement between our intuitions (that deep networks are needed to learn better from data with many factors of variation) and the empirical results.
The second article is an in-depth study of the question: why does unsupervised learning help a deep network generalize better? We explore and evaluate several hypotheses that attempt to explain how these models work. Finally, the third article seeks to characterize qualitatively the functions modelled by a deep network. These visualizations facilitate the interpretation of the representations and invariances modelled by a deep architecture.

Relevance:

30.00%

Abstract:

Deep learning algorithms form a new set of powerful methods for machine learning. The idea is to combine layers of latent factors into hierarchies. This often entails a higher computational cost and also increases the number of model parameters. Applying these methods to larger-scale problems therefore requires reducing their cost and improving their regularization and optimization. This thesis addresses the question from these three perspectives. We first study the problem of reducing the cost of certain deep algorithms. We propose two methods for training restricted Boltzmann machines and denoising auto-encoders on sparse, high-dimensional distributions. This is important for applying these algorithms to natural language processing. Both methods (Dauphin et al., 2011; Dauphin and Bengio, 2013) use importance sampling to sample the objective of these models. We observe that this significantly reduces training time; the speed-up reaches two orders of magnitude on several benchmarks. Second, we introduce a powerful regularizer for deep methods. Experimental results show that a good regularizer is crucial for obtaining good performance with large networks (Hinton et al., 2012). In Rifai et al. (2011), we propose a new regularizer that combines unsupervised learning and tangent propagation (Simard et al., 1992). This method exploits geometric principles and, at the time of publication, achieved state-of-the-art results. Finally, we consider the problem of optimizing high-dimensional non-convex surfaces such as those of neural networks. Traditionally, the abundance of local minima was considered the main difficulty in these problems.
In Dauphin et al. (2014a), we argue, based on results from statistical physics, random matrix theory, neural network theory and experiments, that a deeper difficulty arises from the proliferation of saddle points. In that paper we also propose a new method for non-convex optimization.
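
To make the distinction between local minima and saddle points concrete, here is a minimal sketch that classifies a critical point of a two-variable function from the eigenvalues of its symmetric Hessian. It only illustrates the concept; it is not the optimization method proposed in the thesis, and the function name `classify_critical_point` is hypothetical.

```python
import math


def classify_critical_point(hxx, hxy, hyy, tol=1e-12):
    """Classify a critical point of f(x, y) from its symmetric 2x2 Hessian.

    The Hessian is [[hxx, hxy], [hxy, hyy]]; its eigenvalues have a closed form.
    """
    mean = (hxx + hyy) / 2.0
    radius = math.hypot((hxx - hyy) / 2.0, hxy)
    lam_min, lam_max = mean - radius, mean + radius
    if lam_min > tol:
        return "local minimum"   # curvature is positive in every direction
    if lam_max < -tol:
        return "local maximum"   # curvature is negative in every direction
    if lam_min < -tol and lam_max > tol:
        return "saddle point"    # curvature changes sign between directions
    return "degenerate"          # at least one (near-)zero eigenvalue


# f(x, y) = x^2 - y^2 has a zero gradient at the origin and Hessian diag(2, -2).
print(classify_critical_point(2.0, 0.0, -2.0))
```

At a saddle point the gradient vanishes just as it does at a minimum, so a purely gradient-based method can slow down near it even though descent directions exist.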

Relevance:

30.00%

Abstract:

The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs. This new learning algorithm can be seen as an alternative training technique for Polynomial, Radial Basis Function and Multi-Layer Perceptron classifiers. An interesting property of this approach is that it is an approximate implementation of the Structural Risk Minimization (SRM) induction principle. The derivation of Support Vector Machines, their relationship with SRM, and their geometrical insight are discussed in this paper. Training an SVM is equivalent to solving a quadratic programming problem with linear and box constraints in a number of variables equal to the number of data points. When the number of data points exceeds a few thousand the problem is very challenging, because the quadratic form is completely dense, so the memory needed to store the problem grows with the square of the number of data points. Therefore, training problems arising in some real applications with large data sets are impossible to load into memory, and cannot be solved using standard non-linear constrained optimization algorithms. We present a decomposition algorithm that can be used to train SVMs over large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of optimality conditions, which are used both to generate improved iterative values and to establish the stopping criteria for the algorithm. We present previous approaches, as well as results and important details of our implementation of the algorithm using a second-order variant of the Reduced Gradient Method as the solver of the sub-problems. As an application of SVMs, we present preliminary results obtained by applying SVMs to the problem of detecting frontal human faces in real images.
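
For readers who want the quadratic program spelled out, the standard soft-margin SVM dual has the form below. The notation is an assumption, since the abstract does not fix one: there are \ell training points (x_i, y_i) with labels y_i in {-1, +1}, K is the kernel, and C is the box-constraint bound.

```latex
\begin{aligned}
\min_{\alpha \in \mathbb{R}^{\ell}} \quad
  & \tfrac{1}{2}\,\alpha^{\top} Q\,\alpha \;-\; \mathbf{1}^{\top}\alpha \\
\text{subject to} \quad
  & y^{\top}\alpha = 0, \\
  & 0 \le \alpha_i \le C, \qquad i = 1,\dots,\ell,
\end{aligned}
\qquad\text{where}\quad Q_{ij} = y_i\,y_j\,K(x_i, x_j).
```

The matrix Q has \ell^2 entries and is dense in general. As an illustration, with \ell = 50,000 points stored as 8-byte floating-point numbers, Q alone would occupy 50,000^2 x 8 = 2 x 10^10 bytes, about 20 GB, which is why a decomposition into smaller sub-problems is attractive.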

Relevance:

30.00%

Abstract:

Co-training is a semi-supervised learning method designed to take advantage of the redundancy that is present when the object to be identified has multiple descriptions. Co-training is known to work well when the multiple descriptions are conditionally independent given the class of the object. The presence of multiple descriptions of objects in the form of text, images, audio and video in multimedia applications appears to provide redundancy in a form that may be suitable for co-training. In this paper, we investigate the suitability of text and image data from the Web for co-training. We perform measurements to find indications of conditional independence in the texts and images obtained from the Web. Our measurements suggest that conditional independence is likely to be present in the data. Our experiments, conducted within a relevance feedback framework to test whether a method that exploits conditional independence outperforms methods that do not, also indicate that better performance can indeed be obtained by designing algorithms that exploit this form of redundancy when it is present.

Relevance:

30.00%

Abstract:

In the U.K., dental students are required to train and practise on real human tissues at a very early stage of their courses. Currently, the human tissues, such as decayed teeth, are mounted in a physical model resembling a human head. The problems with these models in teaching are: (1) every student operates on a different tooth, since each tooth is unique; (2) the process cannot be recorded for examination purposes; and (3) the same training cannot be repeated. The aim of the PHATOM Project is to develop a dental training system using haptic technology. This paper documents the project background, specification, research and development of the first prototype system. It also discusses the research in visual display, haptic devices and haptic rendering. This includes stereo vision, motion parallax, volumetric modelling and surface remapping algorithms, as well as the analysis and design of the system. A new volumetric-to-surface model transformation algorithm is also introduced. The paper concludes with future work on system development and research.

Relevance:

30.00%

Abstract:

An extensive set of machine learning and pattern classification techniques trained and tested on the KDD dataset have failed to detect most user-to-root attacks. This paper aims to provide an approach for mitigating the negative aspects of this dataset, which led to low detection rates. A genetic algorithm is employed to implement rules for detecting various types of attacks. Rules are formed from the features of the dataset identified as the most important for each attack type. In this way we introduce a high level of generality and thus achieve high detection rates, while also greatly reducing the system training time. We then re-check the decisions of the user-to-root rules against the rules that detect other types of attacks. In this way we decrease the false-positive rate. The model was verified on KDD 99, demonstrating higher detection rates than those reported by the state of the art while maintaining a low false-positive rate.

Relevance:

30.00%

Abstract:

The speed of convergence while training is an important consideration in the use of neural nets. The authors outline a new training algorithm which reduces both the number of iterations and training time required for convergence of multilayer perceptrons, compared to standard back-propagation and conjugate gradient descent algorithms.

Relevance:

30.00%

Abstract:

Algorithms for computer-aided diagnosis of dementia based on structural MRI have demonstrated high performance in the literature, but are difficult to compare as different data sets and methodology were used for evaluation. In addition, it is unclear how the algorithms would perform on previously unseen data, and thus, how they would perform in clinical practice when there is no real opportunity to adapt the algorithm to the data at hand. To address these comparability, generalizability and clinical applicability issues, we organized a grand challenge that aimed to objectively compare algorithms based on a clinically representative multi-center data set. Using clinical practice as the starting point, the goal was to reproduce the clinical diagnosis. Therefore, we evaluated algorithms for multi-class classification of three diagnostic groups: patients with probable Alzheimer's disease, patients with mild cognitive impairment and healthy controls. The diagnosis based on clinical criteria was used as reference standard, as it was the best available reference despite its known limitations. For evaluation, a previously unseen test set was used consisting of 354 T1-weighted MRI scans with the diagnoses blinded. Fifteen research teams participated with a total of 29 algorithms. The algorithms were trained on a small training set (n = 30) and optionally on data from other sources (e.g., the Alzheimer's Disease Neuroimaging Initiative, the Australian Imaging Biomarkers and Lifestyle flagship study of aging). The best performing algorithm yielded an accuracy of 63.0% and an area under the receiver-operating-characteristic curve (AUC) of 78.8%. In general, the best performances were achieved using feature extraction based on voxel-based morphometry or a combination of features that included volume, cortical thickness, shape and intensity. The challenge is open for new submissions via the web-based framework: http://caddementia.grand-challenge.org.

Relevance:

30.00%

Abstract:

Metaheuristic algorithms are among the most popular methods for solving many optimization problems. This paper presents a new hybrid approach comprising two nature-inspired metaheuristic algorithms, Cuckoo Search (CS) and Accelerated Particle Swarm Optimization (APSO), for training Artificial Neural Networks (ANN). In order to increase the probability of the egg's survival, the cuckoo bird migrates by traversing more of the search space. It can successfully search for better solutions by performing Lévy flights with APSO. In the proposed Hybrid Accelerated Cuckoo Particle Swarm Optimization (HACPSO) algorithm, the communication ability of the cuckoo birds is provided by APSO, making each cuckoo bird capable of searching for the best nest with a better solution. Experiments are carried out on benchmark datasets, and the performance of the proposed hybrid algorithm is compared with Artificial Bee Colony (ABC) and similar hybrid variants. The results show that the proposed HACPSO algorithm performs better than the other algorithms in terms of convergence and accuracy.
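
As an illustration of the Lévy-flight move mentioned above, the sketch below draws heavy-tailed step lengths with Mantegna's algorithm, a sampling scheme commonly used in Cuckoo Search implementations. The abstract does not say how HACPSO samples its steps or combines them with APSO, so the update rule in `levy_move`, the parameter `step_scale` and all function names are assumptions made for illustration only.

```python
import math
import random


def mantegna_sigma(beta=1.5):
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step length u / |v|^(1/beta)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero (practically never happens)
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1.0 / beta)


def levy_move(nest, best, step_scale=0.01, beta=1.5, rng=random):
    """Hypothetical update: perturb a nest in proportion to its distance from the best nest."""
    return [x + step_scale * levy_step(beta, rng) * (x - b)
            for x, b in zip(nest, best)]


rng = random.Random(0)
print(levy_move([0.5, -1.2, 3.0], [0.0, 0.0, 0.0], rng=rng))
```

Most sampled steps are small, but occasional very long jumps let a candidate leave its current neighbourhood, which is the exploratory behaviour the abstract attributes to the migrating cuckoo.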

Relevance:

30.00%

Abstract:

Understanding the real world through visualisation and prediction is essential for decision-makers. We build a computational virtual reality environment to improve visualisation, understanding and prediction of the physical world and to guide action. Specifically, we develop a five-dimensional, computer-generated, computational Virtual Reality Environment for Anaesthesia (VREA). Online predictions are calculated from correlation and composition computing with respect to three dimensions: horizontal, vertical and individual. A novel anaesthetic simulator based on musical notes is proposed to identify abnormalities and visualise online medical time series. Experiments with online ECG data present a real-time case demonstrating the effectiveness and efficiency of the proposed system and algorithms.

Relevance:

30.00%

Abstract:

The Karnik-Mendel (KM) algorithm is the most widely used type reduction (TR) method in the literature for the design of interval type-2 fuzzy logic systems (IT2FLS). Its iterative procedure for finding the left and right switch points is its Achilles heel. Despite a decade of research, none of the alternative TR methods offers uncertainty measures equivalent to those of the KM algorithm. This paper takes a data-driven approach to tackle the computational burden of this algorithm while keeping its key features. We propose a regression method to approximate the left and right switch points found by the KM algorithm. The approximator uses only the firing intervals, rule centroids, and FLS structural features as inputs. Once training is done, it can precisely approximate the left and right switch points through basic vector multiplications. Comprehensive simulation results demonstrate that the approximation accuracy for a wide variety of FLSs is 100%. Flexibility, ease of implementation, and speed are other features of the proposed method.
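
To show what the regression approximator is meant to replace, here is a minimal sketch of the iterative KM procedure for one endpoint of the type-reduced interval. It follows the textbook description of the algorithm rather than any code from the paper; the function name `km_endpoint` and the sample numbers are hypothetical, and the sketch assumes the lower firing strengths are strictly positive so that no weight sum is zero.

```python
def km_endpoint(centroids, f_lower, f_upper, left=True, max_iter=100):
    """Return (endpoint, switch_index) of the type-reduced interval via KM iteration."""
    # Sort rules by centroid, as the KM procedure requires.
    order = sorted(range(len(centroids)), key=lambda i: centroids[i])
    y = [centroids[i] for i in order]
    lo = [f_lower[i] for i in order]
    hi = [f_upper[i] for i in order]

    # Start from the midpoints of the firing intervals.
    f = [(a + b) / 2.0 for a, b in zip(lo, hi)]
    est = sum(fi * yi for fi, yi in zip(f, y)) / sum(f)
    k = 0
    for _ in range(max_iter):
        # Switch point: last sorted rule whose centroid does not exceed the estimate.
        k = max((i for i in range(len(y)) if y[i] <= est), default=0)
        if left:
            # Left endpoint: upper strengths up to the switch point, lower after it.
            f = [hi[i] if i <= k else lo[i] for i in range(len(y))]
        else:
            # Right endpoint: lower strengths up to the switch point, upper after it.
            f = [lo[i] if i <= k else hi[i] for i in range(len(y))]
        new = sum(fi * yi for fi, yi in zip(f, y)) / sum(f)
        if abs(new - est) < 1e-12:
            return new, k
        est = new
    return est, k


centroids = [1.0, 2.0, 3.0, 4.0]
f_lower = [0.2, 0.3, 0.1, 0.4]
f_upper = [0.6, 0.7, 0.5, 0.9]
y_left, _ = km_endpoint(centroids, f_lower, f_upper, left=True)
y_right, _ = km_endpoint(centroids, f_lower, f_upper, left=False)
print(y_left, y_right)
```

The loop re-evaluates the weighted average until the switch point stops moving, so the number of iterations is not known in advance. That data-dependent iteration is the cost the abstract proposes to replace with a trained approximator evaluated by vector multiplications.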

Relevance:

30.00%

Abstract:

The Backpropagation Algorithm (BA) is the standard method for training multilayer Artificial Neural Networks (ANN), although it converges very slowly and can stop in a local minimum. We present a new method for neural network training using the BA, inspired by constructivism, a literacy-teaching method proposed by Emilia Ferreiro and based on Piaget's philosophy. Simulation results show that the proposed configuration usually obtains a lower final mean square error than the standard BA and the BA with a momentum factor.
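
For reference, the momentum baseline named in the comparison can be sketched as the classical weight update below. This is the generic textbook rule, not the constructivism-inspired method of the paper; the learning rate, momentum value and the toy quadratic error are assumptions chosen only to make the example runnable.

```python
def momentum_step(weights, grads, velocity, lr=0.1, momentum=0.9):
    """One gradient-descent update with a momentum term.

    velocity accumulates past updates; momentum=0 recovers plain gradient descent.
    """
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def minimize_toy_error(steps=300, lr=0.1, momentum=0.9):
    """Minimize the toy error E(w) = 0.5 * (w0^2 + 10 * w1^2), whose minimum is at (0, 0)."""
    w = [1.0, 1.0]
    v = [0.0, 0.0]
    for _ in range(steps):
        grads = [w[0], 10.0 * w[1]]  # gradient of the toy error
        w, v = momentum_step(w, grads, v, lr, momentum)
    return w


print(minimize_toy_error())
```

In a real network the gradients would come from backpropagation through the layers; only the update rule is shown here.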

Relevance:

30.00%

Abstract:

Discriminative training of Gaussian Mixture Models (GMMs) for speech or speaker recognition purposes is usually based on the gradient descent method, in which the iteration step-size, ε, is usually defined experimentally. In this letter, we derive an equation to adaptively determine ε, by showing that the second-order Newton-Raphson iterative method to find roots of equations is equivalent to the gradient descent algorithm. © 2010 IEEE.
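
The equivalence can be illustrated in the scalar case. The sketch below assumes a twice-differentiable objective J of a single parameter \theta with J''(\theta_k) > 0; the letter's actual expression for the GMM parameters is not given in the abstract and is not reproduced here.

```latex
\begin{aligned}
\text{Gradient descent:}\quad
  & \theta_{k+1} = \theta_k - \varepsilon\, J'(\theta_k), \\
\text{Newton-Raphson on } J'(\theta) = 0:\quad
  & \theta_{k+1} = \theta_k - \frac{J'(\theta_k)}{J''(\theta_k)}, \\
\text{so the two updates coincide when}\quad
  & \varepsilon_k = \frac{1}{J''(\theta_k)}.
\end{aligned}
```

Under this reading, the step size is no longer a constant tuned by experiment but a quantity recomputed at every iteration from the local curvature of the objective.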