885 results for Q-learning algorithm


Relevance:

80.00%

Publisher:

Abstract:

It is well known that the addition of noise to the input data of a neural network during training can, in some circumstances, lead to significant improvements in generalization performance. Previous work has shown that such training with noise is equivalent to a form of regularization in which an extra term is added to the error function. However, the regularization term, which involves second derivatives of the error function, is not bounded below, and so can lead to difficulties if used directly in a learning algorithm based on error minimization. In this paper we show that, for the purposes of network training, the regularization term can be reduced to a positive definite form which involves only first derivatives of the network mapping. For a sum-of-squares error function, the regularization term belongs to the class of generalized Tikhonov regularizers. Direct minimization of the regularized error function provides a practical alternative to training with noise.
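The equivalence between training with noise and regularization can be checked numerically in the simplest case. The sketch below is not the paper's derivation: it assumes a linear model y = w·x with a sum-of-squares error and isotropic Gaussian input noise of standard deviation `sigma`, for which the expected noisy error equals the clean error plus the Tikhonov penalty sigma² ‖w‖². The function names and the synthetic data are illustrative assumptions.

```python
import numpy as np

def expected_noisy_error(w, X, t, sigma, n_draws=2000, seed=0):
    """Monte Carlo estimate of the mean squared error when Gaussian
    noise with standard deviation sigma is added to the inputs."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_draws,) + X.shape)
    preds = (X + noise) @ w              # shape (n_draws, n_samples)
    return float(np.mean((preds - t) ** 2))

def regularized_error(w, X, t, sigma):
    """Clean mean squared error plus the penalty sigma^2 * ||w||^2
    (exact for a linear model; an assumption of this sketch)."""
    clean = np.mean((X @ w - t) ** 2)
    return float(clean + sigma ** 2 * np.sum(w ** 2))

# Synthetic data, chosen only for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
t = X @ np.array([0.8, -1.5, 0.0])
w = np.array([1.0, -2.0, 0.5])

noisy = expected_noisy_error(w, X, t, sigma=0.3)
penalized = regularized_error(w, X, t, sigma=0.3)
print(f"noisy: {noisy:.3f}  regularized: {penalized:.3f}")
```

The two numbers agree up to Monte Carlo error, so minimizing the penalized error directly has the same effect as averaging over noisy inputs in this linear case. For nonlinear networks the abstract's result involves first derivatives of the network mapping rather than the weights themselves.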

Relevance:

80.00%

Publisher:

Abstract:

We propose a Bayesian framework for regression problems, which covers areas which are usually dealt with by function approximation. An online learning algorithm is derived which solves regression problems with a Kalman filter. Its solution always improves with increasing model complexity, without the risk of over-fitting. In the infinite dimension limit it approaches the true Bayesian posterior. The issues of prior selection and over-fitting are also discussed, showing that some of the commonly held beliefs are misleading. The practical implementation is summarised. Simulations using 13 popular publicly available data sets are used to demonstrate the method and highlight important issues concerning the choice of priors.
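As a rough illustration of solving a regression problem online with a Kalman filter, the sketch below applies the standard Kalman measurement update to a linear-in-parameters model with fixed (non-drifting) weights. It is not the paper's algorithm: the feature vector `phi`, the Gaussian prior, and the known noise variance `noise_var` are assumptions made for the example.

```python
import numpy as np

def kalman_regression_step(mean, cov, phi, y, noise_var):
    """Update a Gaussian posterior over weights with one observation.

    mean: (d,) posterior mean, cov: (d, d) posterior covariance,
    phi: (d,) feature vector, y: scalar target.
    """
    residual = y - phi @ mean
    pred_var = phi @ cov @ phi + noise_var   # predictive variance of y
    gain = cov @ phi / pred_var              # Kalman gain
    new_mean = mean + gain * residual
    new_cov = cov - np.outer(gain, phi @ cov)
    return new_mean, new_cov

# Illustrative data stream: 30 observations of a 4-parameter model.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(30, 4))
targets = Phi @ np.array([1.0, -1.0, 0.5, 2.0]) + 0.1 * rng.normal(size=30)

mean, cov = np.zeros(4), 10.0 * np.eye(4)    # assumed prior N(0, 10 I)
for phi, y in zip(Phi, targets):
    mean, cov = kalman_regression_step(mean, cov, phi, y, noise_var=0.01)
print(mean)
```

After all observations have been processed, the online posterior matches the batch Bayesian linear regression posterior for the same prior and noise variance, which is a convenient correctness check for this kind of update.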

Relevance:

80.00%

Publisher:

Abstract:

In recent years there has been increased interest in applying non-parametric methods to real-world problems. Significant research has been devoted to Gaussian processes (GPs) due to their increased flexibility when compared with parametric models. These methods use Bayesian learning, which generally leads to analytically intractable posteriors. This thesis proposes a two-step solution to construct a probabilistic approximation to the posterior. In the first step we adapt Bayesian online learning to GPs: the final approximation to the posterior is the result of propagating the first and second moments of intermediate posteriors obtained by combining a new example with the previous approximation. The propagation of functional forms is solved by showing the existence of a parametrisation of the posterior moments that uses combinations of the kernel function at the training points, transforming the Bayesian online learning of functions into a parametric formulation. The drawback is the prohibitive quadratic scaling of the number of parameters with the size of the data, making the method inapplicable to large datasets. The second step solves the problem of the exploding parameter size and makes GPs applicable to arbitrarily large datasets. The approximation is based on a measure of distance between two GPs, the KL-divergence between GPs. This second approximation uses a constrained GP in which only a small subset of the whole training dataset is used to represent the GP. This subset is called the Basis Vector (BV) set, and the resulting GP is a sparse approximation to the true posterior. As this sparsity is based on KL-minimisation, it is probabilistic and independent of the way the posterior approximation from the first step is obtained. We combine the sparse approximation with an extension to the Bayesian online algorithm that allows multiple iterations for each input, thus approximating a batch solution.
The resulting sparse learning algorithm is a generic one: for different problems we only change the likelihood. The algorithm is applied to a variety of problems, and we examine its performance both on more classical regression and classification tasks and on data-assimilation and simple density estimation problems.

Relevance:

80.00%

Publisher:

Abstract:

A number of researchers have investigated the application of neural networks to visual recognition, with much of the emphasis placed on exploiting the network's ability to generalise. However, despite the benefits of such an approach, it is not at all obvious how networks can be developed which are capable of recognising objects subject to changes in rotation, translation and viewpoint. In this study, we suggest that a possible solution to this problem can be found by studying aspects of visual psychology and, in particular, perceptual organisation. For example, it appears that grouping together lines based upon perceptually significant features can facilitate viewpoint-independent recognition. The work presented here identifies simple grouping measures based on parallelism and connectivity and shows how it is possible to train multi-layer perceptrons (MLPs) to detect and determine the perceptual significance of any group presented. In this way, it is shown how MLPs which are trained via backpropagation to perform individual grouping tasks can be brought together into a novel, large-scale network capable of determining the perceptual significance of the whole input pattern. Finally, the applicability of such significance values for recognition is investigated, and results indicate that both the MLP and the Kohonen Feature Map can be trained to recognise simple shapes described in terms of perceptual significances. This study has also provided an opportunity to investigate aspects of the backpropagation algorithm, particularly the ability to generalise. In this study we report the results of various generalisation tests. In applying the backpropagation algorithm to certain problems, we found that there was a deficiency in performance with the standard learning algorithm. An improvement in performance could, however, be obtained when suitable modifications were made to the algorithm. The modifications and consequent results are reported here.

Relevance:

80.00%

Publisher:

Abstract:

We explore the effects of over-specificity in learning algorithms by investigating the behavior of a student, suited to learn optimally from a teacher B, learning from a teacher B' ≠ B. We considered only the supervised, on-line learning scenario with teachers selected from a particular family. We found that, in the general case, the application of the optimal algorithm to the wrong teacher produces a residual generalization error, even if the right teacher is harder. By imposing mild conditions on the form of the learning algorithm, we obtained an approximation for the residual generalization error. Simulations carried out in finite networks validate the estimate found.

Relevance:

80.00%

Publisher:

Abstract:

We present CORDER (COmmunity Relation Discovery by named Entity Recognition), an unsupervised machine learning algorithm that exploits named entity recognition and co-occurrence data to associate individuals in an organization with their expertise and associates. We discuss the problems associated with evaluating unsupervised learners and report our initial evaluation experiments.

Relevance:

80.00%

Publisher:

Abstract:

This work proposes the General Regression Neuro-Fuzzy Network, which combines the properties of the conventional General Regression Neural Network and the Adaptive Network-based Fuzzy Inference System. The network belongs to the so-called "memory-based networks" and is adjusted by a one-pass learning algorithm.

Relevance:

80.00%

Publisher:

Abstract:

This paper addresses the task of learning classifiers from streams of labelled data. In this setting, the underlying concepts can change over time. The paper studies two mechanisms developed for dealing with changing concepts. Both are based on the time window idea. The first one forgets gradually, by assigning to each example a weight that gradually decreases over time. The second one uses a statistical test to detect changes in concept and then optimizes the size of the time window, aiming to maximise the classification accuracy on the new examples. Both methods are general in nature and can be used with any learning algorithm. The objectives of the conducted experiments were to compare the mechanisms and explore whether they can be combined to achieve a synergetic effect. Results from experiments with three basic learning algorithms (kNN, ID3 and NBC) using four datasets are reported and discussed.
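The gradual forgetting idea can be sketched in a few lines. The abstract does not say which weighting function the paper uses, so the exponential decay below is an assumption, as are the function names; the weighted vote stands in for any learner that accepts per-example weights.

```python
from collections import defaultdict

def forgetting_weights(n_examples, decay=0.9):
    """Weights for examples ordered oldest to newest.

    The newest example gets weight 1 and each step back in time
    multiplies the weight by `decay` (an assumed decay schedule).
    """
    return [decay ** (n_examples - 1 - i) for i in range(n_examples)]

def weighted_majority(labels, weights):
    """Return the label with the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# A stream whose concept drifts from "a" to "b" near the end.
stream = ["a"] * 6 + ["b"] * 3
recent_view = weighted_majority(stream, forgetting_weights(len(stream), decay=0.5))
uniform_view = weighted_majority(stream, forgetting_weights(len(stream), decay=1.0))
print(recent_view, uniform_view)
```

With strong decay the three recent "b" examples outweigh the six older "a" examples, while uniform weights (decay of 1.0) still favour the old concept.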

Relevance:

80.00%

Publisher:

Abstract:

In this paper, a new double-wavelet neuron architecture, obtained by modifying the standard wavelet neuron, and its learning algorithm are proposed. The proposed architecture improves the approximation properties of the wavelet neuron. The double-wavelet neuron and its learning algorithm are examined for forecasting non-stationary chaotic time series.

Relevance:

80.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 62P99, 68T50

Relevance:

80.00%

Publisher:

Abstract:

The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content.
The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques has been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images; (2) in addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection.
Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model.

Relevance:

80.00%

Publisher:

Abstract:

This research establishes new optimization methods for pattern recognition and classification of different white blood cells in actual patient data to enhance the process of diagnosis. Beckman-Coulter Corporation supplied flow cytometry data from numerous patients, which are used as training sets to exploit the different physiological characteristics of the samples provided. Support Vector Machines (SVM) and Artificial Neural Networks (ANN) were used as promising pattern classification techniques to identify different white blood cell samples and provide information to medical doctors in the form of diagnostic references for the specific disease state, leukemia. The results show that when a neural network classifier is well configured and trained with cross-validation, it can perform better than support vector classifiers alone for this type of data. Furthermore, a new unsupervised learning algorithm, the Density-based Adaptive Window Clustering (DAWC) algorithm, was designed to process large volumes of data and find the location of high-density data clusters in real time. It reduces the computational load to ∼O(N) computations, making the algorithm more attractive and faster than current hierarchical algorithms.

Relevance:

80.00%

Publisher:

Abstract:

Today, smartphones have revolutionized the wireless communication industry, moving it towards an era of mobile data. To cater for the ever-increasing data traffic demand, it is of utmost importance to have more spectrum resources, and sharing under-utilized spectrum bands is an effective solution. In particular, the 4G broadband Long Term Evolution (LTE) technology and its foreseen 5G successor will benefit immensely if their operation can be extended to the under-utilized unlicensed spectrum. In this thesis, we first analyze WiFi 802.11n and LTE coexistence performance in the unlicensed spectrum, considering multi-layer cell layouts through system-level simulations. We consider a time division duplexing (TDD)-LTE system with an FTP traffic model for performance evaluation. Simulation results show that WiFi performance is more vulnerable to LTE interference, while LTE performance is degraded only slightly. Based on the initial findings, we propose a Q-Learning based dynamic duty cycle selection technique for configuring LTE transmission gaps, so that a satisfactory throughput is maintained for both LTE and WiFi systems. Simulation results show that the proposed approach can enhance the overall capacity performance by 19% and WiFi capacity performance by 77%, hence enabling effective coexistence of LTE and WiFi systems in the unlicensed band.
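To make the Q-learning ingredient concrete, the sketch below shows the standard tabular Q-learning update with epsilon-greedy action selection, applied to choosing a duty cycle. It is not the thesis's method or simulator: the candidate duty cycles, the two-level "WiFi load" state, and the toy reward (which simply penalizes distance from a hypothetical preferred duty cycle per state) are all assumptions standing in for a throughput-based reward.

```python
import random

DUTY_CYCLES = [0.2, 0.4, 0.6, 0.8]   # hypothetical LTE "on" fractions
TARGET_INDEX = [2, 1]                # hypothetical best action per load state

def toy_reward(state, action):
    """Stand-in for a throughput measure: zero at the preferred duty
    cycle for this state, negative elsewhere."""
    return -abs(DUTY_CYCLES[action] - DUTY_CYCLES[TARGET_INDEX[state]])

def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the Q-values of one state."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return max(range(len(q[state])), key=lambda a: q[state][a])

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning update:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])

def train(n_steps=2000, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(DUTY_CYCLES) for _ in TARGET_INDEX]
    state = 0
    for _ in range(n_steps):
        action = choose_action(q, state, epsilon, rng)
        reward = toy_reward(state, action)
        next_state = rng.randrange(len(TARGET_INDEX))  # load changes at random
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q

q_table = train()
policy = [max(range(len(row)), key=lambda a: row[a]) for row in q_table]
print([DUTY_CYCLES[a] for a in policy])
```

In a real coexistence study the reward would come from measured or simulated LTE and WiFi throughput, and the state would encode whatever the controller can observe; only the update rule and the exploration scheme carry over from this sketch.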

Relevance:

80.00%

Publisher:

Abstract:

Optimization techniques known as metaheuristics have solved well-known problems satisfactorily, but developing a metaheuristic involves choosing parameters for its execution, and an appropriate choice of these parameter values matters. Where parameter tuning is essential, the parameters are tested until viable results are obtained, a task normally carried out by the developer implementing the metaheuristic. The quality of the results on one test instance does not carry over to other instances, and the feedback may require a slow process of "trial and error" in which the algorithm has to be adjusted for a specific application. In this context, Reactive Search emerged, advocating the integration of machine learning into heuristic search to solve complex optimization problems. From the integration that Reactive Search proposes between machine learning and metaheuristics came the idea of using Reinforcement Learning, more specifically the Q-learning algorithm, in a reactive manner to select which local search is the most suitable at a given moment of the search to succeed another local search that can no longer improve the current solution in the VNS metaheuristic. Thus, in this work we propose a reactive implementation that uses reinforcement learning for the self-tuning of the implemented algorithm, applied to the symmetric traveling salesman problem and to the problem of scheduling rigs for well maintenance.

Relevance:

80.00%

Publisher:

Abstract:

Social media classification problems have drawn increasing attention in the past few years. With the rapid development of the Internet and the popularity of computers, there is an astronomical amount of information on social media platforms. The datasets are generally large-scale and are often corrupted by noise. The presence of noise in the training set has a strong impact on the performance of supervised learning (classification) techniques. A budget-driven One-class SVM approach is presented in this thesis that is suitable for large-scale social media data classification. Our approach is based on an existing online One-class SVM learning algorithm, referred to as the STOCS (Self-Tuning One-Class SVM) algorithm. To justify our choice, we first analyze the noise resilience of STOCS using synthetic data. The experiments suggest that STOCS is more robust against label noise than several other existing approaches. Next, to handle the big data classification problem for social media data, we introduce several budget-driven features, which allow the algorithm to be trained within limited time and under limited memory. In addition, the resulting algorithm can be easily adapted to changes in dynamic data with minimal computational cost. Compared with two state-of-the-art approaches, Lib-Linear and kNN, our approach is shown to be competitive with lower memory and time requirements.