974 resultados para Gradient descent algorithms


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Developing an efficient and accurate hydrologic forecasting model is crucial to managing water resources and flooding issues. In this study, response surface (RS) models including multiple linear regression (MLR), quadratic response surface (QRS), and nonlinear response surface (NRS) were applied to daily runoff (e.g., discharge and water level) prediction. Two catchments, one in southeast China and the other in western Canada, were used to demonstrate the applicability of the proposed models. Their performances were compared with artificial neural network (ANN) models, trained with the learning algorithms of the gradient descent with adaptive learning rate (ANN-GDA) and Levenberg-Marquardt (ANN-LM). The performances of both RS and ANN in relation to the lags used in the input data, the length of the training samples, long-term (monthly and yearly) predictions, and peak value predictions were also analyzed. The results indicate that the QRS and NRS were able to obtain equally good performance in runoff prediction, as compared with ANN-GDA and ANN-LM, but require lower computational efforts. The RS models bring practical benefits in their application to hydrologic forecasting, particularly in the cases of short-term flood forecasting (e.g., hourly) due to fast training capability, and could be considered as an alternative to ANN

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Evolutionary algorithms perform optimization using a population of sample solution points. An interesting development has been to view population-based optimization as the process of evolving an explicit, probabilistic model of the search space. This paper investigates a formal basis for continuous, population-based optimization in terms of a stochastic gradient descent on the Kullback-Leibler divergence between the model probability density and the objective function, represented as an unknown density of assumed form. This leads to an update rule that is related and compared with previous theoretical work, a continuous version of the population-based incremental learning algorithm, and the generalized mean shift clustering framework. Experimental results are presented that demonstrate the dynamics of the new algorithm on a set of simple test problems.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We present an analytic solution to the problem of on-line gradient-descent learning for two-layer neural networks with an arbitrary number of hidden units in both teacher and student networks. The technique, demonstrated here for the case of adaptive input-to-hidden weights, becomes exact as the dimensionality of the input space increases.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We present a framework for calculating globally optimal parameters, within a given time frame, for on-line learning in multilayer neural networks. We demonstrate the capability of this method by computing optimal learning rates in typical learning scenarios. A similar treatment allows one to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule as well as to compare different training methods.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

An adaptive back-propagation algorithm parameterized by an inverse temperature 1/T is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, we analyse these learning algorithms in both the symmetric and the convergence phase for finite learning rates in the case of uncorrelated teachers of similar but arbitrary length T. These analyses show that adaptive back-propagation results generally in faster training by breaking the symmetry between hidden units more efficiently and by providing faster convergence to optimal generalization than gradient descent.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors as well as to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We analyse the dynamics of a number of second order on-line learning algorithms training multi-layer neural networks, using the methods of statistical mechanics. We first consider on-line Newton's method, which is known to provide optimal asymptotic performance. We determine the asymptotic generalization error decay for a soft committee machine, which is shown to compare favourably with the result for standard gradient descent. Matrix momentum provides a practical approximation to this method by allowing an efficient inversion of the Hessian. We consider an idealized matrix momentum algorithm which requires access to the Hessian and find close correspondence with the dynamics of on-line Newton's method. In practice, the Hessian will not be known on-line and we therefore consider matrix momentum using a single example approximation to the Hessian. In this case good asymptotic performance may still be achieved, but the algorithm is now sensitive to parameter choice because of noise in the Hessian estimate. On-line Newton's method is not appropriate during the transient learning phase, since a suboptimal unstable fixed point of the gradient descent dynamics becomes stable for this algorithm. A principled alternative is to use Amari's natural gradient learning algorithm and we show how this method provides a significant reduction in learning time when compared to gradient descent, while retaining the asymptotic performance of on-line Newton's method.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We analyse natural gradient learning in a two-layer feed-forward neural network using a statistical mechanics framework which is appropriate for large input dimension. We find significant improvement over standard gradient descent in both the transient and asymptotic phases of learning.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Natural gradient learning is an efficient and principled method for improving on-line learning. In practical applications there will be an increased cost required in estimating and inverting the Fisher information matrix. We propose to use the matrix momentum algorithm in order to carry out efficient inversion and study the efficacy of a single step estimation of the Fisher information matrix. We analyse the proposed algorithm in a two-layer network, using a statistical mechanics framework which allows us to describe analytically the learning dynamics, and compare performance with true natural gradient learning and standard gradient descent.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Motivated by environmental protection concerns, monitoring the flue gas of thermal power plant is now often mandatory due to the need to ensure that emission levels stay within safe limits. Optical based gas sensing systems are increasingly employed for this purpose, with regression techniques used to relate gas optical absorption spectra to the concentrations of specific gas components of interest (NOx, SO2 etc.). Accurately predicting gas concentrations from absorption spectra remains a challenging problem due to the presence of nonlinearities in the relationships and the high-dimensional and correlated nature of the spectral data. This article proposes a generalized fuzzy linguistic model (GFLM) to address this challenge. The GFLM is made up of a series of “If-Then” fuzzy rules. The absorption spectra are input variables in the rule antecedent. The rule consequent is a general nonlinear polynomial function of the absorption spectra. Model parameters are estimated using least squares and gradient descent optimization algorithms. The performance of GFLM is compared with other traditional prediction models, such as partial least squares, support vector machines, multilayer perceptron neural networks and radial basis function networks, for two real flue gas spectral datasets: one from a coal-fired power plant and one from a gas-fired power plant. The experimental results show that the generalized fuzzy linguistic model has good predictive ability, and is competitive with alternative approaches, while having the added advantage of providing an interpretable model.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Motivated by environmental protection concerns, monitoring the flue gas of thermal power plant is now often mandatory due to the need to ensure that emission levels stay within safe limits. Optical based gas sensing systems are increasingly employed for this purpose, with regression techniques used to relate gas optical absorption spectra to the concentrations of specific gas components of interest (NOx, SO2 etc.). Accurately predicting gas concentrations from absorption spectra remains a challenging problem due to the presence of nonlinearities in the relationships and the high-dimensional and correlated nature of the spectral data. This article proposes a generalized fuzzy linguistic model (GFLM) to address this challenge. The GFLM is made up of a series of “If-Then” fuzzy rules. The absorption spectra are input variables in the rule antecedent. The rule consequent is a general nonlinear polynomial function of the absorption spectra. Model parameters are estimated using least squares and gradient descent optimization algorithms. The performance of GFLM is compared with other traditional prediction models, such as partial least squares, support vector machines, multilayer perceptron neural networks and radial basis function networks, for two real flue gas spectral datasets: one from a coal-fired power plant and one from a gas-fired power plant. The experimental results show that the generalized fuzzy linguistic model has good predictive ability, and is competitive with alternative approaches, while having the added advantage of providing an interpretable model.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Les réseaux de capteurs sont formés d’un ensemble de dispositifs capables de prendre individuellement des mesures d’un environnement particulier et d’échanger de l’information afin d’obtenir une représentation de haut niveau sur les activités en cours dans la zone d’intérêt. Une telle détection distribuée, avec de nombreux appareils situés à proximité des phénomènes d’intérêt, est pertinente dans des domaines tels que la surveillance, l’agriculture, l’observation environnementale, la surveillance industrielle, etc. Nous proposons dans cette thèse plusieurs approches pour effectuer l’optimisation des opérations spatio-temporelles de ces dispositifs, en déterminant où les placer dans l’environnement et comment les contrôler au fil du temps afin de détecter les cibles mobiles d’intérêt. La première nouveauté consiste en un modèle de détection réaliste représentant la couverture d’un réseau de capteurs dans son environnement. Nous proposons pour cela un modèle 3D probabiliste de la capacité de détection d’un capteur sur ses abords. Ce modèle inègre également de l’information sur l’environnement grâce à l’évaluation de la visibilité selon le champ de vision. À partir de ce modèle de détection, l’optimisation spatiale est effectuée par la recherche du meilleur emplacement et l’orientation de chaque capteur du réseau. Pour ce faire, nous proposons un nouvel algorithme basé sur la descente du gradient qui a été favorablement comparée avec d’autres méthodes génériques d’optimisation «boites noires» sous l’aspect de la couverture du terrain, tout en étant plus efficace en terme de calculs. Une fois que les capteurs placés dans l’environnement, l’optimisation temporelle consiste à bien couvrir un groupe de cibles mobiles dans l’environnement. D’abord, on effectue la prédiction de la position future des cibles mobiles détectées par les capteurs. La prédiction se fait soit à l’aide de l’historique des autres cibles qui ont traversé le même environnement (prédiction à long terme), ou seulement en utilisant les déplacements précédents de la même cible (prédiction à court terme). Nous proposons de nouveaux algorithmes dans chaque catégorie qui performent mieux ou produits des résultats comparables par rapport aux méthodes existantes. Une fois que les futurs emplacements de cibles sont prédits, les paramètres des capteurs sont optimisés afin que les cibles soient correctement couvertes pendant un certain temps, selon les prédictions. À cet effet, nous proposons une méthode heuristique pour faire un contrôle de capteurs, qui se base sur les prévisions probabilistes de trajectoire des cibles et également sur la couverture probabiliste des capteurs des cibles. Et pour terminer, les méthodes d’optimisation spatiales et temporelles proposées ont été intégrées et appliquées avec succès, ce qui démontre une approche complète et efficace pour l’optimisation spatio-temporelle des réseaux de capteurs.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This thesis investigates the problem of robot navigation using only landmark bearings. The proposed system allows a robot to move to a ground target location specified by the sensor values observed at this ground target posi- tion. The control actions are computed based on the difference between the current landmark bearings and the target landmark bearings. No Cartesian coordinates with respect to the ground are computed by the control system. The robot navigates using solely information from the bearing sensor space. Most existing robot navigation systems require a ground frame (2D Cartesian coordinate system) in order to navigate from a ground point A to a ground point B. The commonly used sensors such as laser range scanner, sonar, infrared, and vision do not directly provide the 2D ground coordi- nates of the robot. The existing systems use the sensor measurements to localise the robot with respect to a map, a set of 2D coordinates of the objects of interest. It is more natural to navigate between the points in the sensor space corresponding to A and B without requiring the Cartesian map and the localisation process. Research on animals has revealed how insects are able to exploit very limited computational and memory resources to successfully navigate to a desired destination without computing Cartesian positions. For example, a honeybee balances the left and right optical flows to navigate in a nar- row corridor. Unlike many other ants, Cataglyphis bicolor does not secrete pheromone trails in order to find its way home but instead uses the sun as a compass to keep track of its home direction vector. The home vector can be inaccurate, so the ant also uses landmark recognition. More precisely, it takes snapshots and compass headings of some landmarks. To return home, the ant tries to line up the landmarks exactly as they were before it started wandering. This thesis introduces a navigation method based on reflex actions in sensor space. The sensor vector is made of the bearings of some landmarks, and the reflex action is a gradient descent with respect to the distance in sensor space between the current sensor vector and the target sensor vec- tor. Our theoretical analysis shows that except for some fully characterized pathological cases, any point is reachable from any other point by reflex action in the bearing sensor space provided the environment contains three landmarks and is free of obstacles. The trajectories of a robot using reflex navigation, like other image- based visual control strategies, do not correspond necessarily to the shortest paths on the ground, because the sensor error is minimized, not the moving distance on the ground. However, we show that the use of a sequence of waypoints in sensor space can address this problem. In order to identify relevant waypoints, we train a Self Organising Map (SOM) from a set of observations uniformly distributed with respect to the ground. This SOM provides a sense of location to the robot, and allows a form of path planning in sensor space. The navigation proposed system is analysed theoretically, and evaluated both in simulation and with experiments on a real robot.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The selection criteria for contractor pre-qualification are characterized by the co-existence of both quantitative and qualitative data. The qualitative data is non-linear, uncertain and imprecise. An ideal decision support system for contractor pre-qualification should have the ability of handling both quantitative and qualitative data, and of mapping the complicated nonlinear relationship of the selection criteria, such that rational and consistent decisions can be made. In this research paper, an artificial neural network model was developed to assist public clients identifying suitable contractors for tendering. The pre-qualification criteria (variables) were identified for the model. One hundred and twelve real pre-qualification cases were collected from civil engineering projects in Hong Kong, and eighty-eight hypothetical pre-qualification cases were also generated according to the “If-then” rules used by professionals in the pre-qualification process. The results of the analysis totally comply with current practice (public developers in Hong Kong). Each pre-qualification case consisted of input ratings for candidate contractors’ attributes and their corresponding pre-qualification decisions. The training of the neural network model was accomplished by using the developed program, in which a conjugate gradient descent algorithm was incorporated for improving the learning performance of the network. Cross-validation was applied to estimate the generalization errors based on the “re-sampling” of training pairs. The case studies show that the artificial neural network model is suitable for mapping the complicated nonlinear relationship between contractors’ attributes and their corresponding pre-qualification (disqualification) decisions. The artificial neural network model can be concluded as an ideal alternative for performing the contractor pre-qualification task.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper introduces a novel technique to directly optimise the Figure of Merit (FOM) for phonetic spoken term detection. The FOM is a popular measure of sTD accuracy, making it an ideal candiate for use as an objective function. A simple linear model is introduced to transform the phone log-posterior probabilities output by a phe classifier to produce enhanced log-posterior features that are more suitable for the STD task. Direct optimisation of the FOM is then performed by training the parameters of this model using a non-linear gradient descent algorithm. Substantial FOM improvements of 11% relative are achieved on held-out evaluation data, demonstrating the generalisability of the approach.