84 resultados para bidirectional associative memory neural networks


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Online model order complexity estimation remains one of the key problems in neural network research. The problem is further exacerbated in situations where the underlying system generator is non-stationary. In this paper, we introduce a novelty criterion for resource allocating networks (RANs) which is capable of being applied to both stationary and slowly varying non-stationary problems. The deficiencies of existing novelty criteria are discussed and the relative performances are demonstrated on two real-world problems : electricity load forecasting and exchange rate prediction.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

For neural networks with a wide class of weight-priors, it can be shown that in the limit of an infinite number of hidden units the prior over functions tends to a Gaussian process. In this paper analytic forms are derived for the covariance function of the Gaussian processes corresponding to networks with sigmoidal and Gaussian hidden units. This allows predictions to be made efficiently using networks with an infinite number of hidden units, and shows that, somewhat paradoxically, it may be easier to compute with infinite networks than finite ones.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Mixture Density Networks (MDNs) are a well-established method for modelling the conditional probability density which is useful for complex multi-valued functions where regression methods (such as MLPs) fail. In this paper we extend earlier research of a regularisation method for a special case of MDNs to the general case using evidence based regularisation and we show how the Hessian of the MDN error function can be evaluated using R-propagation. The method is tested on two data sets and compared with early stopping.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper reports the initial results of a joint research project carried out by Aston University and Lloyd's Register to develop a practical method of assessing neural network applications. A set of assessment guidelines for neural network applications were developed and tested on two applications. These case studies showed that it is practical to assess neural networks in a statistical pattern recognition framework. However there is need for more standardisation in neural network technology and a wider takeup of good development practice amongst the neural network community.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An adaptive back-propagation algorithm parameterized by an inverse temperature 1/T is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, we analyse these learning algorithms in both the symmetric and the convergence phase for finite learning rates in the case of uncorrelated teachers of similar but arbitrary length T. These analyses show that adaptive back-propagation results generally in faster training by breaking the symmetry between hidden units more efficiently and by providing faster convergence to optimal generalization than gradient descent.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Radial Basis Function networks with linear outputs are often used in regression problems because they can be substantially faster to train than Multi-layer Perceptrons. For classification problems, the use of linear outputs is less appropriate as the outputs are not guaranteed to represent probabilities. In this paper we show how RBFs with logistic and softmax outputs can be trained efficiently using algorithms derived from Generalised Linear Models. This approach is compared with standard non-linear optimisation algorithms on a number of datasets.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Mixture Density Networks are a principled method to model conditional probability density functions which are non-Gaussian. This is achieved by modelling the conditional distribution for each pattern with a Gaussian Mixture Model for which the parameters are generated by a neural network. This thesis presents a novel method to introduce regularisation in this context for the special case where the mean and variance of the spherical Gaussian Kernels in the mixtures are fixed to predetermined values. Guidelines for how these parameters can be initialised are given, and it is shown how to apply the evidence framework to mixture density networks to achieve regularisation. This also provides an objective stopping criteria that can replace the `early stopping' methods that have previously been used. If the neural network used is an RBF network with fixed centres this opens up new opportunities for improved initialisation of the network weights, which are exploited to start training relatively close to the optimum. The new method is demonstrated on two data sets. The first is a simple synthetic data set while the second is a real life data set, namely satellite scatterometer data used to infer the wind speed and wind direction near the ocean surface. For both data sets the regularisation method performs well in comparison with earlier published results. Ideas on how the constraint on the kernels may be relaxed to allow fully adaptable kernels are presented.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

It is generally assumed when using Bayesian inference methods for neural networks that the input data contains no noise or corruption. For real-world (errors in variable) problems this is clearly an unsafe assumption. This paper presents a Bayesian neural network framework which allows for input noise given that some model of the noise process exists. In the limit where this noise process is small and symmetric it is shown, using the Laplace approximation, that there is an additional term to the usual Bayesian error bar which depends on the variance of the input noise process. Further, by treating the true (noiseless) input as a hidden variable and sampling this jointly with the network's weights, using Markov Chain Monte Carlo methods, it is demonstrated that it is possible to infer the unbiassed regression over the noiseless input.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Using methods of Statistical Physics, we investigate the generalization performance of support vector machines (SVMs), which have been recently introduced as a general alternative to neural networks. For nonlinear classification rules, the generalization error saturates on a plateau, when the number of examples is too small to properly estimate the coefficients of the nonlinear part. When trained on simple rules, we find that SVMs overfit only weakly. The performance of SVMs is strongly enhanced, when the distribution of the inputs has a gap in feature space.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Thouless-Anderson-Palmer (TAP) approach was originally developed for analysing the Sherrington-Kirkpatrick model in the study of spin glass models and has been employed since then mainly in the context of extensively connected systems whereby each dynamical variable interacts weakly with the others. Recently, we extended this method for handling general intensively connected systems where each variable has only O(1) connections characterised by strong couplings. However, the new formulation looks quite different with respect to existing analyses and it is only natural to question whether it actually reproduces known results for systems of extensive connectivity. In this chapter, we apply our formulation of the TAP approach to an extensively connected system, the Hopfield associative memory model, showing that it produces identical results to those obtained by the conventional formulation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we explore the practical use of neural networks for controlling complex non-linear systems. The system used to demonstrate this approach is a simulation of a gas turbine engine typical of those used to power commercial aircraft. The novelty of the work lies in the requirement for multiple controllers which are used to maintain system variables in safe operating regions as well as governing the engine thrust.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Understanding a complex network's structure holds the key to understanding its function. The physics community has contributed a multitude of methods and analyses to this cross-disciplinary endeavor. Structural features exist on both the microscopic level, resulting from differences between single node properties, and the mesoscopic level resulting from properties shared by groups of nodes. Disentangling the determinants of network structure on these different scales has remained a major, and so far unsolved, challenge. Here we show how multiscale generative probabilistic exponential random graph models combined with efficient, distributive message-passing inference techniques can be used to achieve this separation of scales, leading to improved detection accuracy of latent classes as demonstrated on benchmark problems. It sheds new light on the statistical significance of motif-distributions in neural networks and improves the link-prediction accuracy as exemplified for gene-disease associations in the highly consequential Online Mendelian Inheritance in Man database. © 2011 Reichardt et al.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Satellite-borne scatterometers are used to measure backscattered micro-wave radiation from the ocean surface. This data may be used to infer surface wind vectors where no direct measurements exist. Inherent in this data are outliers owing to aberrations on the water surface and measurement errors within the equipment. We present two techniques for identifying outliers using neural networks; the outliers may then be removed to improve models derived from the data. Firstly the generative topographic mapping (GTM) is used to create a probability density model; data with low probability under the model may be classed as outliers. In the second part of the paper, a sensor model with input-dependent noise is used and outliers are identified based on their probability under this model. GTM was successfully modified to incorporate prior knowledge of the shape of the observation manifold; however, GTM could not learn the double skinned nature of the observation manifold. To learn this double skinned manifold necessitated the use of a sensor model which imposes strong constraints on the mapping. The results using GTM with a fixed noise level suggested the noise level may vary as a function of wind speed. This was confirmed by experiments using a sensor model with input-dependent noise, where the variation in noise is most sensitive to the wind speed input. Both models successfully identified gross outliers with the largest differences between models occurring at low wind speeds. © 2003 Elsevier Science Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

It is generally assumed when using Bayesian inference methods for neural networks that the input data contains no noise. For real-world (errors in variable) problems this is clearly an unsafe assumption. This paper presents a Bayesian neural network framework which accounts for input noise provided that a model of the noise process exists. In the limit where the noise process is small and symmetric it is shown, using the Laplace approximation, that this method adds an extra term to the usual Bayesian error bar which depends on the variance of the input noise process. Further, by treating the true (noiseless) input as a hidden variable, and sampling this jointly with the network’s weights, using a Markov chain Monte Carlo method, it is demonstrated that it is possible to infer the regression over the noiseless input. This leads to the possibility of training an accurate model of a system using less accurate, or more uncertain, data. This is demonstrated on both the, synthetic, noisy sine wave problem and a real problem of inferring the forward model for a satellite radar backscatter system used to predict sea surface wind vectors.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis initially presents an 'assay' of the literature pertaining to individual differences in human-computer interaction. A series of experiments is then reported, designed to investigate the association between a variety of individual characteristics and various computer task and interface factors. Predictor variables included age, computer expertise, and psychometric tests of spatial visualisation, spatial memory, logical reasoning, associative memory, and verbal ability. These were studied in relation to a variety of computer-based tacks, including: (1) word processing and its component elements; (ii) the location of target words within passages of text; (iii) the navigation of networks and menus; (iv) command generation using menus and command line interfaces; (v) the search and selection of icons and text labels; (vi) information retrieval. A measure of self-report workload was also included in several of these experiments. The main experimental findings included: (i) an interaction between spatial ability and the manipulation of semantic but not spatial interface content; (ii) verbal ability being only predictive of certain task components of word processing; (iii) age differences in word processing and information retrieval speed but not accuracy; (iv) evidence of compensatory strategies being employed by older subjects; (v) evidence of performance strategy differences which disadvantaged high spatial subjects in conditions of low spatial information content; (vi) interactive effects of associative memory, expertise and command strategy; (vii) an association between logical reasoning and word processing but not information retrieval; (viii) an interaction between expertise and cognitive demand; and (ix) a stronger association between cognitive ability and novice performance than expert performance.