829 resultados para Adaptive Equalization. Neural Networks. Optic Systems. Neural Equalizer
Resumo:
An adaptive back-propagation algorithm is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, both numerical studies and a rigorous analysis show that the adaptive back-propagation method results in faster training by breaking the symmetry between hidden units more efficiently and by providing faster convergence to optimal generalization than gradient descent.
Resumo:
We study the effect of two types of noise, data noise and model noise, in an on-line gradient-descent learning scenario for general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labeled by a two-layer teacher network with an arbitrary number of hidden units. Data is then corrupted by Gaussian noise affecting either the output or the model itself. We examine the effect of both types of noise on the evolution of order parameters and the generalization error in various phases of the learning process.
Resumo:
Neural networks have often been motivated by superficial analogy with biological nervous systems. Recently, however, it has become widely recognised that the effective application of neural networks requires instead a deeper understanding of the theoretical foundations of these models. Insight into neural networks comes from a number of fields including statistical pattern recognition, computational learning theory, statistics, information geometry and statistical mechanics. As an illustration of the importance of understanding the theoretical basis for neural network models, we consider their application to the solution of multi-valued inverse problems. We show how a naive application of the standard least-squares approach can lead to very poor results, and how an appreciation of the underlying statistical goals of the modelling process allows the development of a more general and more powerful formalism which can tackle the problem of multi-modality.
Resumo:
We complement recent advances in thermodynamic limit analyses of mean on-line gradient descent learning dynamics in multi-layer networks by calculating fluctuations possessed by finite dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation as student hidden unit weight vectors begin to imitate specific teacher vectors, increasing with the degree of symmetry of the initial conditions. In light of this, we include a term to stimulate asymmetry in the learning process, which typically also leads to a significant decrease in training time.
Resumo:
In developing neural network techniques for real world applications it is still very rare to see estimates of confidence placed on the neural network predictions. This is a major deficiency, especially in safety-critical systems. In this paper we explore three distinct methods of producing point-wise confidence intervals using neural networks. We compare and contrast Bayesian, Gaussian Process and Predictive error bars evaluated on real data. The problem domain is concerned with the calibration of a real automotive engine management system for both air-fuel ratio determination and on-line ignition timing. This problem requires real-time control and is a good candidate for exploring the use of confidence predictions due to its safety-critical nature.
Resumo:
Introductory accounts of artificial neural networks often rely for motivation on analogies with models of information processing in biological networks. One limitation of such an approach is that it offers little guidance on how to find optimal algorithms, or how to verify the correct performance of neural network systems. A central goal of this paper is to draw attention to a quite different viewpoint in which neural networks are seen as algorithms for statistical pattern recognition based on a principled, i.e. theoretically well-founded, framework. We illustrate the concept of a principled viewpoint by considering a specific issue concerned with the interpretation of the outputs of a trained network. Finally, we discuss the relevance of such an approach to the issue of the validation and verification of neural network systems.
Resumo:
The assessment of the reliability of systems which learn from data is a key issue to investigate thoroughly before the actual application of information processing techniques to real-world problems. Over the recent years Gaussian processes and Bayesian neural networks have come to the fore and in this thesis their generalisation capabilities are analysed from theoretical and empirical perspectives. Upper and lower bounds on the learning curve of Gaussian processes are investigated in order to estimate the amount of data required to guarantee a certain level of generalisation performance. In this thesis we analyse the effects on the bounds and the learning curve induced by the smoothness of stochastic processes described by four different covariance functions. We also explain the early, linearly-decreasing behaviour of the curves and we investigate the asymptotic behaviour of the upper bounds. The effect of the noise and the characteristic lengthscale of the stochastic process on the tightness of the bounds are also discussed. The analysis is supported by several numerical simulations. The generalisation error of a Gaussian process is affected by the dimension of the input vector and may be decreased by input-variable reduction techniques. In conventional approaches to Gaussian process regression, the positive definite matrix estimating the distance between input points is often taken diagonal. In this thesis we show that a general distance matrix is able to estimate the effective dimensionality of the regression problem as well as to discover the linear transformation from the manifest variables to the hidden-feature space, with a significant reduction of the input dimension. Numerical simulations confirm the significant superiority of the general distance matrix with respect to the diagonal one.In the thesis we also present an empirical investigation of the generalisation errors of neural networks trained by two Bayesian algorithms, the Markov Chain Monte Carlo method and the evidence framework; the neural networks have been trained on the task of labelling segmented outdoor images.
Resumo:
The scaling problems which afflict attempts to optimise neural networks (NNs) with genetic algorithms (GAs) are disclosed. A novel GA-NN hybrid is introduced, based on the bumptree, a little-used connectionist model. As well as being computationally efficient, the bumptree is shown to be more amenable to genetic coding lthan other NN models. A hierarchical genetic coding scheme is developed for the bumptree and shown to have low redundancy, as well as being complete and closed with respect to the search space. When applied to optimising bumptree architectures for classification problems the GA discovers bumptrees which significantly out-perform those constructed using a standard algorithm. The fields of artificial life, control and robotics are identified as likely application areas for the evolutionary optimisation of NNs. An artificial life case-study is presented and discussed. Experiments are reported which show that the GA-bumptree is able to learn simulated pole balancing and car parking tasks using only limited environmental feedback. A simple modification of the fitness function allows the GA-bumptree to learn mappings which are multi-modal, such as robot arm inverse kinematics. The dynamics of the 'geographic speciation' selection model used by the GA-bumptree are investigated empirically and the convergence profile is introduced as an analytical tool. The relationships between the rate of genetic convergence and the phenomena of speciation, genetic drift and punctuated equilibrium arc discussed. The importance of genetic linkage to GA design is discussed and two new recombination operators arc introduced. The first, linkage mapped crossover (LMX) is shown to be a generalisation of existing crossover operators. LMX provides a new framework for incorporating prior knowledge into GAs.Its adaptive form, ALMX, is shown to be able to infer linkage relationships automatically during genetic search.
Resumo:
This thesis presents a thorough and principled investigation into the application of artificial neural networks to the biological monitoring of freshwater. It contains original ideas on the classification and interpretation of benthic macroinvertebrates, and aims to demonstrate their superiority over the biotic systems currently used in the UK to report river water quality. The conceptual basis of a new biological classification system is described, and a full review and analysis of a number of river data sets is presented. The biological classification is compared to the common biotic systems using data from the Upper Trent catchment. This data contained 292 expertly classified invertebrate samples identified to mixed taxonomic levels. The neural network experimental work concentrates on the classification of the invertebrate samples into biological class, where only a subset of the sample is used to form the classification. Other experimentation is conducted into the identification of novel input samples, the classification of samples from different biotopes and the use of prior information in the neural network models. The biological classification is shown to provide an intuitive interpretation of a graphical representation, generated without reference to the class labels, of the Upper Trent data. The selection of key indicator taxa is considered using three different approaches; one novel, one from information theory and one from classical statistical methods. Good indicators of quality class based on these analyses are found to be in good agreement with those chosen by a domain expert. The change in information associated with different levels of identification and enumeration of taxa is quantified. The feasibility of using neural network classifiers and predictors to develop numeric criteria for the biological assessment of sediment contamination in the Great Lakes is also investigated.
Resumo:
Fibre Bragg grating sensors are usually expensive to interrogate, and part of this thesis describes a low cost interrogation system for a group of such devices which can be indefinitely scaled up for larger numbers of sensors without requiring an increasingly broadband light source. It incorporates inherent temperature correction and also uses fewer photodiodes than the number or sensors it interrogates, using neural networks to interpret the photodiode data. A novel sensing arrangement using an FBG grating encapsulated in a silicone polymer is presented. This sensor is capable of distinguishing between different surface profiles with ridges 0.5 to 1mm deep and 2mm pitch and either triangular, semicircular or square in profile. Early experiments using neural networks to distinguish between these profiles are also presented. The potential applications for tactile sensing systems incorporating fibre Bragg gratings and neural networks are explored.
Resumo:
Improving bit error rates in optical communication systems is a difficult and important problem. The error correction must take place at high speed and be extremely accurate. We show the feasibility of using hardware implementable machine learning techniques. This may enable some error correction at the speed required.
Resumo:
Optical data communication systems are prone to a variety of processes that modify the transmitted signal, and contribute errors in the determination of 1s from 0s. This is a difficult, and commercially important, problem to solve. Errors must be detected and corrected at high speed, and the classifier must be very accurate; ideally it should also be tunable to the characteristics of individual communication links. We show that simple single layer neural networks may be used to address these problems, and examine how different input representations affect the accuracy of bit error correction. Our results lead us to conclude that a system based on these principles can perform at least as well as an existing non-trainable error correction system, whilst being tunable to suit the individual characteristics of different communication links.
Resumo:
Optical data communication systems are prone to a variety of processes that modify the transmitted signal, and contribute errors in the determination of 1s from 0s. This is a difficult, and commercially important, problem to solve. Errors must be detected and corrected at high speed, and the classifier must be very accurate; ideally it should also be tunable to the characteristics of individual communication links. We show that simple single layer neural networks may be used to address these problems, and examine how different input representations affect the accuracy of bit error correction. Our results lead us to conclude that a system based on these principles can perform at least as well as an existing non-trainable error correction system, whilst being tunable to suit the individual characteristics of different communication links.
Resumo:
Improving bit error rates in optical communication systems is a difficult and important problem. The error correction must take place at high speed and be extremely accurate. We show the feasibility of using hardware implementable machine learning techniques. This may enable some error correction at the speed required.
Resumo:
As is well known, the Convergence Theorem for the Recurrent Neural Networks, is based in Lyapunov ́s second method, which states that associated to any one given net state, there always exist a real number, in other words an element of the one dimensional Euclidean Space R, in such a way that when the state of the net changes then its associated real number decreases. In this paper we will introduce the two dimensional Euclidean space R2, as the space associated to the net, and we will define a pair of real numbers ( x, y ) , associated to any one given state of the net. We will prove that when the net change its state, then the product x ⋅ y will decrease. All the states whose projection over the energy field are placed on the same hyperbolic surface, will be considered as points with the same energy level. On the other hand we will prove that if the states are classified attended to their distances to the zero vector, only one pattern in each one of the different classes may be at the same energy level. The retrieving procedure is analyzed trough the projection of the states on that plane. The geometrical properties of the synaptic matrix W may be used for classifying the n-dimensional state- vector space in n classes. A pattern to be recognized is seen as a point belonging to one of these classes, and depending on the class the pattern to be retrieved belongs, different weight parameters are used. The capacity of the net is improved and the spurious states are reduced. In order to clarify and corroborate the theoretical results, together with the formal theory, an application is presented.