876 results for ensembles of artificial neural networks
Abstract:
The ERS-1 satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of radar backscatter generated by small ripples on the ocean surface induced by instantaneous local winds. Operational methods that extract wind vectors from satellite scatterometer data are based on the local inversion of a forward model, mapping scatterometer observations to wind vectors, by the minimisation of a cost function in the scatterometer measurement space. This report uses mixture density networks, a principled method for modelling conditional probability density functions, to model the joint probability distribution of the wind vectors given the satellite scatterometer measurements in a single cell (the 'inverse' problem). The complexity of the mapping and the structure of the conditional probability density function are investigated by varying the number of units in the hidden layer of the multi-layer perceptron and the number of kernels in the Gaussian mixture model of the mixture density network respectively. The optimal model for networks trained per trace has twenty hidden units and four kernels. Further investigation shows that models trained with incidence angle as an input have results comparable to those models trained by trace. A hybrid mixture density network that incorporates geophysical knowledge of the problem confirms other results that the conditional probability distribution is dominantly bimodal. The wind retrieval results improve on previous work at Aston, but do not match other neural network techniques that use spatial information in the inputs, which is to be expected given the ambiguity of the inverse problem. Current work uses the local inverse model for autonomous ambiguity removal in a principled Bayesian framework. Future directions in which these models may be improved are given.
Abstract:
Using methods of Statistical Physics, we investigate the generalization performance of support vector machines (SVMs), which have been recently introduced as a general alternative to neural networks. For nonlinear classification rules, the generalization error saturates on a plateau when the number of examples is too small to properly estimate the coefficients of the nonlinear part. When trained on simple rules, we find that SVMs overfit only weakly. The performance of SVMs is strongly enhanced when the distribution of the inputs has a gap in feature space.
Abstract:
On-line learning is one of the most powerful and commonly used techniques for training large layered networks and has been used successfully in many real-world applications. Traditional analytical methods have been recently complemented by ones from statistical physics and Bayesian statistics. This powerful combination of analytical methods provides more insight and deeper understanding of existing algorithms and leads to novel and principled proposals for their improvement. This book presents a coherent picture of the state-of-the-art in the theoretical analysis of on-line learning. An introduction relates the subject to other developments in neural networks and explains the overall picture. Surveys by leading experts in the field combine new and established material and enable non-experts to learn more about the techniques and methods used. This book, the first in the area, provides a comprehensive view of the subject and will be welcomed by mathematicians, scientists and engineers, whether in industry or academia.
Abstract:
In data visualization, characterizing local geometric properties of non-linear projection manifolds provides the user with valuable additional information that can influence further steps in the data analysis. We take advantage of the smooth character of the Generative Topographic Mapping (GTM) projection manifold and analytically calculate its local directional curvatures. Curvature plots are useful for detecting regions where geometry is distorted, for changing the amount of regularization in non-linear projection manifolds, and for choosing regions of interest when constructing detailed lower-level visualization plots.
Abstract:
A novel approach, based on statistical mechanics, to analyze the typical performance of optimum code-division multiple-access (CDMA) multiuser detectors is reviewed. A 'black-box' view of the basic CDMA channel is introduced, based on which the CDMA multiuser detection problem is regarded as a 'learning-from-examples' problem of the 'binary linear perceptron' in the neural network literature. Adopting a Bayesian framework, analysis of the performance of the optimum CDMA multiuser detectors is reduced to evaluation of the average of the cumulant generating function of a relevant posterior distribution. The average cumulant generating function is evaluated using the replica method, a method developed in spin glass theory, based on a formal analogy with a similar calculation that appears in that part of statistical mechanics.
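To make the 'black-box' channel view concrete, the following is a minimal sketch, not the optimum detector analysed in the abstract. It assumes a chip-synchronous channel in which each of K users sends one bit spread over N chips, with a 1/sqrt(N) normalisation; the function names, the normalisation and the example codes are all illustrative assumptions. The detector shown is a plain matched filter, swapped in because it is the simplest baseline.

```python
import math


def transmit(bits, codes, noise=None):
    """Received chips: y[mu] = (1/sqrt(N)) * sum_k codes[k][mu] * bits[k] + noise[mu].

    The 1/sqrt(N) scaling and the optional additive noise are assumptions.
    """
    n_chips = len(codes[0])
    scale = 1.0 / math.sqrt(n_chips)
    received = []
    for mu in range(n_chips):
        signal = scale * sum(code[mu] * b for code, b in zip(codes, bits))
        received.append(signal + (noise[mu] if noise else 0.0))
    return received


def matched_filter(received, codes):
    """Baseline detector: correlate with each user's code and take the sign."""
    estimates = []
    for code in codes:
        correlation = sum(c * y for c, y in zip(code, received))
        estimates.append(1 if correlation >= 0 else -1)
    return estimates


# Hypothetical example: K = 3 users, N = 4 chips, orthogonal (Hadamard) codes.
codes = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
]
bits = [1, -1, -1]
received = transmit(bits, codes)        # noise-free channel
decoded = matched_filter(received, codes)
```

With orthogonal codes and no noise the matched filter recovers every bit. Random codes and noise introduce multiuser interference, which is the regime where the choice of detector starts to matter.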
Abstract:
Mixture Density Networks are a principled method to model conditional probability density functions which are non-Gaussian. This is achieved by modelling the conditional distribution for each pattern with a Gaussian Mixture Model for which the parameters are generated by a neural network. This thesis presents a novel method to introduce regularisation in this context for the special case where the mean and variance of the spherical Gaussian kernels in the mixtures are fixed to predetermined values. Guidelines for how these parameters can be initialised are given, and it is shown how to apply the evidence framework to mixture density networks to achieve regularisation. This also provides an objective stopping criterion that can replace the 'early stopping' methods that have previously been used. If the neural network used is an RBF network with fixed centres, this opens up new opportunities for improved initialisation of the network weights, which are exploited to start training relatively close to the optimum. The new method is demonstrated on two data sets. The first is a simple synthetic data set, while the second is a real-life data set, namely satellite scatterometer data used to infer the wind speed and wind direction near the ocean surface. For both data sets the regularisation method performs well in comparison with earlier published results. Ideas on how the constraint on the kernels may be relaxed to allow fully adaptable kernels are presented.
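The fixed-kernel special case can be sketched in a few lines. The sketch assumes the network outputs one unnormalised score per kernel and that a softmax turns these scores into mixing coefficients, which is the usual mixture density network convention and not something the abstract states. The function names and example values are hypothetical.

```python
import math


def softmax(logits):
    """Map unnormalised network outputs to mixing coefficients that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def spherical_gaussian(t, centre, sigma):
    """Density of an isotropic Gaussian with the given centre and width at t."""
    dim = len(t)
    sq_dist = sum((ti - ci) ** 2 for ti, ci in zip(t, centre))
    norm = (2.0 * math.pi * sigma ** 2) ** (dim / 2.0)
    return math.exp(-sq_dist / (2.0 * sigma ** 2)) / norm


def conditional_density(t, logits, centres, sigmas):
    """p(t | x) as a mixture whose centres and widths are fixed in advance.

    Only `logits` depends on the input x (via the network, not shown here).
    """
    weights = softmax(logits)
    return sum(
        w * spherical_gaussian(t, c, s)
        for w, c, s in zip(weights, centres, sigmas)
    )


# Hypothetical one-dimensional example with two fixed kernels.
centres = [[-1.0], [1.0]]
sigmas = [0.5, 0.5]
logits = [0.0, 0.0]  # equal scores give equal mixing coefficients
density_at_zero = conditional_density([0.0], logits, centres, sigmas)
```

Because the centres and widths never change, training only has to adjust the mixing coefficients. This is what makes the output layer close to a linear problem when the network is an RBF network with fixed centres.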
Abstract:
A number of researchers have investigated the application of neural networks to visual recognition, with much of the emphasis placed on exploiting the network's ability to generalise. However, despite the benefits of such an approach, it is not at all obvious how networks can be developed which are capable of recognising objects subject to changes in rotation, translation and viewpoint. In this study, we suggest that a possible solution to this problem can be found by studying aspects of visual psychology and, in particular, perceptual organisation. For example, it appears that grouping together lines based upon perceptually significant features can facilitate viewpoint-independent recognition. The work presented here identifies simple grouping measures based on parallelism and connectivity and shows how it is possible to train multi-layer perceptrons (MLPs) to detect and determine the perceptual significance of any group presented. In this way, it is shown how MLPs which are trained via backpropagation to perform individual grouping tasks can be brought together into a novel, large-scale network capable of determining the perceptual significance of the whole input pattern. Finally, the applicability of such significance values for recognition is investigated, and results indicate that both the MLP and the Kohonen Feature Map can be trained to recognise simple shapes described in terms of perceptual significances. This study has also provided an opportunity to investigate aspects of the backpropagation algorithm, particularly the ability to generalise. In this study we report the results of various generalisation tests. In applying the backpropagation algorithm to certain problems, we found that there was a deficiency in performance with the standard learning algorithm. An improvement in performance could, however, be obtained when suitable modifications were made to the algorithm. The modifications and consequent results are reported here.
Abstract:
The main research theme of this project is the study of neural networks to control uncertain and non-linear control systems. This involves the control of continuous time, discrete time, hybrid and stochastic systems with input, state or output constraints while ensuring good performance. A great part of this project is devoted to the opening of frontiers between several mathematical and engineering approaches in order to tackle complex but very common non-linear control problems. The objectives are: 1. To design and develop procedures for neural network enhanced self-tuning adaptive non-linear control systems; 2. To design, as a general procedure, a neural network generalised minimum variance self-tuning controller for non-linear dynamic plants (integration of neural network mapping with generalised minimum variance self-tuning controller strategies); 3. To develop a software package to evaluate control system performance using Matlab, Simulink and the Neural Network toolbox. An adaptive control algorithm utilising a recurrent network as a model of a partially unknown non-linear plant with unmeasurable state is proposed. Appropriately, it appears that structured recurrent neural networks can provide conveniently parameterised dynamic models for many non-linear systems for use in adaptive control. Properties of static neural networks, which enabled successful design of stable adaptive control in the state feedback case, are also identified. A survey of the existing results is presented which puts them in a systematic framework, showing their relation to classical self-tuning adaptive control and the application of neural control to SISO/MIMO control. Simulation results demonstrate that the self-tuning design methods may be practically applicable to a reasonably large class of unknown linear and non-linear dynamic control systems.
Abstract:
Neuroimaging literature has identified several regions involved in encoding and recognition processes. A review of the literature illustrated considerable variations in the precise location and mechanisms of these processes, and it was these variations that were investigated in the studies in this thesis. Magnetoencephalography (MEG) was used as the neuroimaging tool, and a preliminary study identified Synthetic Aperture Magnetometry (SAM), rather than a traditional dipole fitting technique, as an appropriate tool for identifying the multiple cortical regions involved in recognition memory. It has been suggested that there is hemispheric asymmetry in encoding and recognition processes. There are two main hypotheses: the first suggesting that there is task-specificity, the second that this specificity is determined by stimulus modality. A series of experiments was completed with two main aims: first to produce consistent and complementary recognition memory data with MEG, and second to determine whether there exists any hemispheric asymmetry in recognition memory. The results obtained from five experiments demonstrated activation of prefrontal and middle temporal structures, which were consistent with those reported in previous neuroimaging studies. It was suggested that this diverse activation may be explained by the involvement of a semantic network during recognition memory processes. In support of this, a subsequent study involving a semantic encoding task demonstrated that category-specific differences in cortical activation also existed in the recognition memory phase. Controlling for the involvement of such semantic processes produced predominantly bilateral activation. It was suggested that the apparent hemispheric asymmetry findings reported in the literature may be due to the 'coarse' temporal analysis available with earlier imaging techniques, which over-simplified the networks reported by being unable to recognise the early complex processes associated with semantic processing which these MEG studies were able to identify. The importance of frequency-specific activations, specifically theta synchronisation and alpha desynchronisation, in memory processes was also investigated.
Abstract:
Fibre Bragg grating (FBG) sensors are usually expensive to interrogate, and part of this thesis describes a low-cost interrogation system for a group of such devices which can be indefinitely scaled up for larger numbers of sensors without requiring an increasingly broadband light source. It incorporates inherent temperature correction and also uses fewer photodiodes than the number of sensors it interrogates, using neural networks to interpret the photodiode data. A novel sensing arrangement using an FBG encapsulated in a silicone polymer is presented. This sensor is capable of distinguishing between different surface profiles with ridges 0.5 to 1 mm deep and 2 mm pitch and either triangular, semicircular or square in profile. Early experiments using neural networks to distinguish between these profiles are also presented. The potential applications for tactile sensing systems incorporating fibre Bragg gratings and neural networks are explored.
Abstract:
This paper compares the UK/US exchange rate forecasting performance of linear and nonlinear models based on monetary fundamentals, to a random walk (RW) model. Structural breaks are identified and taken into account. The exchange rate forecasting framework is also used for assessing the relative merits of the official Simple Sum and the weighted Divisia measures of money. Overall, there are four main findings. First, the majority of the models with fundamentals are able to beat the RW model in forecasting the UK/US exchange rate. Second, the most accurate forecasts of the UK/US exchange rate are obtained with a nonlinear model. Third, taking into account structural breaks reveals that the Divisia aggregate performs better than its Simple Sum counterpart. Finally, Divisia-based models provide more accurate forecasts than Simple Sum-based models provided they are constructed within a nonlinear framework.
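The comparison against a random walk can be illustrated with a small sketch. It assumes one-step-ahead forecasts, a driftless random walk and root mean squared error as the accuracy measure; the abstract does not say which horizon or error metric the paper uses. The function names are hypothetical and the numbers are made up for illustration, not taken from the paper.

```python
import math


def rmse(forecasts, actuals):
    """Root mean squared forecast error."""
    squared = [(f - a) ** 2 for f, a in zip(forecasts, actuals)]
    return math.sqrt(sum(squared) / len(squared))


def random_walk_forecasts(series):
    """Driftless random walk: the forecast for period t+1 is the value at t."""
    return series[:-1]


def relative_rmse(model_forecasts, series):
    """RMSE of a model divided by RMSE of the random walk (below 1 beats RW)."""
    actuals = series[1:]
    return rmse(model_forecasts, actuals) / rmse(random_walk_forecasts(series), actuals)


# Illustrative, made-up exchange rate levels and model forecasts for periods 2 to 4.
series = [1.50, 1.52, 1.49, 1.53]
model_forecasts = [1.51, 1.50, 1.52]
ratio = relative_rmse(model_forecasts, series)
```

A ratio below 1 means the candidate model had a smaller error than the random walk on that sample.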
Abstract:
A recent novel approach to the visualisation and analysis of datasets, and one which is particularly applicable to those of a high dimension, is discussed in the context of real applications. A feed-forward neural network is utilised to effect a topographic, structure-preserving, dimension-reducing transformation of the data, with an additional facility to incorporate different degrees of associated subjective information. The properties of this transformation are illustrated on synthetic and real datasets, including the 1992 UK Research Assessment Exercise for funding in higher education. The method is compared and contrasted to established techniques for feature extraction, and related to topographic mappings, the Sammon projection and the statistical field of multidimensional scaling.
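As a point of reference for the comparison with the Sammon projection, the sketch below computes the Sammon stress of a given low-dimensional configuration. It is not the feed-forward network method the abstract describes, only the standard measure of how well inter-point distances are preserved. The function names and example points are hypothetical.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length coordinate lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def sammon_stress(high_dim, low_dim):
    """Sammon stress: sum of (d* - d)^2 / d* over pairs, divided by sum of d*.

    d* are distances in the original space and d those in the projection.
    Pairs with d* == 0 are skipped to avoid division by zero.
    """
    d_high = pairwise_distances(high_dim)
    d_low = pairwise_distances(low_dim)
    n = len(high_dim)
    total = 0.0
    error = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if d_high[i][j] == 0.0:
                continue
            total += d_high[i][j]
            error += (d_high[i][j] - d_low[i][j]) ** 2 / d_high[i][j]
    return error / total


# Hypothetical example: 3-D points lying in the plane z = 0.
high_dim = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
exact_projection = [p[:2] for p in high_dim]      # drop the constant z coordinate
collapsed = [[0.0, 0.0] for _ in high_dim]        # every point mapped to the origin
stress_exact = sammon_stress(high_dim, exact_projection)
stress_collapsed = sammon_stress(high_dim, collapsed)
```

A distance-preserving projection gives a stress of 0, and collapsing every point onto one location gives a stress of 1.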
Abstract:
The paper describes a hybrid pattern recognition method developed by research groups from Russia, Armenia and Spain. The method is based upon logical correction over a set of conventional neural networks. The output matrices of the neural networks are processed according to the potentiality principle, which increases recognition reliability.
Abstract:
This report discusses a special generalization of artificial neural nets, the so-called RFT – FN. The refinement touches upon the constituent elements of the concept of an artificial neural network, namely the choice of the main primary functional elements in the net, the way they are connected (topology), and the structure of the net as a whole. As to the last, the structure of the proposed functional net is determined dynamically, during the construction of the net itself, by a special recurrent procedure. Each recurrent step determines the number of newly joined primary functional elements, the topology of their connections, and the tuning of the primary elements. The procedure terminates when "natural" criteria are fulfilled, relating to residuals for example. The proposed functional net can be used to solve the approximation problem for functions represented by their observations, and for classification and clustering, pattern recognition, etc. The recurrent procedure provides versatile optimization possibilities, both at each step of the procedure and as a whole: through the choice of the newly joined elements and the topology, and through affine transformations of the input and intermediate coordinates as well as their nonlinear coordinate-wise transformations. All considerations are essentially based on, and constructively represented by means of, the Generalized Inverse.
Abstract:
This paper presents an analysis of different techniques, designed to help a researcher determine which classification technique is most appropriate for choosing among the ridge, robust and linear regression methods when predicting outcomes for a specific quasi-stationary process.