780 resultados para Neural Network Algorithm
Resumo:
It is well known that the addition of noise to the input data of a neural network during training can, in some circumstances, lead to significant improvements in generalization performance. Previous work has shown that such training with noise is equivalent to a form of regularization in which an extra term is added to the error function. However, the regularization term, which involves second derivatives of the error function, is not bounded below, and so can lead to difficulties if used directly in a learning algorithm based on error minimization. In this paper we show that, for the purposes of network training, the regularization term can be reduced to a positive definite form which involves only first derivatives of the network mapping. For a sum-of-squares error function, the regularization term belongs to the class of generalized Tikhonov regularizers. Direct minimization of the regularized error function provides a practical alternative to training with noise.
Resumo:
Introductory accounts of artificial neural networks often rely for motivation on analogies with models of information processing in biological networks. One limitation of such an approach is that it offers little guidance on how to find optimal algorithms, or how to verify the correct performance of neural network systems. A central goal of this paper is to draw attention to a quite different viewpoint in which neural networks are seen as algorithms for statistical pattern recognition based on a principled, i.e. theoretically well-founded, framework. We illustrate the concept of a principled viewpoint by considering a specific issue concerned with the interpretation of the outputs of a trained network. Finally, we discuss the relevance of such an approach to the issue of the validation and verification of neural network systems.
Resumo:
Neural networks have often been motivated by superficial analogy with biological nervous systems. Recently, however, it has become widely recognised that the effective application of neural networks requires instead a deeper understanding of the theoretical foundations of these models. Insight into neural networks comes from a number of fields including statistical pattern recognition, computational learning theory, statistics, information geometry and statistical mechanics. As an illustration of the importance of understanding the theoretical basis for neural network models, we consider their application to the solution of multi-valued inverse problems. We show how a naive application of the standard least-squares approach can lead to very poor results, and how an appreciation of the underlying statistical goals of the modelling process allows the development of a more general and more powerful formalism which can tackle the problem of multi-modality.
Resumo:
Deformable models are an attractive approach to recognizing objects which have considerable within-class variability such as handwritten characters. However, there are severe search problems associated with fitting the models to data which could be reduced if a better starting point for the search were available. We show that by training a neural network to predict how a deformable model should be instantiated from an input image, such improved starting points can be obtained. This method has been implemented for a system that recognizes handwritten digits using deformable models, and the results show that the search time can be significantly reduced without compromising recognition performance. © 1997 Academic Press.
Resumo:
It is well known that one of the obstacles to effective forecasting of exchange rates is heteroscedasticity (non-stationary conditional variance). The autoregressive conditional heteroscedastic (ARCH) model and its variants have been used to estimate a time dependent variance for many financial time series. However, such models are essentially linear in form and we can ask whether a non-linear model for variance can improve results just as non-linear models (such as neural networks) for the mean have done. In this paper we consider two neural network models for variance estimation. Mixture Density Networks (Bishop 1994, Nix and Weigend 1994) combine a Multi-Layer Perceptron (MLP) and a mixture model to estimate the conditional data density. They are trained using a maximum likelihood approach. However, it is known that maximum likelihood estimates are biased and lead to a systematic under-estimate of variance. More recently, a Bayesian approach to parameter estimation has been developed (Bishop and Qazaz 1996) that shows promise in removing the maximum likelihood bias. However, up to now, this model has not been used for time series prediction. Here we compare these algorithms with two other models to provide benchmark results: a linear model (from the ARIMA family), and a conventional neural network trained with a sum-of-squares error function (which estimates the conditional mean of the time series with a constant variance noise model). This comparison is carried out on daily exchange rate data for five currencies.
Resumo:
In developing neural network techniques for real world applications it is still very rare to see estimates of confidence placed on the neural network predictions. This is a major deficiency, especially in safety-critical systems. In this paper we explore three distinct methods of producing point-wise confidence intervals using neural networks. We compare and contrast Bayesian, Gaussian Process and Predictive error bars evaluated on real data. The problem domain is concerned with the calibration of a real automotive engine management system for both air-fuel ratio determination and on-line ignition timing. This problem requires real-time control and is a good candidate for exploring the use of confidence predictions due to its safety-critical nature.
Resumo:
The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about km800, carrying a C-band scatterometer. A scatterometer measures the amount of radar back scatter generated by small ripples on the ocean surface induced by instantaneous local winds. Operational methods that extract wind vectors from satellite scatterometer data are based on the local inversion of a forward model, mapping scatterometer observations to wind vectors, by the minimisation of a cost function in the scatterometer measurement space.par This report uses mixture density networks, a principled method for modelling conditional probability density functions, to model the joint probability distribution of the wind vectors given the satellite scatterometer measurements in a single cell (the `inverse' problem). The complexity of the mapping and the structure of the conditional probability density function are investigated by varying the number of units in the hidden layer of the multi-layer perceptron and the number of kernels in the Gaussian mixture model of the mixture density network respectively. The optimal model for networks trained per trace has twenty hidden units and four kernels. Further investigation shows that models trained with incidence angle as an input have results comparable to those models trained by trace. A hybrid mixture density network that incorporates geophysical knowledge of the problem confirms other results that the conditional probability distribution is dominantly bimodal.par The wind retrieval results improve on previous work at Aston, but do not match other neural network techniques that use spatial information in the inputs, which is to be expected given the ambiguity of the inverse problem. Current work uses the local inverse model for autonomous ambiguity removal in a principled Bayesian framework. Future directions in which these models may be improved are given.
Resumo:
A novel approach, based on statistical mechanics, to analyze typical performance of optimum code-division multiple-access (CDMA) multiuser detectors is reviewed. A `black-box' view ot the basic CDMA channel is introduced, based on which the CDMA multiuser detection problem is regarded as a `learning-from-examples' problem of the `binary linear perceptron' in the neural network literature. Adopting Bayes framework, analysis of the performance of the optimum CDMA multiuser detectors is reduced to evaluation of the average of the cumulant generating function of a relevant posterior distribution. The evaluation of the average cumulant generating function is done, based on formal analogy with a similar calculation appearing in the spin glass theory in statistical mechanics, by making use of the replica method, a method developed in the spin glass theory.
Resumo:
This paper presents a general methodology for estimating and incorporating uncertainty in the controller and forward models for noisy nonlinear control problems. Conditional distribution modeling in a neural network context is used to estimate uncertainty around the prediction of neural network outputs. The developed methodology circumvents the dynamic programming problem by using the predicted neural network uncertainty to localize the possible control solutions to consider. A nonlinear multivariable system with different delays between the input-output pairs is used to demonstrate the successful application of the developed control algorithm. The proposed method is suitable for redundant control systems and allows us to model strongly non Gaussian distributions of control signal as well as processes with hysteresis.
Resumo:
We consider the direct adaptive inverse control of nonlinear multivariable systems with different delays between every input-output pair. In direct adaptive inverse control, the inverse mapping is learned from examples of input-output pairs. This makes the obtained controller sub optimal, since the network may have to learn the response of the plant over a larger operational range than necessary. Moreover, in certain applications, the control problem can be redundant, implying that the inverse problem is ill posed. In this paper we propose a new algorithm which allows estimating and exploiting uncertainty in nonlinear multivariable control systems. This approach allows us to model strongly non-Gaussian distribution of control signals as well as processes with hysteresis. The proposed algorithm circumvents the dynamic programming problem by using the predicted neural network uncertainty to localise the possible control solutions to consider.
Resumo:
Introductory accounts of artificial neural networks often rely for motivation on analogies with models of information processing in biological networks. One limitation of such an approach is that it offers little guidance on how to find optimal algorithms, or how to verify the correct performance of neural network systems. A central goal of this paper is to draw attention to a quite different viewpoint in which neural networks are seen as algorithms for statistical pattern recognition based on a principled, i.e. theoretically well-founded, framework. We illustrate the concept of a principled viewpoint by considering a specific issue concerned with the interpretation of the outputs of a trained network. Finally, we discuss the relevance of such an approach to the issue of the validation and verification of neural network systems.
Resumo:
The sudden loss of the plasma magnetic confinement, known as disruption, is one of the major issue in a nuclear fusion machine as JET (Joint European Torus), Disruptions pose very serious problems to the safety of the machine. The energy stored in the plasma is released to the machine structure in few milliseconds resulting in forces that at JET reach several Mega Newtons. The problem is even more severe in the nuclear fusion power station where the forces are in the order of one hundred Mega Newtons. The events that occur during a disruption are still not well understood even if some mechanisms that can lead to a disruption have been identified and can be used to predict them. Unfortunately it is always a combination of these events that generates a disruption and therefore it is not possible to use simple algorithms to predict it. This thesis analyses the possibility of using neural network algorithms to predict plasma disruptions in real time. This involves the determination of plasma parameters every few milliseconds. A plasma boundary reconstruction algorithm, XLOC, has been developed in collaboration with Dr. D. Ollrien and Dr. J. Ellis capable of determining the plasma wall/distance every 2 milliseconds. The XLOC output has been used to develop a multilayer perceptron network to determine plasma parameters as ?i and q? with which a machine operational space has been experimentally defined. If the limits of this operational space are breached the disruption probability increases considerably. Another approach for prediction disruptions is to use neural network classification methods to define the JET operational space. Two methods have been studied. The first method uses a multilayer perceptron network with softmax activation function for the output layer. This method can be used for classifying the input patterns in various classes. In this case the plasma input patterns have been divided between disrupting and safe patterns, giving the possibility of assigning a disruption probability to every plasma input pattern. The second method determines the novelty of an input pattern by calculating the probability density distribution of successful plasma patterns that have been run at JET. The density distribution is represented as a mixture distribution, and its parameters arc determined using the Expectation-Maximisation method. If the dataset, used to determine the distribution parameters, covers sufficiently well the machine operational space. Then, the patterns flagged as novel can be regarded as patterns belonging to a disrupting plasma. Together with these methods, a network has been designed to predict the vertical forces, that a disruption can cause, in order to avoid that too dangerous plasma configurations are run. This network can be run before the pulse using the pre-programmed plasma configuration or on line becoming a tool that allows to stop dangerous plasma configuration. All these methods have been implemented in real time on a dual Pentium Pro based machine. The Disruption Prediction and Prevention System has shown that internal plasma parameters can be determined on-line with a good accuracy. Also the disruption detection algorithms showed promising results considering the fact that JET is an experimental machine where always new plasma configurations are tested trying to improve its performances.
Resumo:
This thesis considers two basic aspects of impact damage in composite materials, namely damage severity discrimination and impact damage location by using Acoustic Emissions (AE) and Artificial Neural Networks (ANNs). The experimental work embodies a study of such factors as the application of AE as Non-destructive Damage Testing (NDT), and the evaluation of ANNs modelling. ANNs, however, played an important role in modelling implementation. In the first aspect of the study, different impact energies were used to produce different level of damage in two composite materials (T300/914 and T800/5245). The impacts were detected by their acoustic emissions (AE). The AE waveform signals were analysed and modelled using a Back Propagation (BP) neural network model. The Mean Square Error (MSE) from the output was then used as a damage indicator in the damage severity discrimination study. To evaluate the ANN model, a comparison was made of the correlation coefficients of different parameters, such as MSE, AE energy, AE counts, etc. MSE produced an outstanding result based on the best performance of correlation. In the second aspect, a new artificial neural network model was developed to provide impact damage location on a quasi-isotropic composite panel. It was successfully trained to locate impact sites by correlating the relationship between arriving time differences of AE signals at transducers located on the panel and the impact site coordinates. The performance of the ANN model, which was evaluated by calculating the distance deviation between model output and real location coordinates, supports the application of ANN as an impact damage location identifier. In the study, the accuracy of location prediction decreased when approaching the central area of the panel. Further investigation indicated that this is due to the small arrival time differences, which defect the performance of ANN prediction. This research suggested increasing the number of processing neurons in the ANNs as a practical solution.
Resumo:
This thesis introduces and develops a novel real-time predictive maintenance system to estimate the machine system parameters using the motion current signature. Recently, motion current signature analysis has been addressed as an alternative to the use of sensors for monitoring internal faults of a motor. A maintenance system based upon the analysis of motion current signature avoids the need for the implementation and maintenance of expensive motion sensing technology. By developing nonlinear dynamical analysis for motion current signature, the research described in this thesis implements a novel real-time predictive maintenance system for current and future manufacturing machine systems. A crucial concept underpinning this project is that the motion current signature contains information relating to the machine system parameters and that this information can be extracted using nonlinear mapping techniques, such as neural networks. Towards this end, a proof of concept procedure is performed, which substantiates this concept. A simulation model, TuneLearn, is developed to simulate the large amount of training data required by the neural network approach. Statistical validation and verification of the model is performed to ascertain confidence in the simulated motion current signature. Validation experiment concludes that, although, the simulation model generates a good macro-dynamical mapping of the motion current signature, it fails to accurately map the micro-dynamical structure due to the lack of knowledge regarding performance of higher order and nonlinear factors, such as backlash and compliance. Failure of the simulation model to determine the micro-dynamical structure suggests the presence of nonlinearity in the motion current signature. This motivated us to perform surrogate data testing for nonlinearity in the motion current signature. Results confirm the presence of nonlinearity in the motion current signature, thereby, motivating the use of nonlinear techniques for further analysis. Outcomes of the experiment show that nonlinear noise reduction combined with the linear reverse algorithm offers precise machine system parameter estimation using the motion current signature for the implementation of the real-time predictive maintenance system. Finally, a linear reverse algorithm, BJEST, is developed and applied to the motion current signature to estimate the machine system parameters.
Resumo:
This thesis presents a thorough and principled investigation into the application of artificial neural networks to the biological monitoring of freshwater. It contains original ideas on the classification and interpretation of benthic macroinvertebrates, and aims to demonstrate their superiority over the biotic systems currently used in the UK to report river water quality. The conceptual basis of a new biological classification system is described, and a full review and analysis of a number of river data sets is presented. The biological classification is compared to the common biotic systems using data from the Upper Trent catchment. This data contained 292 expertly classified invertebrate samples identified to mixed taxonomic levels. The neural network experimental work concentrates on the classification of the invertebrate samples into biological class, where only a subset of the sample is used to form the classification. Other experimentation is conducted into the identification of novel input samples, the classification of samples from different biotopes and the use of prior information in the neural network models. The biological classification is shown to provide an intuitive interpretation of a graphical representation, generated without reference to the class labels, of the Upper Trent data. The selection of key indicator taxa is considered using three different approaches; one novel, one from information theory and one from classical statistical methods. Good indicators of quality class based on these analyses are found to be in good agreement with those chosen by a domain expert. The change in information associated with different levels of identification and enumeration of taxa is quantified. The feasibility of using neural network classifiers and predictors to develop numeric criteria for the biological assessment of sediment contamination in the Great Lakes is also investigated.