880 results for Neural networks training
Abstract:
Using methods of statistical physics, we investigate the generalization performance of support vector machines (SVMs), which have recently been introduced as a general alternative to neural networks. For nonlinear classification rules, the generalization error saturates on a plateau when the number of examples is too small to properly estimate the coefficients of the nonlinear part. When trained on simple rules, SVMs overfit only weakly. The performance of SVMs is strongly enhanced when the distribution of the inputs has a gap in feature space.
Abstract:
In this paper we review recent theoretical approaches for analysing the dynamics of on-line learning in multilayer neural networks using methods adopted from statistical physics. The analysis is based on monitoring a set of macroscopic variables from which the generalisation error can be calculated. A closed set of dynamical equations for the macroscopic variables is derived analytically and solved numerically. The theoretical framework is then employed for defining optimal learning parameters and for analysing the incorporation of second order information into the learning process using natural gradient descent and matrix-momentum based methods. We will also briefly explain an extension of the original framework for analysing the case where training examples are sampled with repetition.
Abstract:
In this paper we explore the practical use of neural networks for controlling complex non-linear systems. The system used to demonstrate this approach is a simulation of a gas turbine engine typical of those used to power commercial aircraft. The novelty of the work lies in the requirement for multiple controllers which are used to maintain system variables in safe operating regions as well as governing the engine thrust.
Abstract:
Conventional feedforward neural networks have used the sum-of-squares cost function for training. A new cost function is presented here with a description length interpretation based on Rissanen's Minimum Description Length principle. It is a heuristic that has a rough interpretation as the number of data points fit by the model. Not concerned with finding optimal descriptions, the cost function prefers to form minimum descriptions in a naive way for computational convenience. The cost function is called the Naive Description Length cost function. Finding minimum description models will be shown to be closely related to the identification of clusters in the data. As a consequence, the minimum of this cost function approximates the most probable mode of the data, whereas the minimum of the sum-of-squares cost function approximates the mean. The new cost function is shown to provide information about the structure of the data. This is done by inspecting the dependence of the error on the amount of regularisation. This structure provides a method of selecting regularisation parameters as an alternative or supplement to Bayesian methods. The new cost function is tested on a number of multi-valued problems, such as a simple inverse kinematics problem. It is also tested on a number of classification and regression problems. The mode-seeking property of this cost function is shown to improve prediction in time series problems. Description length principles are used in a similar fashion to derive a regulariser to control network complexity.
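The contrast between a mean-seeking and a mode-seeking cost can be illustrated with a toy constant model. The sketch below is not the Naive Description Length cost function from the paper; as a stand-in it simply counts the points that a constant `c` fails to fit within a tolerance, which loosely mirrors the "number of data points fit by the model" interpretation. The data, the tolerance `tol`, and the candidate grid are all assumptions chosen for illustration.

```python
def sum_of_squares(c, ys):
    """Classical cost: squared distance from the constant c to every target."""
    return sum((y - c) ** 2 for y in ys)


def count_unfit(c, ys, tol):
    """Stand-in count-based cost: number of targets NOT within tol of c."""
    return sum(1 for y in ys if abs(y - c) > tol)


def argmin(cost, candidates):
    """Return the candidate with the lowest cost (first one on ties)."""
    return min(candidates, key=cost)


# Hypothetical multi-valued data: a dense cluster near 1 and a sparse one near 5.
ys = [0.9, 1.0, 1.1, 1.0, 0.95, 5.0, 5.1]
grid = [i / 100 for i in range(0, 601)]  # candidate constants 0.00 .. 6.00

c_mean = argmin(lambda c: sum_of_squares(c, ys), grid)
c_mode = argmin(lambda c: count_unfit(c, ys, tol=0.2), grid)

print("sum-of-squares minimiser:", c_mean)  # lands between the clusters
print("count-based minimiser:   ", c_mode)  # lands inside the dense cluster
```

On data like this the sum-of-squares minimiser sits between the two clusters, where no data lie, while the count-based minimiser stays inside the denser cluster.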
Abstract:
Understanding a complex network's structure holds the key to understanding its function. The physics community has contributed a multitude of methods and analyses to this cross-disciplinary endeavor. Structural features exist on both the microscopic level, resulting from differences between single node properties, and the mesoscopic level, resulting from properties shared by groups of nodes. Disentangling the determinants of network structure on these different scales has remained a major, and so far unsolved, challenge. Here we show how multiscale generative probabilistic exponential random graph models combined with efficient, distributive message-passing inference techniques can be used to achieve this separation of scales, leading to improved detection accuracy of latent classes as demonstrated on benchmark problems. The approach sheds new light on the statistical significance of motif distributions in neural networks and improves link-prediction accuracy, as exemplified for gene-disease associations in the highly consequential Online Mendelian Inheritance in Man database. © 2011 Reichardt et al.
Abstract:
Satellite-borne scatterometers are used to measure backscattered microwave radiation from the ocean surface. These data may be used to infer surface wind vectors where no direct measurements exist. Inherent in the data are outliers owing to aberrations on the water surface and measurement errors within the equipment. We present two techniques for identifying outliers using neural networks; the outliers may then be removed to improve models derived from the data. In the first part of the paper, the generative topographic mapping (GTM) is used to create a probability density model; data with low probability under the model may be classed as outliers. In the second part, a sensor model with input-dependent noise is used and outliers are identified based on their probability under this model. GTM was successfully modified to incorporate prior knowledge of the shape of the observation manifold; however, GTM could not learn the double-skinned nature of the observation manifold. Learning this double-skinned manifold required a sensor model that imposes strong constraints on the mapping. The results using GTM with a fixed noise level suggested that the noise level may vary as a function of wind speed. This was confirmed by experiments using a sensor model with input-dependent noise, where the variation in noise is most sensitive to the wind speed input. Both models successfully identified gross outliers, with the largest differences between models occurring at low wind speeds. © 2003 Elsevier Science Ltd. All rights reserved.
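The idea of classing low-probability data as outliers can be sketched in a few lines. The example below does not implement GTM or the sensor model; it swaps in a single one-dimensional Gaussian as the density model purely to show the thresholding step. The readings and the log-density threshold are made-up values, and the function names are hypothetical.

```python
import math
import statistics


def fit_gaussian(xs):
    """Fit a 1-D Gaussian (a simple stand-in for the density model)."""
    mu = statistics.mean(xs)
    sigma = statistics.pstdev(xs)
    return mu, sigma


def log_density(x, mu, sigma):
    """Log of the Gaussian probability density at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def flag_outliers(xs, threshold):
    """Return the points whose log-density under the fitted model is below threshold."""
    mu, sigma = fit_gaussian(xs)
    return [x for x in xs if log_density(x, mu, sigma) < threshold]


# Hypothetical readings with one gross outlier.
readings = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 25.0, 10.0]
outliers = flag_outliers(readings, threshold=-4.0)
print(outliers)
```

Because the density here is fitted to data that still contain the outlier, the threshold has to be chosen with that contamination in mind; a richer density model would change the numbers but not the thresholding logic.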
Abstract:
The main research theme of this project is the study of neural networks for controlling uncertain and non-linear control systems. This involves the control of continuous-time, discrete-time, hybrid and stochastic systems with input, state or output constraints while ensuring good performance. A large part of this project is devoted to opening the frontiers between several mathematical and engineering approaches in order to tackle complex but very common non-linear control problems. The objectives are: 1. To design and develop procedures for neural network enhanced self-tuning adaptive non-linear control systems; 2. To design, as a general procedure, a neural network generalised minimum variance self-tuning controller for non-linear dynamic plants (integration of neural network mapping with generalised minimum variance self-tuning controller strategies); 3. To develop a software package to evaluate control system performance using Matlab, Simulink and the Neural Network toolbox. An adaptive control algorithm is proposed that uses a recurrent network as a model of a partially unknown non-linear plant with unmeasurable state. It appears that structured recurrent neural networks can provide conveniently parameterised dynamic models of many non-linear systems for use in adaptive control. Properties of static neural networks, which enabled the successful design of stable adaptive control in the state feedback case, are also identified. A survey of existing results is presented, placing them in a systematic framework that shows their relation to classical self-tuning adaptive control and to the application of neural control to SISO/MIMO control. Simulation results demonstrate that the self-tuning design methods may be practically applicable to a reasonably large class of unknown linear and non-linear dynamic control systems.
Abstract:
This thesis introduces and develops a novel real-time predictive maintenance system that estimates machine system parameters using the motion current signature. Recently, motion current signature analysis has been addressed as an alternative to the use of sensors for monitoring internal faults of a motor. A maintenance system based upon the analysis of the motion current signature avoids the need to implement and maintain expensive motion sensing technology. By developing nonlinear dynamical analysis for the motion current signature, the research described in this thesis implements a novel real-time predictive maintenance system for current and future manufacturing machine systems. A crucial concept underpinning this project is that the motion current signature contains information relating to the machine system parameters and that this information can be extracted using nonlinear mapping techniques, such as neural networks. To this end, a proof-of-concept procedure is performed, which substantiates this concept. A simulation model, TuneLearn, is developed to simulate the large amount of training data required by the neural network approach. Statistical validation and verification of the model is performed to ascertain confidence in the simulated motion current signature. The validation experiment concludes that, although the simulation model generates a good macro-dynamical mapping of the motion current signature, it fails to accurately map the micro-dynamical structure owing to the lack of knowledge regarding the performance of higher-order and nonlinear factors, such as backlash and compliance. The failure of the simulation model to determine the micro-dynamical structure suggests the presence of nonlinearity in the motion current signature. This motivated us to perform surrogate data testing for nonlinearity in the motion current signature. Results confirm the presence of nonlinearity in the motion current signature, thereby motivating the use of nonlinear techniques for further analysis. Outcomes of the experiment show that nonlinear noise reduction combined with the linear reverse algorithm offers precise machine system parameter estimation using the motion current signature for the implementation of the real-time predictive maintenance system. Finally, a linear reverse algorithm, BJEST, is developed and applied to the motion current signature to estimate the machine system parameters.
Abstract:
This paper introduces a mechanism for generating a series of rules that characterize the money-price relationship for the USA, defined as the relationship between the rate of growth of the money supply and inflation. Monetary component data are used to train a selection of candidate feedforward neural networks. The selected network is mined for rules, expressed in human-readable and machine-executable form. The accuracy of the rules and of the network is compared, and expert commentary is made on the readability and reliability of the extracted rule set. The ultimate goal of this research is to produce rules that meaningfully and accurately describe inflation in terms of the monetary component dataset.
Abstract:
In this paper, we discuss some practical implications for implementing adaptable network algorithms applied to non-stationary time series problems. Two real world data sets, containing electricity load demands and foreign exchange market prices, are used to test several different methods, ranging from linear models with fixed parameters, to non-linear models which adapt both parameters and model order on-line. Training with the extended Kalman filter, we demonstrate that the dynamic model-order increment procedure of the resource allocating RBF network (RAN) is highly sensitive to the parameters of the novelty criterion. We investigate the use of system noise for increasing the plasticity of the Kalman filter training algorithm, and discuss the consequences for on-line model order selection. The results of our experiments show that there are advantages to be gained in tracking real world non-stationary data through the use of more complex adaptive models.
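The sensitivity to the novelty criterion can be illustrated with a stripped-down growth rule. The sketch below loosely follows the usual resource allocating network convention: a new unit is added only when the input is farther than `eps` from every existing centre and the prediction error exceeds `e_min`. It is not the paper's procedure; the extended Kalman filter update of existing parameters is omitted, the distance threshold is held fixed, and the sine-wave stream, `kappa`, and both threshold settings are assumptions.

```python
import math


def rbf_predict(x, units):
    """Output of a 1-D RBF network; each unit is (centre, height, width)."""
    return sum(h * math.exp(-((x - c) ** 2) / (w ** 2)) for c, h, w in units)


def is_novel(x, error, units, eps, e_min):
    """Novelty criterion: far from all centres AND a large prediction error."""
    if not units:
        return abs(error) > e_min
    nearest = min(abs(x - c) for c, _, _ in units)
    return nearest > eps and abs(error) > e_min


def grow(stream, eps, e_min, kappa=0.9):
    """Process the stream once, adding a unit whenever the input is novel."""
    units = []
    for x, y in stream:
        error = y - rbf_predict(x, units)
        if is_novel(x, error, units, eps, e_min):
            nearest = min((abs(x - c) for c, _, _ in units), default=1.0)
            # New unit: centred on the input, height equal to the current error.
            units.append((x, error, kappa * nearest))
        # Otherwise a full implementation would adapt existing parameters here.
    return units


# Hypothetical data stream: one period of a sine wave.
stream = [(i / 10, math.sin(i / 10)) for i in range(63)]

loose = grow(stream, eps=0.05, e_min=0.01)
strict = grow(stream, eps=0.5, e_min=0.2)
print(len(loose), len(strict))
```

Running the same stream with two threshold settings gives very different model orders, which is the kind of sensitivity the abstract refers to.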
Abstract:
Background - The Met allele of the catechol-O-methyltransferase (COMT) valine-to-methionine (Val158Met) polymorphism is known to affect dopamine-dependent affective regulation within amygdala-prefrontal cortical (PFC) networks. It is also thought to increase the risk of a number of disorders characterized by affective morbidity including bipolar disorder (BD), major depressive disorder (MDD) and anxiety disorders. The disease risk conferred is small, suggesting that this polymorphism represents a modifier locus. Therefore our aim was to investigate how the COMT Val158Met may contribute to phenotypic variation in clinical diagnosis using sad facial affect processing as a probe for its neural action. Method - We employed functional magnetic resonance imaging to measure activation in the amygdala, ventromedial PFC (vmPFC) and ventrolateral PFC (vlPFC) during sad facial affect processing in family members with BD (n=40), MDD and anxiety disorders (n=22) or no psychiatric diagnosis (n=25) and 50 healthy controls. Results - Irrespective of clinical phenotype, the Val158 allele was associated with greater amygdala activation and the Met allele with greater signal change in the vmPFC and vlPFC. Signal changes in the amygdala and vmPFC were not associated with disease expression. However, in the right vlPFC the Met158 allele was associated with greater activation in all family members with affective morbidity compared with relatives without a psychiatric diagnosis and healthy controls. Conclusions - Our results suggest that the COMT Val158Met polymorphism has a pleiotropic effect within the neural networks subserving emotional processing. Furthermore the Met158 allele further reduces cortical efficiency in the vlPFC in individuals with affective morbidity. © 2010 Cambridge University Press.
Abstract:
The authors earlier proposed a hierarchical method for defining class descriptions when solving pattern recognition problems. This paper proposes the development and use of such hierarchical descriptions for the parallel representation of complex patterns on multi-core computers or neural networks.
Abstract:
When Recurrent Neural Networks (RNNs) are used as pattern recognition systems, the problem to be considered is how to impose prescribed prototype vectors ξ^1, ξ^2, ..., ξ^p as fixed points. In the classical approach, the synaptic matrix W is interpreted as a sort of sign correlation matrix of the prototypes. The weak point of this approach is that it lacks the appropriate tools to deal efficiently with the correlation between the state vectors and the prototype vectors. The capacity of the net is very poor because one can only determine whether a given vector is adequately correlated with the prototypes, not its exact degree of correlation. The interest of our approach lies precisely in the fact that it provides these tools. In this paper, a geometrical view of the dynamics of states is explained. A fixed point is viewed as a point in the Euclidean plane R^2. The retrieval procedure is analyzed through the statistical frequency distribution of the prototypes. The capacity of the net is improved and the spurious states are reduced. In order to clarify and corroborate the theoretical results, an application is presented together with the formal theory.
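The classical approach mentioned in the abstract can be sketched as a Hopfield-style network whose weight matrix is built from correlations between prototype components. This is only an illustration of that baseline, not of the geometrical method the paper proposes; the two eight-component prototypes and all function names are assumptions.

```python
def hebbian_weights(prototypes):
    """Correlation (Hebbian) matrix of +/-1 prototypes, with a zero diagonal."""
    n = len(prototypes[0])
    w = [[0] * n for _ in range(n)]
    for xi in prototypes:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += xi[i] * xi[j]
    return w


def sign(v):
    return 1 if v >= 0 else -1


def update(w, state):
    """One synchronous update: each unit takes the sign of its weighted input."""
    return [sign(sum(w_ij * s_j for w_ij, s_j in zip(row, state))) for row in w]


def is_fixed_point(w, state):
    return update(w, state) == state


def overlap(state, xi):
    """Normalised correlation between a state and a prototype, in [-1, 1]."""
    return sum(s * x for s, x in zip(state, xi)) / len(state)


# Two hypothetical, mutually orthogonal prototypes.
prototypes = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
w = hebbian_weights(prototypes)

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # first prototype with one component flipped
recalled = update(w, noisy)

print([is_fixed_point(w, xi) for xi in prototypes])
print(overlap(noisy, prototypes[0]), recalled == prototypes[0])
```

The `overlap` function makes explicit the correlation between a state and each prototype, the quantity the abstract says the classical approach handles poorly.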
Abstract:
The paper describes a hybrid pattern recognition method developed by research groups from Russia, Armenia and Spain. The method is based upon logical correction over a set of conventional neural networks. The output matrices of the neural networks are processed according to the potentiality principle, which increases recognition reliability.
Abstract:
In the present paper, the problem of the optimal control of systems with constraints imposed on the control is considered. The optimality conditions are given in the form of Pontryagin's maximum principle. The resulting piecewise linear function is approximated using a feedforward neural network. A numerical example is given.
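A piecewise linear control law is a natural target for a small feedforward network. The sketch below assumes, purely for illustration, that the constrained control is a saturated linear function of a scalar variable `p`; the abstract does not give the paper's actual function or network. Under that assumption, a network with two hidden ReLU units and hand-set weights reproduces the saturation exactly. The `gain` and `u_max` values are hypothetical.

```python
def relu(z):
    return max(0.0, z)


def saturated_control(p, gain=2.0, u_max=1.0):
    """Assumed piecewise linear law: linear in p, clipped to [-u_max, u_max]."""
    return max(-u_max, min(u_max, gain * p))


def relu_network(p, gain=2.0, u_max=1.0):
    """Two hidden ReLU units with hand-set weights reproducing the clipped law."""
    z = gain * p
    hidden_1 = relu(z + u_max)  # switches on at the lower saturation point
    hidden_2 = relu(z - u_max)  # switches on at the upper saturation point
    return hidden_1 - hidden_2 - u_max


# Compare the two functions on a grid of hypothetical inputs.
grid = [i / 20 - 2.0 for i in range(81)]  # p from -2.0 to 2.0
max_gap = max(abs(saturated_control(p) - relu_network(p)) for p in grid)
print(max_gap)
```

In practice the weights would be learned from samples of the control law rather than set by hand; the point here is only that a small network can represent such a function.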