795 results for Multilayer perceptron neural networks
Abstract:
This paper presents models that can be used in the design of microstrip antennas for mobile communications. The antennas can be triangular or rectangular. The presented models are compared with deterministic and empirical models based on artificial neural networks (ANN) presented in the literature. The models are based on Multilayer Perceptron (MLP) and Radial Basis Function (RBF) ANNs. The RBF-based models gave the best results. The models can also be embedded in CAD systems to design microstrip antennas for mobile communications.
Abstract:
This paper presents a new method to extract knowledge from existing data sets, that is, to extract symbolic rules using the weights of an Artificial Neural Network. The method has been applied to a neural network with a special architecture named Enhanced Neural Network (ENN). This architecture improves on the results obtained with the multilayer perceptron (MLP). The relationship among the knowledge stored in the weights, the performance of the network, and the newly implemented algorithm to acquire rules from the weights is explained. The method itself gives a model to follow in knowledge acquisition with ENN.
Abstract:
The expectation-maximization (EM) algorithm has been of considerable interest in recent years as the basis for various algorithms in application areas of neural networks such as pattern recognition. However, there exist some misconceptions concerning its application to neural networks. In this paper, we clarify these misconceptions and consider how the EM algorithm can be adopted to train multilayer perceptron (MLP) and mixture of experts (ME) networks in applications to multiclass classification. We identify some situations where the application of the EM algorithm to train MLP networks may be of limited value and discuss some ways of handling the difficulties. For ME networks, it is reported in the literature that networks trained by the EM algorithm using the iteratively reweighted least squares (IRLS) algorithm in the inner loop of the M-step often performed poorly in multiclass classification. However, we found that the convergence of the IRLS algorithm is stable and that the log likelihood is monotonically increasing when a learning rate smaller than one is adopted. Also, we propose the use of an expectation-conditional maximization (ECM) algorithm to train ME networks. Its performance is demonstrated to be superior to the IRLS algorithm on some simulated and real data sets.
Abstract:
We consider the problem of on-line gradient descent learning for general two-layer neural networks. An analytic solution is presented and used to investigate the role of the learning rate in controlling the evolution and convergence of the learning process.
Abstract:
We present an analytic solution to the problem of on-line gradient-descent learning for two-layer neural networks with an arbitrary number of hidden units in both teacher and student networks. The technique, demonstrated here for the case of adaptive input-to-hidden weights, becomes exact as the dimensionality of the input space increases.
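As a rough illustration of the on-line teacher-student setting this abstract refers to, the sketch below trains a two-layer student on examples labelled by a fixed two-layer teacher, presenting one fresh random input per step. It is a numerical simulation under stated assumptions, not the analytic solution from the paper: the tanh activation, the fixed unit hidden-to-output weights, the 1/sqrt(N) input scaling, and all names and default values (`committee_output`, `train_online`, `lr`, `n_steps`) are illustrative choices.

```python
import numpy as np

def committee_output(weights, inputs):
    """Sum of tanh hidden units; hidden-to-output weights are fixed to 1 (assumption)."""
    n_inputs = weights.shape[1]
    return np.tanh(inputs @ weights.T / np.sqrt(n_inputs)).sum(axis=-1)

def train_online(n_inputs=50, n_student=2, n_teacher=2, lr=0.3,
                 n_steps=20000, seed=0):
    """Run on-line gradient descent and return (initial, final) generalization error estimates."""
    rng = np.random.default_rng(seed)
    teacher = rng.standard_normal((n_teacher, n_inputs))         # fixed rule to be learned
    student = 0.01 * rng.standard_normal((n_student, n_inputs))  # small random start
    test_x = rng.standard_normal((2000, n_inputs))               # held-out inputs

    def gen_error():
        diff = committee_output(student, test_x) - committee_output(teacher, test_x)
        return 0.5 * np.mean(diff ** 2)

    err_start = gen_error()
    for _ in range(n_steps):
        x = rng.standard_normal(n_inputs)  # each example is used once, then discarded
        hidden = np.tanh(student @ x / np.sqrt(n_inputs))
        delta = hidden.sum() - committee_output(teacher, x)
        # Gradient of 0.5 * delta**2 with respect to each student weight vector.
        grad = delta * (1.0 - hidden ** 2)[:, None] * x[None, :] / np.sqrt(n_inputs)
        student -= lr * grad
    return err_start, gen_error()
```

Calling `train_online()` returns Monte Carlo estimates of the generalization error before and after training; only the input-to-hidden weights are adapted, matching the case the abstract describes.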
Abstract:
We study the effect of two types of noise, data noise and model noise, in an on-line gradient-descent learning scenario for a general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labeled by a two-layer teacher network with an arbitrary number of hidden units. Data is then corrupted by Gaussian noise affecting either the output or the model itself. We examine the effect of both types of noise on the evolution of order parameters and the generalization error in various phases of the learning process.
Abstract:
We complement recent advances in thermodynamic limit analyses of mean on-line gradient descent learning dynamics in multi-layer networks by calculating fluctuations possessed by finite dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation as student hidden unit weight vectors begin to imitate specific teacher vectors, increasing with the degree of symmetry of the initial conditions. In light of this, we include a term to stimulate asymmetry in the learning process, which typically also leads to a significant decrease in training time.
Abstract:
In the present study, multilayer perceptron (MLP) neural networks were applied to help in the diagnosis of obstructive sleep apnoea syndrome (OSAS). Oxygen saturation (SaO2) recordings from nocturnal pulse oximetry were used for this purpose. We performed time and spectral analysis of these signals to extract 14 features related to OSAS. The performance of two different MLP classifiers was compared: maximum likelihood (ML) and Bayesian (BY) MLP networks. A total of 187 subjects suspected of suffering from OSAS took part in the study. Their SaO2 signals were divided into a training set with 74 recordings and a test set with 113 recordings. BY-MLP networks achieved the best performance on the test set with 85.58% accuracy (87.76% sensitivity and 82.39% specificity). These results were substantially better than those provided by ML-MLP networks, which were affected by overfitting and achieved an accuracy of 76.81% (86.42% sensitivity and 62.83% specificity). Our results suggest that the Bayesian framework is preferable for implementing our MLP classifiers. The proposed BY-MLP networks could be used for early OSAS detection. They could help overcome the difficulties of nocturnal polysomnography (PSG) and thus reduce the demand for these studies.
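For readers unfamiliar with the three reported figures, the following minimal sketch shows how accuracy, sensitivity, and specificity are conventionally computed from confusion-matrix counts. The function name `classification_rates` and the counts in the usage note are hypothetical and are not taken from the study.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)                 # fraction of positives detected
    specificity = tn / (tn + fp)                 # fraction of negatives correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction of all cases classified correctly
    return accuracy, sensitivity, specificity
```

With made-up counts, `classification_rates(tp=8, fn=2, tn=9, fp=1)` gives an accuracy of 0.85, a sensitivity of 0.8, and a specificity of 0.9.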
Abstract:
We study the effect of regularization in an on-line gradient-descent learning scenario for a general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labelled by a two-layer teacher network with an arbitrary number of hidden units which may be corrupted by Gaussian output noise. We examine the effect of weight decay regularization on the dynamical evolution of the order parameters and generalization error in various phases of the learning process, in both noiseless and noisy scenarios.
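One common way to write a weight-decay update for a single on-line example is sketched below. It assumes an L2 penalty of the form 0.5 * decay * ||J||^2 added to the per-example squared error, a tanh activation, and unit hidden-to-output weights; the exact form and scaling used in the paper may differ, and the name `weight_decay_step` and its defaults are hypothetical.

```python
import numpy as np

def weight_decay_step(student, x, target, lr=0.1, decay=0.01):
    """One on-line gradient step on 0.5*(output - target)**2 + 0.5*decay*||student||**2."""
    n_inputs = student.shape[1]
    hidden = np.tanh(student @ x / np.sqrt(n_inputs))
    delta = hidden.sum() - target  # output error on this single example
    grad = delta * (1.0 - hidden ** 2)[:, None] * x[None, :] / np.sqrt(n_inputs)
    # The L2 penalty contributes decay * student to the gradient,
    # shrinking every weight toward zero at each step.
    return student - lr * (grad + decay * student)
```

In the noisy scenario the abstract mentions, `target` would be the teacher output plus a Gaussian perturbation. When the error term vanishes, the step reduces to multiplying the weights by `1 - lr * decay`.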
Abstract:
We present a framework for calculating globally optimal parameters, within a given time frame, for on-line learning in multilayer neural networks. We demonstrate the capability of this method by computing optimal learning rates in typical learning scenarios. A similar treatment allows one to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule as well as to compare different training methods.
Abstract:
A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors as well as to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule.
Abstract:
Providing transportation system operators and travelers with accurate travel time information allows them to make more informed decisions, yielding benefits for individual travelers and for the entire transportation system. Most existing advanced traveler information systems (ATIS) and advanced traffic management systems (ATMS) use instantaneous travel time values estimated from current measurements, assuming that traffic conditions remain constant in the near future. For more effective applications, it has been proposed that ATIS and ATMS should use travel times predicted for short-term future conditions rather than instantaneous travel times measured or estimated for current conditions. This dissertation research investigates short-term freeway travel time prediction using Dynamic Neural Networks (DNN) based on traffic detector data collected by radar traffic detectors installed along a freeway corridor. DNN comprises a class of neural networks that are particularly suitable for predicting variables like travel time, but has not been adequately investigated for this purpose. Before this investigation, it was necessary to identify methods for data imputation to account for the missing data usually encountered when collecting data using traffic detectors. It was also necessary to identify a method to estimate the travel time on the freeway corridor based on data collected using point traffic detectors. A new travel time estimation method referred to as the Piecewise Constant Acceleration Based (PCAB) method was developed and compared with other methods reported in the literature. The results show that one of the simple travel time estimation methods (the average speed method) can work as well as the PCAB method, and both of them outperform other methods. This study also compared the travel time prediction performance of three different DNN topologies with different memory setups. The results show that one DNN topology (the time-delay neural network) outperforms the other two DNN topologies for the investigated prediction problem. This topology also performs slightly better than the simple multilayer perceptron (MLP) neural network topology that has been used in a number of previous studies for travel time prediction.
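A time-delay network sees a short window of past values at each step. The sketch below shows one generic way to build such delayed input windows and their prediction targets from a univariate series. It is not the dissertation's actual preprocessing; the function name `make_delay_windows` and its parameters `n_delays` and `horizon` are hypothetical.

```python
import numpy as np

def make_delay_windows(series, n_delays, horizon=1):
    """Build (inputs, targets) pairs for short-term prediction.

    Each input row holds n_delays consecutive past values; the matching
    target is the value `horizon` steps after the last value in the row.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_delays - horizon + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested delays and horizon")
    inputs = np.stack([series[i:i + n_delays] for i in range(n_samples)])
    targets = series[n_delays + horizon - 1:]
    return inputs, targets
```

For example, `make_delay_windows(range(6), n_delays=3)` yields three windows, `[0, 1, 2]`, `[1, 2, 3]`, and `[2, 3, 4]`, with targets `3`, `4`, and `5`. The same arrays could also feed a plain MLP, which is how windowed inputs are often compared across topologies.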
Abstract:
A dissertation submitted in fulfillment of the requirements for the degree of Master in Computer Science and Computer Engineering
Abstract:
Damage detection by measuring and analyzing vibration signals in a machine component is an established procedure in mechanical and aerospace engineering. This paper presents vibration signature analysis of steel bridge structures in a nonconventional way using artificial neural networks (ANN). Multilayer perceptrons have been adopted, using the back-propagation algorithm for network training. The training patterns, in terms of vibration signatures, are generated analytically for a moving load traveling on a trussed bridge structure at a constant speed to simulate the inspection vehicle. Using the finite-element technique, the moving forces are converted into stationary time-dependent force functions in order to generate vibration signals in the structure, which are then used to train the network. The performance of the trained networks is examined for their capability to detect damage from unknown signatures taken independently at one, three, and five nodes. It has been observed that the prediction using the trained network with single-node signature measurement at a suitably chosen location is even better than that obtained with three-node and five-node measurement data.