28 resultados para Multi layer perceptron
em Aston University Research Archive
Resumo:
We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This rule maximizes the total reduction in generalization error over the whole learning process. A simple example demonstrates that the locally optimal rule, which maximizes the rate of decrease in generalization error, may perform poorly in comparison.
Resumo:
In this paper we propose a novel type of multiple-layer photomixer based on amorphous/nano-crystalline-Si. Such a device implies that it could be possible to enhance the conversion efficiency from optical power to THz emission by increasing the absorption length and by reducing the device overheating through the use of substrates with higher thermal conductivity compared to GaAs. Our calculations show that the output power from a two-layer Si-based photomixer is at least ten times higher than that from conventional LT-GaAs photomixers at 1 THz.
Resumo:
Novel surface plasmonic optical fiber sensors have been fabricated using multiple coatings deposited on a lapped section of a single mode fiber. UV laser irradiation processing with a phase mask produces a nano-scaled surface relief grating structure resembling nano-wires. The resulting individual corrugations produced by material compaction are approximately 20 μm long with an average width at half maximum of 100 nm and generate localized surface plasmons. Experimental data are presented that show changes in the spectral characteristics after UV processing, coupled with an overall increase in the sensitivity of the devices to surrounding refractive index. Evidence is presented that there is an optimum UV dosage (48 joules) over which no significant additional optical change is observed. The devices are characterized with regards to change in refractive index, where significantly high spectral sensitivities in the aqueous index regime are found, ranging up to 4000 nm/RIU for wavelength and 800 dB/RIU for intensity. © 2013 Optical Society of America.
Resumo:
The popularity of online social media platforms provides an unprecedented opportunity to study real-world complex networks of interactions. However, releasing this data to researchers and the public comes at the cost of potentially exposing private and sensitive user information. It has been shown that a naive anonymization of a network by removing the identity of the nodes is not sufficient to preserve users’ privacy. In order to deal with malicious attacks, k -anonymity solutions have been proposed to partially obfuscate topological information that can be used to infer nodes’ identity. In this paper, we study the problem of ensuring k anonymity in time-varying graphs, i.e., graphs with a structure that changes over time, and multi-layer graphs, i.e., graphs with multiple types of links. More specifically, we examine the case in which the attacker has access to the degree of the nodes. The goal is to generate a new graph where, given the degree of a node in each (temporal) layer of the graph, such a node remains indistinguishable from other k-1 nodes in the graph. In order to achieve this, we find the optimal partitioning of the graph nodes such that the cost of anonymizing the degree information within each group is minimum. We show that this reduces to a special case of a Generalized Assignment Problem, and we propose a simple yet effective algorithm to solve it. Finally, we introduce an iterated linear programming approach to enforce the realizability of the anonymized degree sequences. The efficacy of the method is assessed through an extensive set of experiments on synthetic and real-world graphs.
Resumo:
This paper presents results from the first use of neural networks for the real-time feedback control of high temperature plasmas in a Tokamak fusion experiment. The Tokamak is currently the principal experimental device for research into the magnetic confinement approach to controlled fusion. In the Tokamak, hydrogen plasmas, at temperatures of up to 100 Million K, are confined by strong magnetic fields. Accurate control of the position and shape of the plasma boundary requires real-time feedback control of the magnetic field structure on a time-scale of a few tens of microseconds. Software simulations have demonstrated that a neural network approach can give significantly better performance than the linear technique currently used on most Tokamak experiments. The practical application of the neural network approach requires high-speed hardware, for which a fully parallel implementation of the multi-layer perceptron, using a hybrid of digital and analogue technology, has been developed.
Resumo:
This paper presents a forecasting technique for forward energy prices, one day ahead. This technique combines a wavelet transform and forecasting models such as multi- layer perceptron, linear regression or GARCH. These techniques are applied to real data from the UK gas markets to evaluate their performance. The results show that the forecasting accuracy is improved significantly by using the wavelet transform. The methodology can be also applied to forecasting market clearing prices and electricity/gas loads.
Resumo:
It is well known that one of the obstacles to effective forecasting of exchange rates is heteroscedasticity (non-stationary conditional variance). The autoregressive conditional heteroscedastic (ARCH) model and its variants have been used to estimate a time dependent variance for many financial time series. However, such models are essentially linear in form and we can ask whether a non-linear model for variance can improve results just as non-linear models (such as neural networks) for the mean have done. In this paper we consider two neural network models for variance estimation. Mixture Density Networks (Bishop 1994, Nix and Weigend 1994) combine a Multi-Layer Perceptron (MLP) and a mixture model to estimate the conditional data density. They are trained using a maximum likelihood approach. However, it is known that maximum likelihood estimates are biased and lead to a systematic under-estimate of variance. More recently, a Bayesian approach to parameter estimation has been developed (Bishop and Qazaz 1996) that shows promise in removing the maximum likelihood bias. However, up to now, this model has not been used for time series prediction. Here we compare these algorithms with two other models to provide benchmark results: a linear model (from the ARIMA family), and a conventional neural network trained with a sum-of-squares error function (which estimates the conditional mean of the time series with a constant variance noise model). This comparison is carried out on daily exchange rate data for five currencies.
Resumo:
Obtaining wind vectors over the ocean is important for weather forecasting and ocean modelling. Several satellite systems used operationally by meteorological agencies utilise scatterometers to infer wind vectors over the oceans. In this paper we present the results of using novel neural network based techniques to estimate wind vectors from such data. The problem is partitioned into estimating wind speed and wind direction. Wind speed is modelled using a multi-layer perceptron (MLP) and a sum of squares error function. Wind direction is a periodic variable and a multi-valued function for a given set of inputs; a conventional MLP fails at this task, and so we model the full periodic probability density of direction conditioned on the satellite derived inputs using a Mixture Density Network (MDN) with periodic kernel functions. A committee of the resulting MDNs is shown to improve the results.
Resumo:
The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about km800, carrying a C-band scatterometer. A scatterometer measures the amount of radar back scatter generated by small ripples on the ocean surface induced by instantaneous local winds. Operational methods that extract wind vectors from satellite scatterometer data are based on the local inversion of a forward model, mapping scatterometer observations to wind vectors, by the minimisation of a cost function in the scatterometer measurement space.par This report uses mixture density networks, a principled method for modelling conditional probability density functions, to model the joint probability distribution of the wind vectors given the satellite scatterometer measurements in a single cell (the `inverse' problem). The complexity of the mapping and the structure of the conditional probability density function are investigated by varying the number of units in the hidden layer of the multi-layer perceptron and the number of kernels in the Gaussian mixture model of the mixture density network respectively. The optimal model for networks trained per trace has twenty hidden units and four kernels. Further investigation shows that models trained with incidence angle as an input have results comparable to those models trained by trace. A hybrid mixture density network that incorporates geophysical knowledge of the problem confirms other results that the conditional probability distribution is dominantly bimodal.par The wind retrieval results improve on previous work at Aston, but do not match other neural network techniques that use spatial information in the inputs, which is to be expected given the ambiguity of the inverse problem. Current work uses the local inverse model for autonomous ambiguity removal in a principled Bayesian framework. Future directions in which these models may be improved are given.
Resumo:
Obtaining wind vectors over the ocean is important for weather forecasting and ocean modelling. Several satellite systems used operationally by meteorological agencies utilise scatterometers to infer wind vectors over the oceans. In this paper we present the results of using novel neural network based techniques to estimate wind vectors from such data. The problem is partitioned into estimating wind speed and wind direction. Wind speed is modelled using a multi-layer perceptron (MLP) and a sum of squares error function. Wind direction is a periodic variable and a multi-valued function for a given set of inputs; a conventional MLP fails at this task, and so we model the full periodic probability density of direction conditioned on the satellite derived inputs using a Mixture Density Network (MDN) with periodic kernel functions. A committee of the resulting MDNs is shown to improve the results.
Resumo:
This paper presents some forecasting techniques for energy demand and price prediction, one day ahead. These techniques combine wavelet transform (WT) with fixed and adaptive machine learning/time series models (multi-layer perceptron (MLP), radial basis functions, linear regression, or GARCH). To create an adaptive model, we use an extended Kalman filter or particle filter to update the parameters continuously on the test set. The adaptive GARCH model is a new contribution, broadening the applicability of GARCH methods. We empirically compared two approaches of combining the WT with prediction models: multicomponent forecasts and direct forecasts. These techniques are applied to large sets of real data (both stationary and non-stationary) from the UK energy markets, so as to provide comparative results that are statistically stronger than those previously reported. The results showed that the forecasting accuracy is significantly improved by using the WT and adaptive models. The best models on the electricity demand/gas price forecast are the adaptive MLP/GARCH with the multicomponent forecast; their MSEs are 0.02314 and 0.15384 respectively.
Resumo:
A number of researchers have investigated the impact of network architecture on the performance of artificial neural networks. Particular attention has been paid to the impact on the performance of the multi-layer perceptron of architectural issues, and the use of various strategies to attain an optimal network structure. However, there are still perceived limitations with the multi-layer perceptron and networks that employ a different architecture to the multi-layer perceptron have gained in popularity in recent years, particularly, networks that implement a more localised solution, where the solution in one area of the problem space does not impact, or has a minimal impact, on other areas of the space. In this study, we discuss the major architectural issues affecting the performance of a multi-layer perceptron, before moving on to examine in detail the performance of a new localised network, namely the bumptree. The work presented here examines the impact on the performance of artificial neural networks of employing alternative networks to the long established multi-layer perceptron. In particular, networks that impose a solution where the impact of each parameter in the final network architecture has a localised impact on the problem space being modelled are examined. The alternatives examined are the radial basis function and bumptree neural networks, and the impact of architectural issues on the performance of these networks is examined. Particular attention is paid to the bumptree, with new techniques for both developing the bumptree structure and employing this structure to classify patterns being examined.
Resumo:
In this letter, we derive continuum equations for the generalization error of the Bayesian online algorithm (BOnA) for the one-layer perceptron with a spherical covariance matrix using the Rosenblatt potential and show, by numerical calculations, that the asymptotic performance of the algorithm is the same as the one for the optimal algorithm found by means of variational methods with the added advantage that the BOnA does not use any inaccessible information during learning. © 2007 IEEE.
Resumo:
We complement recent advances in thermodynamic limit analyses of mean on-line gradient descent learning dynamics in multi-layer networks by calculating fluctuations possessed by finite dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation as student hidden unit weight vectors begin to imitate specific teacher vectors, increasing with the degree of symmetry of the initial conditions. In light of this, we include a term to stimulate asymmetry in the learning process, which typically also leads to a significant decrease in training time.
Resumo:
Radial Basis Function networks with linear outputs are often used in regression problems because they can be substantially faster to train than Multi-layer Perceptrons. For classification problems, the use of linear outputs is less appropriate as the outputs are not guaranteed to represent probabilities. We show how RBFs with logistic and softmax outputs can be trained efficiently using the Fisher scoring algorithm. This approach can be used with any model which consists of a generalised linear output function applied to a model which is linear in its parameters. We compare this approach with standard non-linear optimisation algorithms on a number of datasets.