Digital Library

76 results for statistical mechanics many-body inverse problem graph-theory

Transients and asymptotics of natural gradient learning

Relevance:

100.00%

Abstract:

We analyse natural gradient learning in a two-layer feed-forward neural network using a statistical mechanics framework which is appropriate for large input dimension. We find significant improvement over standard gradient descent in both the transient and asymptotic phases of learning.
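As a minimal illustration of the update this abstract refers to (not the authors' statistical mechanics analysis), a natural gradient step preconditions the ordinary gradient with the inverse Fisher information matrix. The function name `natural_gradient_step`, the `damping` term, and the toy numbers below are assumptions made for this sketch.

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr=0.1, damping=1e-8):
    """Return theta - lr * F^{-1} grad (hypothetical helper for illustration)."""
    # A small damping term keeps the linear solve stable if F is near-singular.
    f_damped = fisher + damping * np.eye(fisher.shape[0])
    # Solve F x = grad instead of forming the inverse explicitly.
    direction = np.linalg.solve(f_damped, grad)
    return theta - lr * direction

# Toy example: a diagonal Fisher matrix rescales each gradient component.
theta = np.array([1.0, -2.0])
grad = np.array([0.5, 0.5])
fisher = np.array([[2.0, 0.0], [0.0, 0.5]])
new_theta = natural_gradient_step(theta, grad, fisher, lr=0.1)
```

With an identity Fisher matrix the step reduces to standard gradient descent, which is the baseline the abstract compares against.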

The dynamics of matrix momentum

Relevance:

100.00%

Abstract:

We analyse the matrix momentum algorithm, which provides an efficient approximation to on-line Newton's method, by extending a recent statistical mechanics framework to include second order algorithms. We study the efficacy of this method when the Hessian is available and also consider a practical implementation which uses a single example estimate of the Hessian. The method is shown to provide excellent asymptotic performance, although the single example implementation is sensitive to the choice of training parameters. We conjecture that matrix momentum could provide efficient matrix inversion for other second order algorithms.

Globally optimal on-line learning rules

Relevance:

100.00%

Abstract:

We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule.

First year qualifying report: neural networks for extracting wind vectors from satellite scatterometer data

Relevance:

100.00%

Abstract:

The ERS-1 satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of radar backscatter generated by small ripples on the ocean surface induced by instantaneous local winds. Operational methods that extract wind vectors from satellite scatterometer data are based on the local inversion of a forward model, mapping scatterometer observations to wind vectors, by the minimisation of a cost function in the scatterometer measurement space.

This report uses mixture density networks, a principled method for modelling conditional probability density functions, to model the joint probability distribution of the wind vectors given the satellite scatterometer measurements in a single cell (the 'inverse' problem). The complexity of the mapping and the structure of the conditional probability density function are investigated by varying the number of units in the hidden layer of the multi-layer perceptron and the number of kernels in the Gaussian mixture model of the mixture density network respectively. The optimal model for networks trained per trace has twenty hidden units and four kernels. Further investigation shows that models trained with incidence angle as an input have results comparable to those models trained by trace. A hybrid mixture density network that incorporates geophysical knowledge of the problem confirms other results that the conditional probability distribution is dominantly bimodal.

The wind retrieval results improve on previous work at Aston, but do not match other neural network techniques that use spatial information in the inputs, which is to be expected given the ambiguity of the inverse problem. Current work uses the local inverse model for autonomous ambiguity removal in a principled Bayesian framework. Future directions in which these models may be improved are given.
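To make the mixture density idea concrete, the sketch below evaluates a conditional density expressed as a mixture of isotropic Gaussian kernels over a two-dimensional wind vector. It is an assumption-laden illustration, not the report's model: in a mixture density network the weights, centres, and widths would be outputs of a trained network given the scatterometer inputs, whereas here they are passed in directly, and the name `mixture_density` and all numbers are hypothetical.

```python
import numpy as np

def mixture_density(t, weights, means, sigmas):
    """Density of target t under a mixture of isotropic Gaussian kernels."""
    t = np.asarray(t, dtype=float)
    weights = np.asarray(weights, dtype=float)   # mixing coefficients, sum to 1
    means = np.asarray(means, dtype=float)       # one kernel centre per row
    sigmas = np.asarray(sigmas, dtype=float)     # one width per kernel
    d = t.shape[0]
    # Squared distance from the target to each kernel centre.
    sq_dist = np.sum((means - t) ** 2, axis=1)
    # Normalising constant of an isotropic d-dimensional Gaussian.
    norm = (2.0 * np.pi * sigmas ** 2) ** (d / 2.0)
    kernels = np.exp(-sq_dist / (2.0 * sigmas ** 2)) / norm
    return float(np.dot(weights, kernels))

# Illustrative bimodal case: two equally weighted kernels at opposite
# wind directions, mimicking the directional ambiguity of the inverse problem.
weights = [0.5, 0.5]
means = [[5.0, 0.0], [-5.0, 0.0]]
sigmas = [1.0, 1.0]
p_forward = mixture_density([5.0, 0.0], weights, means, sigmas)
p_reverse = mixture_density([-5.0, 0.0], weights, means, sigmas)
```

In this toy setting both opposing wind vectors receive the same density, which is the kind of ambiguity a single-cell model cannot resolve on its own.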

On-line learning of unrealizable tasks

Relevance:

100.00%

Abstract:

The dynamics of on-line learning is investigated for structurally unrealizable tasks in the context of two-layer neural networks with an arbitrary number of hidden neurons. Within a statistical mechanics framework, a closed set of differential equations describing the learning dynamics can be derived, for the general case of unrealizable isotropic tasks. In the asymptotic regime one can solve the dynamics analytically in the limit of large number of hidden neurons, providing an analytical expression for the residual generalization error, the optimal and critical asymptotic training parameters, and the corresponding prefactor of the generalization error decay.

Natural gradient matrix momentum

Relevance:

100.00%

Abstract:

Natural gradient learning is an efficient and principled method for improving on-line learning. In practical applications there will be an increased cost required in estimating and inverting the Fisher information matrix. We propose to use the matrix momentum algorithm in order to carry out efficient inversion and study the efficacy of a single step estimation of the Fisher information matrix. We analyse the proposed algorithm in a two-layer network, using a statistical mechanics framework which allows us to describe analytically the learning dynamics, and compare performance with true natural gradient learning and standard gradient descent.

Mean field methods for classification with Gaussian processes

Relevance:

100.00%

Abstract:

We discuss the application of TAP mean field methods, known from the statistical mechanics of disordered systems, to Bayesian classification with Gaussian processes. In contrast to previous applications, no knowledge about the distribution of inputs is needed. Simulation results for the Sonar data set are given.

Advances in large margin classifiers

Relevance:

100.00%

Abstract:

We apply methods of statistical mechanics to study the generalization performance of support vector machines in large data spaces.

Learning curves for Gaussian processes models: fluctuations and universality

Relevance:

100.00%

Abstract:

Based on a statistical mechanics approach, we develop a method for approximately computing average case learning curves and their sample fluctuations for Gaussian process regression models. We give examples for the Wiener process and show that universal relations (that are independent of the input distribution) between error measures can be derived.

Typical performance of low-density parity-check codes over general symmetric channels

Relevance:

100.00%

Abstract:

Typical performance of low-density parity-check (LDPC) codes over a general binary-input output-symmetric memoryless channel is investigated using methods of statistical mechanics. The binary-input additive-white-Gaussian-noise channel and the binary-input Laplace channel are considered as specific channel noise models.

Probability distribution modelling to improve stability in nonlinear MIMO control

Relevance:

100.00%

Abstract:

We consider the direct adaptive inverse control of nonlinear multivariable systems with different delays between every input-output pair. In direct adaptive inverse control, the inverse mapping is learned from examples of input-output pairs. This makes the obtained controller suboptimal, since the network may have to learn the response of the plant over a larger operational range than necessary. Moreover, in certain applications, the control problem can be redundant, implying that the inverse problem is ill-posed. In this paper we propose a new algorithm which allows estimating and exploiting uncertainty in nonlinear multivariable control systems. This approach allows us to model strongly non-Gaussian distributions of control signals as well as processes with hysteresis. The proposed algorithm circumvents the dynamic programming problem by using the predicted neural network uncertainty to localise the possible control solutions to consider.

The relationship between self-awareness of attentional status, behavioral performance and oscillatory brain rhythms

Relevance:

100.00%

Abstract:

High-level cognitive factors, including self-awareness, are believed to play an important role in human visual perception. The principal aim of this study was to determine whether oscillatory brain rhythms play a role in the neural processes involved in self-monitoring attentional status. To do so we measured cortical activity using magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) while participants were asked to self-monitor their internal status, only initiating the presentation of a stimulus when they perceived their attentional focus to be maximal. We employed a hierarchical Bayesian method that uses fMRI results as soft-constrained spatial information to solve the MEG inverse problem, allowing us to estimate cortical currents in the order of millimeters and milliseconds. Our results show that, during self-monitoring of internal status, there was a sustained decrease in power within the 7-13 Hz (alpha) range in the rostral cingulate motor area (rCMA) on the human medial wall, beginning approximately 430 msec after the trial start (p < 0.05, FDR corrected). We also show that gamma-band power (41-47 Hz) within this area was positively correlated with task performance from 40-640 msec after the trial start (r = 0.71, p < 0.05). We conclude: (1) the rCMA is involved in processes governing self-monitoring of internal status; and (2) the qualitative differences between alpha and gamma activity are reflective of their different roles in self-monitoring internal states. We suggest that alpha suppression may reflect a strengthening of top-down interareal connections, while a positive correlation between gamma activity and task performance indicates that gamma may play an important role in guiding visuomotor behavior. © 2013 Yamagishi et al.

The topographic distribution of the magnetic P100M to full- and half-field stimulation

Relevance:

100.00%

Abstract:

Visual evoked magnetic responses were recorded to full-field and left and right half-field stimulation with three check sizes (70′, 34′ and 22′) in five normal subjects. Recordings were made sequentially on a 20-position grid (4 × 5) based on the inion, by means of a single-channel direct-current superconducting quantum interference device (DC SQUID) second-order gradiometer. The topographic maps were consistent on the same subjects recorded 2 months apart. The half-field responses produced the strongest signals in the contralateral hemisphere and were consistent with the cruciform model of the calcarine fissure. Right half fields produced upper-left-quadrant outgoing fields and lower-left-quadrant ingoing fields, while the left half field produced the opposite response. The topographic maps also varied with check size, with the larger checks producing positive or negative maximum position more anteriorly than small checks. In addition, with large checks the full-field responses could be explained as the summation of the two half fields, whereas full-field responses to smaller checks were more unpredictable and may be due to sources located at the occipital pole or lateral surface. In addition, dipole sources were located as appropriate with the use of inverse problem solutions. Topographic data will be vital to the clinical use of the visual evoked field and, in addition, provide complementary information to visual evoked potentials, allowing detailed studies of the visual cortex. © 1992 Kluwer Academic Publishers.

Typical behavior of relays in communication channels

Relevance:

100.00%

Abstract:

The typical behavior of the relay-without-delay channel under low-density parity-check coding and its multiple-unit generalization, termed the relay array, is studied using methods of statistical mechanics. A demodulate-and-forward strategy is analytically solved using the replica symmetric ansatz, which is exact in the system studied at Nishimori's temperature. In particular, the typical level of improvement in communication performance by relaying messages is shown in the case of a small and a large number of relay units. © 2007 The American Physical Society.

Properties of sparse random matrices over finite fields

Relevance:

100.00%

Abstract:

Typical properties of sparse random matrices over finite (Galois) fields are studied, in the limit of large matrices, using techniques from the physics of disordered systems. For the case of a finite field GF(q) with prime order q, we present results for the average kernel dimension, average dimension of the eigenvector spaces and the distribution of the eigenvalues. The number of matrices for a given distribution of entries is also calculated for the general case. The significance of these results to error-correcting codes and random graphs is also discussed.
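For a single matrix, the kernel dimension mentioned in this abstract can be computed directly by Gaussian elimination modulo a prime q. The sketch below does only that; it is not the paper's large-matrix averaging method, and the function names `rank_mod_p` and `kernel_dimension` are hypothetical.

```python
def rank_mod_p(matrix, p):
    """Rank of an integer matrix over GF(p), p prime, by Gaussian elimination."""
    rows = [[x % p for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry here.
        pivot = next((r for r in range(rank, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Modular inverse via three-argument pow (requires Python 3.8+).
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        # Eliminate this column from every other row.
        for r in range(n_rows):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p
                           for a, b in zip(rows[r], rows[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank

def kernel_dimension(matrix, p):
    """Dimension of the null space over GF(p): columns minus rank."""
    n_cols = len(matrix[0]) if matrix else 0
    return n_cols - rank_mod_p(matrix, p)

# The same integer matrix can have different ranks over different fields.
example = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
kernel_gf2 = kernel_dimension(example, 2)
kernel_gf3 = kernel_dimension(example, 3)
```

Averaging `kernel_dimension` over many sparse random matrices would give a numerical estimate to set against analytical results of the kind the abstract describes.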
