981 results for Eskander, Saad


Abstract:

We investigate the performance of error-correcting codes, where the code word comprises products of K bits selected from the original message and decoding is carried out utilizing a connectivity tensor with C connections per index. Shannon's bound for the channel capacity is recovered for large K and zero temperature when the code rate K/C is finite. Close to optimal error-correcting capability is obtained for finite K and C. We examine the finite-temperature case to assess the use of simulated annealing for decoding and extend the analysis to accommodate other types of noisy channels.

Abstract:

We analyse the dynamics of a number of second order on-line learning algorithms training multi-layer neural networks, using the methods of statistical mechanics. We first consider on-line Newton's method, which is known to provide optimal asymptotic performance. We determine the asymptotic generalization error decay for a soft committee machine, which is shown to compare favourably with the result for standard gradient descent. Matrix momentum provides a practical approximation to this method by allowing an efficient inversion of the Hessian. We consider an idealized matrix momentum algorithm which requires access to the Hessian and find close correspondence with the dynamics of on-line Newton's method. In practice, the Hessian will not be known on-line and we therefore consider matrix momentum using a single example approximation to the Hessian. In this case good asymptotic performance may still be achieved, but the algorithm is now sensitive to parameter choice because of noise in the Hessian estimate. On-line Newton's method is not appropriate during the transient learning phase, since a suboptimal unstable fixed point of the gradient descent dynamics becomes stable for this algorithm. A principled alternative is to use Amari's natural gradient learning algorithm and we show how this method provides a significant reduction in learning time when compared to gradient descent, while retaining the asymptotic performance of on-line Newton's method.

Abstract:

We analyse natural gradient learning in a two-layer feed-forward neural network using a statistical mechanics framework which is appropriate for large input dimension. We find significant improvement over standard gradient descent in both the transient and asymptotic phases of learning.
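As a rough illustration of what a natural gradient step looks like, the sketch below preconditions the ordinary gradient with the inverse Fisher information matrix. It is a toy under stated assumptions, not the two-layer network analysed in the abstract: the model is a one-dimensional Gaussian with parameters (mu, sigma), whose Fisher matrix is diag(1/sigma^2, 2/sigma^2), and the function name `natural_gradient_fit` and its default values are hypothetical.

```python
def natural_gradient_fit(data, mu=0.0, sigma=1.0, lr=0.5, steps=200):
    """Fit a Gaussian N(mu, sigma^2) by natural gradient descent (toy sketch)."""
    n = len(data)
    mean = sum(data) / n
    for _ in range(steps):
        # Second moment of the data around the current mean estimate.
        m2 = sum((x - mu) ** 2 for x in data) / n
        # Ordinary gradient of the average negative log-likelihood.
        g_mu = -(mean - mu) / sigma ** 2
        g_sigma = 1.0 / sigma - m2 / sigma ** 3
        # The Fisher matrix in (mu, sigma) coordinates is diag(1/sigma^2, 2/sigma^2),
        # so multiplying by its inverse only rescales each component.
        nat_mu = g_mu * sigma ** 2
        nat_sigma = g_sigma * sigma ** 2 / 2.0
        mu -= lr * nat_mu
        sigma -= lr * nat_sigma
    return mu, sigma
```

In this toy case the preconditioned step for mu no longer depends on sigma, which is the kind of parametrization-independent behaviour that motivates the method.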

Abstract:

We analyse the matrix momentum algorithm, which provides an efficient approximation to on-line Newton's method, by extending a recent statistical mechanics framework to include second order algorithms. We study the efficacy of this method when the Hessian is available and also consider a practical implementation which uses a single example estimate of the Hessian. The method is shown to provide excellent asymptotic performance, although the single example implementation is sensitive to the choice of training parameters. We conjecture that matrix momentum could provide efficient matrix inversion for other second order algorithms.
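The idea of replacing explicit Hessian inversion by a momentum term can be sketched as follows. The abstract does not spell out the update rule, so the form used here is an assumption: the weight change is `-lr * gradient + (I - k * H) * previous_change`, applied to a simple quadratic error with a known Hessian `H`. The function name `matrix_momentum_minimize` and the parameter values are hypothetical.

```python
def matrix_momentum_minimize(H, w, lr=0.02, k=0.1, steps=500):
    """Minimize E(w) = 0.5 * w^T H w using a momentum matrix (I - k * H)."""

    def matvec(M, v):
        return [sum(m * x for m, x in zip(row, v)) for row in M]

    v = [0.0] * len(w)  # previous weight change
    for _ in range(steps):
        grad = matvec(H, w)  # gradient of the quadratic error
        Hv = matvec(H, v)    # Hessian-vector product; H is never inverted
        # New change: gradient step plus matrix momentum (I - k * H) applied to v.
        v = [-lr * g + (vi - k * hv) for g, vi, hv in zip(grad, v, Hv)]
        w = [wi + vi for wi, vi in zip(w, v)]
    return w
```

Only matrix-vector products with the Hessian appear in the loop, which is what makes the scheme cheap compared with forming an inverse.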

Abstract:

We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule.

Abstract:

We investigate the performance of parity check codes using the mapping onto spin glasses proposed by Sourlas. We study codes where each parity check comprises products of K bits selected from the original digital message with exactly C parity checks per message bit. We show, using the replica method, that these codes saturate Shannon's coding bound for K → ∞ when the code rate K/C is finite. We then examine the finite temperature case to assess the use of simulated annealing methods for decoding, study the performance of the finite K case and extend the analysis to accommodate different types of noisy channels. The analogy between statistical physics methods and decoding by belief propagation is also discussed.
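The encoding described in this abstract can be sketched in a few lines. The snippet is an illustrative assumption, not the authors' construction: message bits are represented as ±1 spins, checks are built by shuffling C copies of every message index and cutting the list into groups of K (retrying when a group repeats an index), and each codeword entry is the product of its K selected bits. The name `sourlas_encode` and the shuffle-and-retry scheme are hypothetical.

```python
import random
from math import prod


def sourlas_encode(message, K, C, seed=0):
    """Encode a +/-1 message as products of K message bits (illustrative sketch)."""
    n = len(message)
    if (n * C) % K != 0:
        raise ValueError("len(message) * C must be divisible by K")
    rng = random.Random(seed)
    # Each message index appears exactly C times across all checks.
    stubs = [i for i in range(n) for _ in range(C)]
    while True:
        rng.shuffle(stubs)
        checks = [tuple(stubs[j:j + K]) for j in range(0, len(stubs), K)]
        # Reject draws in which a check selects the same bit twice.
        if all(len(set(check)) == K for check in checks):
            break
    codeword = [prod(message[i] for i in check) for check in checks]
    return checks, codeword
```

With N message bits this yields N*C/K codeword entries, so the ratio of message length to codeword length is K/C, matching the code rate quoted in the abstract.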

Abstract:

The dynamics of on-line learning is investigated for structurally unrealizable tasks in the context of two-layer neural networks with an arbitrary number of hidden neurons. Within a statistical mechanics framework, a closed set of differential equations describing the learning dynamics can be derived for the general case of unrealizable isotropic tasks. In the asymptotic regime, the dynamics can be solved analytically in the limit of a large number of hidden neurons, providing an analytical expression for the residual generalization error, the optimal and critical asymptotic training parameters, and the corresponding prefactor of the generalization error decay.

Abstract:

We show the similarity between belief propagation and TAP, for decoding corrupted messages encoded by Sourlas's method. The latter is a special case of the Gallager error-correcting code, where the code word comprises products of K bits selected randomly from the original message. We examine the efficacy of solutions obtained by the two methods for various values of K and show that solutions for K ≥ 3 may be sensitive to the choice of initial conditions in the case of unbiased patterns. Good approximations are obtained generally for K = 2 and for biased patterns in the case of K ≥ 3, especially when Nishimori's temperature is being used.

Abstract:

We investigate the performance of Gallager-type error-correcting codes for Binary Symmetric Channels, where the code word comprises products of K bits selected from the original message and decoding is carried out utilizing a connectivity tensor with C connections per index. Shannon's bound for the channel capacity is recovered for large K and zero temperature when the code rate K/C is finite. Close to optimal error-correcting capability, with improved decoding properties, is obtained for finite K and C.
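For reference, Shannon's bound for a Binary Symmetric Channel with flip probability p is the capacity 1 - H2(p), where H2 is the binary entropy in bits. The short sketch below computes it and checks whether a code rate K/C lies below capacity. The function names are hypothetical, and the snippet only evaluates the bound; it says nothing about whether a particular finite K and C code attains it.

```python
import math


def binary_entropy(p):
    """Binary entropy H2(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(p):
    """Capacity of a Binary Symmetric Channel with flip probability p."""
    return 1.0 - binary_entropy(p)


def rate_below_capacity(K, C, p):
    """True if the code rate K/C is below the channel capacity for flip rate p."""
    return K / C < bsc_capacity(p)
```

For a rate 1/2 code (for example K = 2, C = 4) the bound is crossed at a flip probability of roughly 0.11.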

Abstract:

Natural gradient learning is an efficient and principled method for improving on-line learning. In practical applications, estimating and inverting the Fisher information matrix incurs an additional cost. We propose to use the matrix momentum algorithm in order to carry out efficient inversion and study the efficacy of a single step estimation of the Fisher information matrix. We analyse the proposed algorithm in a two-layer network, using a statistical mechanics framework which allows us to describe analytically the learning dynamics, and compare performance with true natural gradient learning and standard gradient descent.

Abstract:

A major problem in modern probabilistic modeling is the huge computational complexity involved in typical calculations with multivariate probability distributions when the number of random variables is large. Because exact computations are infeasible in such cases and Monte Carlo sampling techniques may reach their limits, there is a need for methods that allow for efficient approximate computations. One of the simplest approximations is based on the mean field method, which has a long history in statistical physics. The method is widely used, particularly in the growing field of graphical models. Researchers from disciplines such as statistical physics, computer science, and mathematical statistics are studying ways to improve this and related methods and are exploring novel application areas. Leading approaches include the variational approach, which goes beyond factorizable distributions to achieve systematic improvements; the TAP (Thouless-Anderson-Palmer) approach, which incorporates correlations by including effective reaction terms in the mean field theory; and the more general methods of graphical models. Bringing together ideas and techniques from these diverse disciplines, this book covers the theoretical foundations of advanced mean field methods, explores the relation between the different approaches, examines the quality of the approximation obtained, and demonstrates their application to various areas of probabilistic modeling.
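The simplest member of this family, the naive mean field approximation for an Ising-type model, can be sketched as a fixed-point iteration on the magnetizations, m_i = tanh(beta * (h_i + sum_j J_ij m_j)). The snippet is a minimal illustration under stated assumptions: the couplings `J`, fields `h`, damping factor and function name are hypothetical, and it omits the reaction terms that the TAP approach adds.

```python
import math


def naive_mean_field(J, h, beta=1.0, iters=500, damping=0.5):
    """Iterate the naive mean field equations for an Ising model (sketch)."""
    n = len(h)
    m = [0.0] * n  # magnetizations, started at zero
    for _ in range(iters):
        new = [
            math.tanh(beta * (h[i] + sum(J[i][j] * m[j] for j in range(n))))
            for i in range(n)
        ]
        # Damped update: mix the old and new values to help convergence.
        m = [damping * old + (1 - damping) * upd for old, upd in zip(m, new)]
    return m
```

Each spin sees only the average field produced by its neighbours, which is what makes the computation cheap compared with summing over all joint configurations.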

Abstract:

Gallager-type error-correcting codes that nearly saturate Shannon's bound are constructed using insight gained from mapping the problem onto that of an Ising spin system. The performance of the suggested codes is evaluated for different code rates in both finite and infinite message length.

Abstract:

The efficacy of a specially constructed Gallager-type error-correcting code for communication over a Gaussian channel is examined. The construction is based on the introduction of complex matrices, used in both encoding and decoding, which comprise sub-matrices of cascading connection values. The finite-size effects are estimated in order to compare the results with the bounds set by Shannon. The critical noise level achieved for certain code rates and infinitely large systems nearly saturates the bounds set by Shannon even when the connectivity used is low.

Abstract:

The performance of Gallager's error-correcting code is investigated via methods of statistical physics. In this method, the transmitted codeword comprises products of the original message bits selected by two randomly constructed sparse matrices; the number of non-zero row/column elements in these matrices defines a family of codes. We show that Shannon's channel capacity is saturated for many of the codes, while slightly lower performance is obtained for others which may be of higher practical relevance. Decoding aspects are considered by employing the TAP approach, which is identical to the commonly used belief-propagation-based decoding.