31 resultados para STATISTICAL INFORMATION

em Aston University Research Archive


Relevância:

40.00% 40.00%

Publicador:

Resumo:

An unsupervised learning procedure based on maximizing the mutual information between the outputs of two networks receiving different but statistically dependent inputs is analyzed (Becker S. and Hinton G., Nature, 355 (1992) 161). By exploiting a formal analogy to supervised learning in parity machines, the theory of zero-temperature Gibbs learning for the unsupervised procedure is presented for the case that the networks are perceptrons and for the case of fully connected committees.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Information systems have developed to the stage that there is plenty of data available in most organisations but there are still major problems in turning that data into information for management decision making. This thesis argues that the link between decision support information and transaction processing data should be through a common object model which reflects the real world of the organisation and encompasses the artefacts of the information system. The CORD (Collections, Objects, Roles and Domains) model is developed which is richer in appropriate modelling abstractions than current Object Models. A flexible Object Prototyping tool based on a Semantic Data Storage Manager has been developed which enables a variety of models to be stored and experimented with. A statistical summary table model COST (Collections of Objects Statistical Table) has been developed within CORD and is shown to be adequate to meet the modelling needs of Decision Support and Executive Information Systems. The COST model is supported by a statistical table creator and editor COSTed which is also built on top of the Object Prototyper and uses the CORD model to manage its metadata.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The statistical distribution, when determined from an incomplete set of constraints, is shown to be suitable as host for encrypted information. We design an encoding/decoding scheme to embed such a distribution with hidden information. The encryption security is based on the extreme instability of the encoding procedure. The essential feature of the proposed system lies in the fact that the key for retrieving the code is generated by random perturbations of very small value. The security of the proposed encryption relies on the security to interchange the secret key. Hence, it appears as a good complement to the quantum key distribution protocol. © 2005 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Neural networks can be regarded as statistical models, and can be analysed in a Bayesian framework. Generalisation is measured by the performance on independent test data drawn from the same distribution as the training data. Such performance can be quantified by the posterior average of the information divergence between the true and the model distributions. Averaging over the Bayesian posterior guarantees internal coherence; Using information divergence guarantees invariance with respect to representation. The theory generalises the least mean squares theory for linear Gaussian models to general problems of statistical estimation. The main results are: (1)~the ideal optimal estimate is always given by average over the posterior; (2)~the optimal estimate within a computational model is given by the projection of the ideal estimate to the model. This incidentally shows some currently popular methods dealing with hyperpriors are in general unnecessary and misleading. The extension of information divergence to positive normalisable measures reveals a remarkable relation between the dlt dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces Ld and Ldd. It therefore offers conceptual simplification to information geometry. The general conclusion on the issue of evaluating neural network learning rules and other statistical inference methods is that such evaluations are only meaningful under three assumptions: The prior P(p), describing the environment of all the problems; the divergence Dd, specifying the requirement of the task; and the model Q, specifying available computing resources.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Neural networks are statistical models and learning rules are estimators. In this paper a theory for measuring generalisation is developed by combining Bayesian decision theory with information geometry. The performance of an estimator is measured by the information divergence between the true distribution and the estimate, averaged over the Bayesian posterior. This unifies the majority of error measures currently in use. The optimal estimators also reveal some intricate interrelationships among information geometry, Banach spaces and sufficient statistics.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We analyse the dynamics of a number of second order on-line learning algorithms training multi-layer neural networks, using the methods of statistical mechanics. We first consider on-line Newton's method, which is known to provide optimal asymptotic performance. We determine the asymptotic generalization error decay for a soft committee machine, which is shown to compare favourably with the result for standard gradient descent. Matrix momentum provides a practical approximation to this method by allowing an efficient inversion of the Hessian. We consider an idealized matrix momentum algorithm which requires access to the Hessian and find close correspondence with the dynamics of on-line Newton's method. In practice, the Hessian will not be known on-line and we therefore consider matrix momentum using a single example approximation to the Hessian. In this case good asymptotic performance may still be achieved, but the algorithm is now sensitive to parameter choice because of noise in the Hessian estimate. On-line Newton's method is not appropriate during the transient learning phase, since a suboptimal unstable fixed point of the gradient descent dynamics becomes stable for this algorithm. A principled alternative is to use Amari's natural gradient learning algorithm and we show how this method provides a significant reduction in learning time when compared to gradient descent, while retaining the asymptotic performance of on-line Newton's method.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we review recent theoretical approaches for analysing the dynamics of on-line learning in multilayer neural networks using methods adopted from statistical physics. The analysis is based on monitoring a set of macroscopic variables from which the generalisation error can be calculated. A closed set of dynamical equations for the macroscopic variables is derived analytically and solved numerically. The theoretical framework is then employed for defining optimal learning parameters and for analysing the incorporation of second order information into the learning process using natural gradient descent and matrix-momentum based methods. We will also briefly explain an extension of the original framework for analysing the case where training examples are sampled with repetition.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The performance of "typical set (pairs) decoding" for ensembles of Gallager's linear code is investigated using statistical physics. In this decoding method, errors occur, either when the information transmission is corrupted by atypical noise, or when multiple typical sequences satisfy the parity check equation as provided by the received corrupted codeword. We show that the average error rate for the second type of error over a given code ensemble can be accurately evaluated using the replica method, including the sensitivity to message length. Our approach generally improves the existing analysis known in the information theory community, which was recently reintroduced in IEEE Trans. Inf. Theory 45, 399 (1999), and is believed to be the most accurate to date. © 2002 The American Physical Society.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The modem digital communication systems are made transmission reliable by employing error correction technique for the redundancies. Codes in the low-density parity-check work along the principles of Hamming code, and the parity-check matrix is very sparse, and multiple errors can be corrected. The sparseness of the matrix allows for the decoding process to be carried out by probability propagation methods similar to those employed in Turbo codes. The relation between spin systems in statistical physics and digital error correcting codes is based on the existence of a simple isomorphism between the additive Boolean group and the multiplicative binary group. Shannon proved general results on the natural limits of compression and error-correction by setting up the framework known as information theory. Error-correction codes are based on mapping the original space of words onto a higher dimensional space in such a way that the typical distance between encoded words increases.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We review recent theoretical progress on the statistical mechanics of error correcting codes, focusing on low-density parity-check (LDPC) codes in general, and on Gallager and MacKay-Neal codes in particular. By exploiting the relation between LDPC codes and Ising spin systems with multispin interactions, one can carry out a statistical mechanics based analysis that determines the practical and theoretical limitations of various code constructions, corresponding to dynamical and thermodynamical transitions, respectively, as well as the behaviour of error-exponents averaged over the corresponding code ensemble as a function of channel noise. We also contrast the results obtained using methods of statistical mechanics with those derived in the information theory literature, and show how these methods can be generalized to include other channel types and related communication problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A novel approach, based on statistical mechanics, to analyze typical performance of optimum code-division multiple-access (CDMA) multiuser detectors is reviewed. A `black-box' view ot the basic CDMA channel is introduced, based on which the CDMA multiuser detection problem is regarded as a `learning-from-examples' problem of the `binary linear perceptron' in the neural network literature. Adopting Bayes framework, analysis of the performance of the optimum CDMA multiuser detectors is reduced to evaluation of the average of the cumulant generating function of a relevant posterior distribution. The evaluation of the average cumulant generating function is done, based on formal analogy with a similar calculation appearing in the spin glass theory in statistical mechanics, by making use of the replica method, a method developed in the spin glass theory.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a method based on the magnetization enumerator to determine the critical noise level for Gallager type low density parity check error correcting codes (LDPC). Our method provides an appealingly simple interpretation to the relation between different decoding schemes, and provides more optimistic critical noise levels than those reported in the information theory literature.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We analyze, using the replica method of statistical mechanics, the theoretical performance of coded code-division multiple-access (CDMA) systems in which regular low-density parity-check (LDPC) codes are used for channel coding.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We investigate the use of Gallager's low-density parity-check (LDPC) codes in a degraded broadcast channel, one of the fundamental models in network information theory. Combining linear codes is a standard technique in practical network communication schemes and is known to provide better performance than simple time sharing methods when algebraic codes are used. The statistical physics based analysis shows that the practical performance of the suggested method, achieved by employing the belief propagation algorithm, is superior to that of LDPC based time sharing codes while the best performance, when received transmissions are optimally decoded, is bounded by the time sharing limit.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Fourier-phase information is important in determining the appearance of natural scenes, but the structure of natural-image phase spectra is highly complex and difficult to relate directly to human perceptual processes. This problem is addressed by extending previous investigations of human visual sensitivity to the randomisation and quantisation of Fourier phase in natural images. The salience of the image changes induced by these physical processes is shown to depend critically on the nature of the original phase spectrum of each image, and the processes of randomisation and quantisation are shown to be perceptually equivalent provided that they shift image phase components by the same average amount. These results are explained by assuming that the visual system is sensitive to those phase-domain image changes which also alter certain global higher-order image statistics. This assumption may be used to place constraints on the likely nature of cortical processing: mechanisms which correlate the outputs of a bank of relative-phase-sensitive units are found to be consistent with the patterns of sensitivity reported here.