61 results for Information display systems.
Abstract:
For neural networks with a wide class of weight-priors, it can be shown that in the limit of an infinite number of hidden units the prior over functions tends to a Gaussian process. In this paper analytic forms are derived for the covariance function of the Gaussian processes corresponding to networks with sigmoidal and Gaussian hidden units. This allows predictions to be made efficiently using networks with an infinite number of hidden units, and shows that, somewhat paradoxically, it may be easier to compute with infinite networks than finite ones.
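As a rough illustration of computing with an infinite network, the sketch below uses the arcsine covariance commonly quoted for error-function (sigmoidal) hidden units. The abstract itself does not state the formula, so the exact form, the isotropic weight prior `sigma2`, and the helper names `arcsin_kernel` and `gp_predict_mean` are assumptions made for this example only.

```python
import numpy as np

def arcsin_kernel(X1, X2, sigma2=1.0):
    """Arcsine covariance for an infinite network of erf hidden units.

    Assumption: input-to-hidden weights (including the bias) have an
    isotropic Gaussian prior with variance sigma2. Inputs are augmented
    with a constant 1 to account for the bias.
    """
    A1 = np.hstack([np.ones((X1.shape[0], 1)), X1])
    A2 = np.hstack([np.ones((X2.shape[0], 1)), X2])
    cross = 2.0 * sigma2 * A1 @ A2.T
    n1 = 1.0 + 2.0 * sigma2 * np.sum(A1 * A1, axis=1)
    n2 = 1.0 + 2.0 * sigma2 * np.sum(A2 * A2, axis=1)
    # The ratio lies strictly inside (-1, 1), so arcsin is well defined.
    return (2.0 / np.pi) * np.arcsin(cross / np.sqrt(np.outer(n1, n2)))

def gp_predict_mean(X_train, y_train, X_test, noise_var=1e-2):
    """Posterior mean of GP regression with the kernel above."""
    K = arcsin_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_star = arcsin_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, y_train)
```

Prediction then costs one linear solve in the number of training points, with no sum over hidden units, which is the sense in which the infinite network can be easier to compute with than a finite one.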
Abstract:
Gaussian processes provide natural non-parametric prior distributions over regression functions. In this paper we consider regression problems where there is noise on the output, and the variance of the noise depends on the inputs. If we assume that the noise is a smooth function of the inputs, then it is natural to model the noise variance using a second Gaussian process, in addition to the Gaussian process governing the noise-free output value. We show that prior uncertainty about the parameters controlling both processes can be handled and that the posterior distribution of the noise rate can be sampled from using Markov chain Monte Carlo methods. Our results on a synthetic data set give a posterior noise variance that approximates the true variance well.
Abstract:
In most treatments of the regression problem it is assumed that the distribution of target data can be described by a deterministic function of the inputs, together with additive Gaussian noise having constant variance. The use of maximum likelihood to train such models then corresponds to the minimization of a sum-of-squares error function. In many applications a more realistic model would allow the noise variance itself to depend on the input variables. However, the use of maximum likelihood to train such models would give highly biased results. In this paper we show how a Bayesian treatment can allow for an input-dependent variance while overcoming the bias of maximum likelihood.
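The link between maximum likelihood and the sum-of-squares error mentioned above can be written out explicitly. The notation here (targets $t_n$, inputs $x_n$, regression function $f(x; w)$, noise variance $\sigma^2$) is assumed for illustration and is not taken from the paper.

```latex
% Constant noise variance: maximizing the likelihood over w is the same as
% minimizing the sum of squares, because the second term does not depend on w.
-\ln p(t \mid x, w, \sigma^2)
  = \frac{1}{2\sigma^2} \sum_{n=1}^{N} \bigl( t_n - f(x_n; w) \bigr)^2
  + \frac{N}{2} \ln\bigl( 2\pi\sigma^2 \bigr)

% Input-dependent noise variance: each squared residual is weighted by its own
% variance, and the log-variance term can no longer be dropped.
-\ln p(t \mid x, w)
  = \sum_{n=1}^{N} \left[
      \frac{\bigl( t_n - f(x_n; w) \bigr)^2}{2\sigma^2(x_n)}
      + \frac{1}{2} \ln\bigl( 2\pi\sigma^2(x_n) \bigr)
    \right]
```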
Abstract:
We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule.
Abstract:
Gaussian processes provide good prior models for spatial data, but can be too smooth. In many physical situations there are discontinuities along bounding surfaces, for example fronts in near-surface wind fields. We describe a modelling method for such a constrained discontinuity and demonstrate how to infer the model parameters in wind fields with MCMC sampling.
Abstract:
We show the similarity between belief propagation and TAP for decoding corrupted messages encoded by Sourlas's method. The latter is a special case of the Gallager error-correcting code, where the code word comprises products of K bits selected randomly from the original message. We examine the efficacy of solutions obtained by the two methods for various values of K and show that solutions for K>=3 may be sensitive to the choice of initial conditions in the case of unbiased patterns. Good approximations are obtained generally for K=2 and for biased patterns in the case of K>=3, especially when Nishimori's temperature is being used.
Abstract:
We develop an approach to sparse representations of Gaussian process (GP) models in order to overcome the limitations of GPs caused by large data sets. The method is based on a combination of a Bayesian online algorithm together with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the model. Experimental results on toy examples and large real-world datasets indicate the efficiency of the approach.
Abstract:
Based on a simple convexity lemma, we develop bounds for different types of Bayesian prediction errors for regression with Gaussian processes. The basic bounds are formulated for a fixed training set. Simpler expressions are obtained for sampling from an input distribution which equals the weight function of the covariance kernel, yielding asymptotically tight results. The results are compared with numerical experiments.
Abstract:
We discuss the application of TAP mean field methods known from the statistical mechanics of disordered systems to Bayesian classification with Gaussian processes. In contrast to previous applications, no knowledge about the distribution of inputs is needed. Simulation results for the Sonar data set are given.
Abstract:
We analyse Gallager codes by employing a simple mean-field approximation that distorts the model geometry and preserves important interactions between sites. The method naturally recovers the probability propagation decoding algorithm as a minimization of a proper free energy. We find a thermodynamical phase transition that coincides with information-theoretic upper bounds and explain the practical code performance in terms of the free-energy landscape.
Abstract:
We combine the replica approach from statistical physics with a variational approach to analyze learning curves analytically. We apply the method to Gaussian process regression. As a main result we derive approximate relations between empirical error measures, the generalization error and the posterior variance.
Abstract:
The problem of resource allocation in sparse graphs with real variables is studied using methods of statistical physics. An efficient distributed algorithm is devised on the basis of insight gained from the analysis and is examined using numerical simulations, showing excellent performance and full agreement with the theoretical results.
Abstract:
We consider the problem of illusory or artefactual structure from the visualisation of high-dimensional structureless data. In particular we examine the role of the distance metric in the use of topographic mappings based on the statistical field of multidimensional scaling. We show that the use of a squared Euclidean metric (i.e. the SSTRESS measure) gives rise to an annular structure when the input data is drawn from a high-dimensional isotropic distribution, and we provide a theoretical justification for this observation.
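One ingredient that is easy to check numerically is how squared Euclidean distances behave for isotropic data as the dimension grows. The sketch below is not the paper's derivation. It only measures the relative spread of pairwise squared distances for standard Gaussian samples, and the function name `squared_distance_spread` and its parameters are hypothetical.

```python
import numpy as np

def squared_distance_spread(dim, n_points=200, seed=0):
    """Mean and relative spread (std / mean) of pairwise squared distances.

    Assumption: data are drawn from a standard isotropic Gaussian in `dim`
    dimensions, as a stand-in for "structureless" high-dimensional data.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_points, dim))
    sq_norms = np.sum(X * X, axis=1)
    # Squared distances via |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    D2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    d2 = D2[np.triu_indices(n_points, k=1)]  # each distinct pair once
    return d2.mean(), d2.std() / d2.mean()

for dim in (2, 20, 200, 2000):
    mean, spread = squared_distance_spread(dim)
    print(f"dim={dim:5d}  mean={mean:9.2f}  relative spread={spread:.3f}")
```

For this Gaussian case the squared distance between two points is twice a chi-squared variable with `dim` degrees of freedom, so the mean grows like `2 * dim` while the relative spread shrinks like `sqrt(2 / dim)`. In high dimensions all points therefore look nearly equidistant, which gives a low-dimensional map little genuine structure to preserve.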
Abstract:
Spread spectrum systems make use of radio frequency bandwidths which far exceed the minimum bandwidth necessary to transmit the basic message information. These systems are designed to provide satisfactory communication of the message information under difficult transmission conditions. Frequency-hopped multilevel frequency shift keying (FH-MFSK) is one of the many techniques used in spread spectrum systems. It is a combination of frequency hopping and time hopping. In this system many users share a common frequency band using code division multiplexing. Each user is assigned an address and the message is modulated into the address. The receiver, knowing the address, decodes the received signal and extracts the message. This technique is suggested for digital mobile telephony. This thesis is concerned with an investigation of the possibility of utilising FH-MFSK for data transmission corrupted by additive white Gaussian noise (AWGN). Work related to FH-MFSK has so far been mostly confined to its validity, and its performance in the presence of AWGN has not been reported before. An experimental system was therefore constructed which utilised combined hardware and software and operated under the supervision of a microprocessor system. The experimental system was used to develop an error-rate model for the system under investigation. The performance of FH-MFSK for data transmission was established in the presence of AWGN and with deleted and delayed sample effects. Its capability for multiuser applications was determined theoretically. The results show that FH-MFSK is a suitable technique for data transmission in the presence of AWGN.
Abstract:
Diffusion processes are a family of continuous-time continuous-state stochastic processes that are in general only partially observed. The joint estimation of the forcing parameters and the system noise (volatility) in these dynamical systems is a crucial, but non-trivial task, especially when the system is nonlinear and multimodal. We propose a variational treatment of diffusion processes, which allows us to compute type II maximum likelihood estimates of the parameters by simple gradient techniques and which is computationally less demanding than most MCMC approaches. We also show how a cheap estimate of the posterior over the parameters can be constructed based on the variational free energy.