76 resultados para statistical mechanics many-body inverse problem graph-theory
Resumo:
The problem of learning by examples in ultrametric committee machines (UCMs) is studied within the framework of statistical mechanics. Using the replica formalism we calculate the average generalization error in UCMs with L hidden layers and for a large enough number of units. In most of the regimes studied we find that the generalization error, as a function of the number of examples presented, develops a discontinuous drop at a critical value of the load parameter. We also find that when L>1 a number of teacher networks with the same number of hidden layers and different overlaps induce learning processes with the same critical points.
Resumo:
We investigate a simplified model of two fully connected magnetic systems maintained at different temperatures by virtue of being connected to two independent thermal baths while simultaneously being interconnected with each other. Using generating functional analysis, commonly used in statistical mechanics, we find exactly soluble expressions for their individual magnetization that define a two-dimensional nonlinear map, the equations of which have the same form as those obtained for densely connected equilibrium systems. Steady states correspond to the fixed points of this map, separating the parameter space into a rich set of nonequilibrium phases that we analyze in asymptotically high and low (nonequilibrium) temperature limits. The theoretical formalism is shown to revert to the classical nonequilibrium steady state problem for two interacting systems with a nonzero heat transfer between them that catalyzes a phase transition between ambient nonequilibrium states. © 2013 American Physical Society.
On the numerical solution of a Cauchy problem in an elastostatic half-plane with a bounded inclusion
Resumo:
We propose an iterative procedure for the inverse problem of determining the displacement vector on the boundary of a bounded planar inclusion given the displacement and stress fields on an infinite (planar) line-segment. At each iteration step mixed boundary value problems in an elastostatic half-plane containing the bounded inclusion are solved. For efficient numerical implementation of the procedure these mixed problems are reduced to integral equations over the bounded inclusion. Well-posedness and numerical solution of these boundary integral equations are presented, and a proof of convergence of the procedure for the inverse problem to the original solution is given. Numerical investigations are presented both for the direct and inverse problems, and these results show in particular that the displacement vector on the boundary of the inclusion can be found in an accurate and stable way with small computational cost.
Resumo:
The inference and optimization in sparse graphs with real variables is studied using methods of statistical mechanics. Efficient distributed algorithms for the resource allocation problem are devised. Numerical simulations show excellent performance and full agreement with the theoretical results. © Springer-Verlag Berlin Heidelberg 2006.
Resumo:
Typical performance of low-density parity-check (LDPC) codes over a general binary-input output-symmetric memoryless channel is investigated using methods of statistical mechanics. Relationship between the free energy in statistical-mechanics approach and the mutual information used in the information-theory literature is established within a general framework; Gallager and MacKay-Neal codes are studied as specific examples of LDPC codes. It is shown that basic properties of these codes known for particular channels, including their potential to saturate Shannon's bound, hold for general symmetric channels. The binary-input additive-white-Gaussian-noise channel and the binary-input Laplace channel are considered as specific channel models.
Resumo:
Advances in statistical physics relating to our understanding of large-scale complex systems have recently been successfully applied in the context of communication networks. Statistical mechanics methods can be used to decompose global system behavior into simple local interactions. Thus, large-scale problems can be solved or approximated in a distributed manner with iterative lightweight local messaging. This survey discusses how statistical physics methodology can provide efficient solutions to hard network problems that are intractable by classical methods. We highlight three typical examples in the realm of networking and communications. In each case we show how a fundamental idea of statistical physics helps solve the problem in an efficient manner. In particular, we discuss how to perform multicast scheduling with message passing methods, how to improve coding using the crystallization process, and how to compute optimal routing by representing routes as interacting polymers.
Resumo:
The problem of computing the storage capacity of a feed-forward network, with L hidden layers, N inputs, and K units in the first hidden layer, is analyzed using techniques from statistical mechanics. We found that the storage capacity strongly depends on the network architecture αc ∼ (log K)1-1/2L and that the number of units K limits the number of possible hidden layers L through the relationship 2L - 1 < 2log K. © 2014 IOP Publishing Ltd.
Resumo:
One of the most pressing demands on electrophysiology applied to the diagnosis of epilepsy is the non-invasive localization of the neuronal generators responsible for brain electrical and magnetic fields (the so-called inverse problem). These neuronal generators produce primary currents in the brain, which together with passive currents give rise to the EEG signal. Unfortunately, the signal we measure on the scalp surface doesn't directly indicate the location of the active neuronal assemblies. This is the expression of the ambiguity of the underlying static electromagnetic inverse problem, partly due to the relatively limited number of independent measures available. A given electric potential distribution recorded at the scalp can be explained by the activity of infinite different configurations of intracranial sources. In contrast, the forward problem, which consists of computing the potential field at the scalp from known source locations and strengths with known geometry and conductivity properties of the brain and its layers (CSF/meninges, skin and skull), i.e. the head model, has a unique solution. The head models vary from the computationally simpler spherical models (three or four concentric spheres) to the realistic models based on the segmentation of anatomical images obtained using magnetic resonance imaging (MRI). Realistic models – computationally intensive and difficult to implement – can separate different tissues of the head and account for the convoluted geometry of the brain and the significant inter-individual variability. In real-life applications, if the assumptions of the statistical, anatomical or functional properties of the signal and the volume in which it is generated are meaningful, a true three-dimensional tomographic representation of sources of brain electrical activity is possible in spite of the ‘ill-posed’ nature of the inverse problem (Michel et al., 2004). The techniques used to achieve this are now referred to as electrical source imaging (ESI) or magnetic source imaging (MSI). The first issue to influence reconstruction accuracy is spatial sampling, i.e. the number of EEG electrodes. It has been shown that this relationship is not linear, reaching a plateau at about 128 electrodes, provided spatial distribution is uniform. The second factor is related to the different properties of the source localization strategies used with respect to the hypothesized source configuration.
Resumo:
We consider the process of opinion formation in a society of interacting agents, where there is a set B of socially accepted rules. In this scenario, we observed that agents, represented by simple feed-forward, adaptive neural networks, may have a conservative attitude (mostly in agreement with B) or liberal attitude (mostly in agreement with neighboring agents) depending on how much their opinions are influenced by their peers. The topology of the network representing the interaction of the society's members is determined by a graph, where the agents' properties are defined over the vertexes and the interagent interactions are defined over the bonds. The adaptability of the agents allows us to model the formation of opinions as an online learning process, where agents learn continuously as new information becomes available to the whole society (online learning). Through the application of statistical mechanics techniques we deduced a set of differential equations describing the dynamics of the system. We observed that by slowly varying the average peer influence in such a way that the agents attitude changes from conservative to liberal and back, the average social opinion develops a hysteresis cycle. Such hysteretic behavior disappears when the variance of the social influence distribution is large enough. In all the cases studied, the change from conservative to liberal behavior is characterized by the emergence of conservative clusters, i.e., a closed knitted set of society members that follow a leader who agrees with the social status quo when the rule B is challenged.
Resumo:
We explore the dependence of performance measures, such as the generalization error and generalization consistency, on the structure and the parameterization of the prior on `rules', instanced here by the noisy linear perceptron. Using a statistical mechanics framework, we show how one may assign values to the parameters of a model for a `rule' on the basis of data instancing the rule. Information about the data, such as input distribution, noise distribution and other `rule' characteristics may be embedded in the form of general gaussian priors for improving net performance. We examine explicitly two types of general gaussian priors which are useful in some simple cases. We calculate the optimal values for the parameters of these priors and show their effect in modifying the most probable, MAP, values for the rules.
Resumo:
An adaptive back-propagation algorithm is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, both numerical studies and a rigorous analysis show that the adaptive back-propagation method results in faster training by breaking the symmetry between hidden units more efficiently and by providing faster convergence to optimal generalization than gradient descent.
Resumo:
The learning properties of a universal approximator, a normalized committee machine with adjustable biases, are studied for on-line back-propagation learning. Within a statistical mechanics framework, numerical studies show that this model has features which do not exist in previously studied two-layer network models without adjustable biases, e.g., attractive suboptimal symmetric phases even for realizable cases and noiseless data.
Resumo:
We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This rule maximizes the total reduction in generalization error over the whole learning process. A simple example demonstrates that the locally optimal rule, which maximizes the rate of decrease in generalization error, may perform poorly in comparison.
Resumo:
The influence of biases on the learning dynamics of a two-layer neural network, a normalized soft-committee machine, is studied for on-line gradient descent learning. Within a statistical mechanics framework, numerical studies show that the inclusion of adjustable biases dramatically alters the learning dynamics found previously. The symmetric phase which has often been predominant in the original model all but disappears for a non-degenerate bias task. The extended model furthermore exhibits a much richer dynamical behavior, e.g. attractive suboptimal symmetric phases even for realizable cases and noiseless data.
Resumo:
We analyse the dynamics of a number of second order on-line learning algorithms training multi-layer neural networks, using the methods of statistical mechanics. We first consider on-line Newton's method, which is known to provide optimal asymptotic performance. We determine the asymptotic generalization error decay for a soft committee machine, which is shown to compare favourably with the result for standard gradient descent. Matrix momentum provides a practical approximation to this method by allowing an efficient inversion of the Hessian. We consider an idealized matrix momentum algorithm which requires access to the Hessian and find close correspondence with the dynamics of on-line Newton's method. In practice, the Hessian will not be known on-line and we therefore consider matrix momentum using a single example approximation to the Hessian. In this case good asymptotic performance may still be achieved, but the algorithm is now sensitive to parameter choice because of noise in the Hessian estimate. On-line Newton's method is not appropriate during the transient learning phase, since a suboptimal unstable fixed point of the gradient descent dynamics becomes stable for this algorithm. A principled alternative is to use Amari's natural gradient learning algorithm and we show how this method provides a significant reduction in learning time when compared to gradient descent, while retaining the asymptotic performance of on-line Newton's method.