5 results for Local Minima
in Aston University Research Archive
Abstract:
Attractor properties of a popular discrete-time neural network model are illustrated through numerical simulations. The most complex dynamics are found to occur within particular ranges of the parameters controlling the symmetry and magnitude of the weight matrix. A small network model is observed to produce fixed points, limit cycles, mode-locking, the Ruelle-Takens route to chaos, and the period-doubling route to chaos. Training algorithms for tuning this dynamical behaviour are discussed. Training can be an easy or difficult task, depending on whether the problem requires the use of temporal information distributed over long time intervals. Such problems require training algorithms which can handle hidden nodes. The most prominent of these algorithms, back propagation through time, solves the temporal credit assignment problem in a way which can work only if the relevant information is distributed locally in time. The Moving Targets algorithm works for the more general case, but is computationally intensive and prone to local minima.
Abstract:
A formalism recently introduced by Prugel-Bennett and Shapiro uses the methods of statistical mechanics to model the dynamics of genetic algorithms. To be of more general interest than the test cases they consider, the technique is applied in this paper to the subset sum problem, which is a combinatorial optimization problem with a strongly non-linear energy (fitness) function and many local minima under single spin flip dynamics. It is a problem which exhibits interesting dynamics, reminiscent of stabilizing selection in population biology. The dynamics are solved under certain simplifying assumptions and are reduced to a set of difference equations for a small number of relevant quantities. The quantities used are the population's cumulants, which describe its shape, and the mean correlation within the population, which measures the microscopic similarity of population members. Including the mean correlation allows a better description of the population than the cumulants alone would provide and represents a new and important extension of the technique. The formalism includes finite population effects and describes problems of realistic size. The theory is shown to agree closely with simulations of a real genetic algorithm, and the mean best energy is accurately predicted.
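To make "local minima under single spin flip dynamics" concrete, the following is a minimal sketch, not the formalism of the abstract. It assumes an energy equal to the absolute distance between the selected subset's sum and the target; the abstract does not state the exact energy function. The names `energy`, `is_local_minimum` and `local_minima` are hypothetical, and the brute-force enumeration is only feasible for very small instances.

```python
from itertools import product


def energy(weights, target, bits):
    """Assumed energy: |sum of selected weights - target| (not taken from the abstract)."""
    return abs(sum(w for w, b in zip(weights, bits) if b) - target)


def is_local_minimum(weights, target, bits):
    """Return True if no single bit (spin) flip strictly lowers the energy."""
    e0 = energy(weights, target, bits)
    for i in range(len(bits)):
        flipped = bits[:i] + (1 - bits[i],) + bits[i + 1:]
        if energy(weights, target, flipped) < e0:
            return False
    return True


def local_minima(weights, target):
    """Enumerate all 2^n configurations and keep the single-flip local minima."""
    return [
        bits
        for bits in product((0, 1), repeat=len(weights))
        if is_local_minimum(weights, target, bits)
    ]


if __name__ == "__main__":
    # Toy instance: weights 3, 5, 7 with target 10.
    for bits in local_minima([3, 5, 7], 10):
        print(bits, energy([3, 5, 7], 10, bits))
```

In this toy instance, two of the three local minima have non-zero energy, so a search that only accepts improving single flips can stall there without reaching the exact solution.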
Abstract:
We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system produces not only a classification of the digit but also a rich description of the instantiation parameters, which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition-driven segmentation. (3) The method involves a relatively small number of parameters, and hence training is relatively easy and fast. (4) Unlike many other recognition schemes, it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations and a limited degree of image rotation. We have demonstrated that our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is that it requires much more computation than more standard OCR techniques.
Abstract:
In this thesis we use statistical physics techniques to study the typical performance of four families of error-correcting codes based on very sparse linear transformations: Sourlas codes, Gallager codes, MacKay-Neal codes and Kanter-Saad codes. We map the decoding problem onto an Ising spin system with multi-spin interactions. We then employ the replica method to calculate averages over the quenched disorder represented by the code constructions, the arbitrary messages and the random noise vectors. We find, as the noise level increases, a phase transition between successful decoding and failure phases. In most cases, this phase transition coincides with upper bounds derived in the information theory literature. We connect the practical decoding algorithm known as probability propagation with the task of finding local minima of the related Bethe free-energy. We show that the practical decoding thresholds correspond to noise levels where suboptimal minima of the free-energy emerge. Simulations of practical decoding scenarios using probability propagation agree with theoretical predictions of the replica symmetric theory. The typical performance predicted by the thermodynamic phase transitions is shown to be attainable in computation times that grow exponentially with the system size. We use the insights obtained to design a method to calculate the performance and optimise the parameters of the high performance codes proposed by Kanter and Saad.
Abstract:
Congenital nystagmus is an ocular-motor disorder characterised by involuntary, conjugate and bilateral to-and-fro ocular oscillations. This study presents a method to automatically recognise jerk waveforms in a congenital nystagmus recording and to compute foveation time and foveation position variability. The recordings were performed with subjects looking at visual targets presented in nine eye gaze positions; data were segmented into blocks corresponding to each gaze position. The nystagmus cycles were identified by searching for local minima and maxima (SpEp sequence) in intervals centred on each slope change of the eye position signal (position criterion). The SpEp sequence was then refined using an adaptive threshold applied to the eye velocity signal; the outcome is a robust detection of each slow phase start point, which is fundamental to computing some nystagmus parameters accurately. A total of 1206 slow phases was used to compute the specificity in waveform recognition when applying only the position criterion or adding the adaptive threshold; results showed an increase in negative predictive value of 25.1% using both features. The duration of each foveation window was measured either on raw data or using an interpolating function of the congenital nystagmus slow phases; the second approach yielded a foveation time estimate less sensitive to noise. © 2010.
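The position criterion described above can be illustrated with a minimal sketch: locate slope changes in a sampled position signal, then pick the extreme sample in a window centred on each one. This is an assumption-laden illustration, not the study's implementation. The function names and the `half_window` parameter are hypothetical, flat segments (zero slope) are ignored, and the adaptive velocity threshold used for refinement is not modelled.

```python
def slope_change_indices(signal):
    """Indices where the first difference changes sign (candidate turning points)."""
    indices = []
    for i in range(1, len(signal) - 1):
        left = signal[i] - signal[i - 1]
        right = signal[i + 1] - signal[i]
        if left * right < 0:  # strict sign change; plateaus are skipped
            indices.append(i)
    return indices


def windowed_extrema(signal, half_window):
    """For each slope change, return (index, kind) of the extreme sample
    inside a window of +/- half_window samples centred on it."""
    extrema = []
    for i in slope_change_indices(signal):
        lo = max(0, i - half_window)
        hi = min(len(signal), i + half_window + 1)
        rising_before = signal[i] > signal[i - 1]
        if rising_before:
            pick = max(range(lo, hi), key=lambda k: signal[k])
            kind = "max"
        else:
            pick = min(range(lo, hi), key=lambda k: signal[k])
            kind = "min"
        if (pick, kind) not in extrema:  # avoid duplicates from overlapping windows
            extrema.append((pick, kind))
    return extrema


if __name__ == "__main__":
    # Hypothetical eye-position samples (arbitrary units).
    position = [0, 1, 2, 3, 1, 0, 1, 2]
    print(windowed_extrema(position, half_window=1))
```

On noisy recordings a wide window can pull the selected sample away from the true turning point, which is one reason a second criterion such as the velocity threshold mentioned in the abstract may be needed.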