14 resultados para Gaussian scale mixture
em Aston University Research Archive
Resumo:
To make vision possible, the visual nervous system must represent the most informative features in the light pattern captured by the eye. Here we use Gaussian scale-space theory to derive a multiscale model for edge analysis and we test it in perceptual experiments. At all scales there are two stages of spatial filtering. An odd-symmetric, Gaussian first derivative filter provides the input to a Gaussian second derivative filter. Crucially, the output at each stage is half-wave rectified before feeding forward to the next. This creates nonlinear channels selectively responsive to one edge polarity while suppressing spurious or "phantom" edges. The two stages have properties analogous to simple and complex cells in the visual cortex. Edges are found as peaks in a scale-space response map that is the output of the second stage. The position and scale of the peak response identify the location and blur of the edge. The model predicts remarkably accurately our results on human perception of edge location and blur for a wide range of luminance profiles, including the surprising finding that blurred edges look sharper when their length is made shorter. The model enhances our understanding of early vision by integrating computational, physiological, and psychophysical approaches. © ARVO.
Resumo:
Mixture Density Networks are a principled method to model conditional probability density functions which are non-Gaussian. This is achieved by modelling the conditional distribution for each pattern with a Gaussian Mixture Model for which the parameters are generated by a neural network. This thesis presents a novel method to introduce regularisation in this context for the special case where the mean and variance of the spherical Gaussian Kernels in the mixtures are fixed to predetermined values. Guidelines for how these parameters can be initialised are given, and it is shown how to apply the evidence framework to mixture density networks to achieve regularisation. This also provides an objective stopping criteria that can replace the `early stopping' methods that have previously been used. If the neural network used is an RBF network with fixed centres this opens up new opportunities for improved initialisation of the network weights, which are exploited to start training relatively close to the optimum. The new method is demonstrated on two data sets. The first is a simple synthetic data set while the second is a real life data set, namely satellite scatterometer data used to infer the wind speed and wind direction near the ocean surface. For both data sets the regularisation method performs well in comparison with earlier published results. Ideas on how the constraint on the kernels may be relaxed to allow fully adaptable kernels are presented.
Resumo:
We have proposed a novel robust inversion-based neurocontroller that searches for the optimal control law by sampling from the estimated Gaussian distribution of the inverse plant model. However, for problems involving the prediction of continuous variables, a Gaussian model approximation provides only a very limited description of the properties of the inverse model. This is usually the case for problems in which the mapping to be learned is multi-valued or involves hysteritic transfer characteristics. This often arises in the solution of inverse plant models. In order to obtain a complete description of the inverse model, a more general multicomponent distributions must be modeled. In this paper we test whether our proposed sampling approach can be used when considering an arbitrary conditional probability distributions. These arbitrary distributions will be modeled by a mixture density network. Importance sampling provides a structured and principled approach to constrain the complexity of the search space for the ideal control law. The effectiveness of the importance sampling from an arbitrary conditional probability distribution will be demonstrated using a simple single input single output static nonlinear system with hysteretic characteristics in the inverse plant model.
Resumo:
Mixture Density Networks are a principled method to model conditional probability density functions which are non-Gaussian. This is achieved by modelling the conditional distribution for each pattern with a Gaussian Mixture Model for which the parameters are generated by a neural network. This thesis presents a novel method to introduce regularisation in this context for the special case where the mean and variance of the spherical Gaussian Kernels in the mixtures are fixed to predetermined values. Guidelines for how these parameters can be initialised are given, and it is shown how to apply the evidence framework to mixture density networks to achieve regularisation. This also provides an objective stopping criteria that can replace the `early stopping' methods that have previously been used. If the neural network used is an RBF network with fixed centres this opens up new opportunities for improved initialisation of the network weights, which are exploited to start training relatively close to the optimum. The new method is demonstrated on two data sets. The first is a simple synthetic data set while the second is a real life data set, namely satellite scatterometer data used to infer the wind speed and wind direction near the ocean surface. For both data sets the regularisation method performs well in comparison with earlier published results. Ideas on how the constraint on the kernels may be relaxed to allow fully adaptable kernels are presented.
Resumo:
A multi-scale model of edge coding based on normalized Gaussian derivative filters successfully predicts perceived scale (blur) for a wide variety of edge profiles [Georgeson, M. A., May, K. A., Freeman, T. C. A., & Hesse, G. S. (in press). From filters to features: Scale-space analysis of edge and blur coding in human vision. Journal of Vision]. Our model spatially differentiates the luminance profile, half-wave rectifies the 1st derivative, and then differentiates twice more, to give the 3rd derivative of all regions with a positive gradient. This process is implemented by a set of Gaussian derivative filters with a range of scales. Peaks in the inverted normalized 3rd derivative across space and scale indicate the positions and scales of the edges. The edge contrast can be estimated from the height of the peak. The model provides a veridical estimate of the scale and contrast of edges that have a Gaussian integral profile. Therefore, since scale and contrast are independent stimulus parameters, the model predicts that the perceived value of either of these parameters should be unaffected by changes in the other. This prediction was found to be incorrect: reducing the contrast of an edge made it look sharper, and increasing its scale led to a decrease in the perceived contrast. Our model can account for these effects when the simple half-wave rectifier after the 1st derivative is replaced by a smoothed threshold function described by two parameters. For each subject, one pair of parameters provided a satisfactory fit to the data from all the experiments presented here and in the accompanying paper [May, K. A. & Georgeson, M. A. (2007). Added luminance ramp alters perceived edge blur and contrast: A critical test for derivative-based models of edge coding. Vision Research, 47, 1721-1731]. Thus, when we allow for the visual system's insensitivity to very shallow luminance gradients, our multi-scale model can be extended to edge coding over a wide range of contrasts and blurs. © 2007 Elsevier Ltd. All rights reserved.
Resumo:
We studied the visual mechanisms that encode edge blur in images. Our previous work suggested that the visual system spatially differentiates the luminance profile twice to create the `signature' of the edge, and then evaluates the spatial scale of this signature profile by applying Gaussian derivative templates of different sizes. The scale of the best-fitting template indicates the blur of the edge. In blur-matching experiments, a staircase procedure was used to adjust the blur of a comparison edge (40% contrast, 0.3 s duration) until it appeared to match the blur of test edges at different contrasts (5% - 40%) and blurs (6 - 32 min of arc). Results showed that lower-contrast edges looked progressively sharper. We also added a linear luminance gradient to blurred test edges. When the added gradient was of opposite polarity to the edge gradient, it made the edge look progressively sharper. Both effects can be explained quantitatively by the action of a half-wave rectifying nonlinearity that sits between the first and second (linear) differentiating stages. This rectifier was introduced to account for a range of other effects on perceived blur (Barbieri-Hesse and Georgeson, 2002 Perception 31 Supplement, 54), but it readily predicts the influence of the negative ramp. The effect of contrast arises because the rectifier has a threshold: it not only suppresses negative values but also small positive values. At low contrasts, more of the gradient profile falls below threshold and its effective spatial scale shrinks in size, leading to perceived sharpening.
Resumo:
Marr's work offered guidelines on how to investigate vision (the theory - algorithm - implementation distinction), as well as specific proposals on how vision is done. Many of the latter have inevitably been superseded, but the approach was inspirational and remains so. Marr saw the computational study of vision as tightly linked to psychophysics and neurophysiology, but the last twenty years have seen some weakening of that integration. Because feature detection is a key stage in early human vision, we have returned to basic questions about representation of edges at coarse and fine scales. We describe an explicit model in the spirit of the primal sketch, but tightly constrained by psychophysical data. Results from two tasks (location-marking and blur-matching) point strongly to the central role played by second-derivative operators, as proposed by Marr and Hildreth. Edge location and blur are evaluated by finding the location and scale of the Gaussian-derivative `template' that best matches the second-derivative profile (`signature') of the edge. The system is scale-invariant, and accurately predicts blur-matching data for a wide variety of 1-D and 2-D images. By finding the best-fitting scale, it implements a form of local scale selection and circumvents the knotty problem of integrating filter outputs across scales. [Supported by BBSRC and the Wellcome Trust]
Resumo:
Perception of Mach bands may be explained by spatial filtering ('lateral inhibition') that can be approximated by 2nd derivative computation, and several alternative models have been proposed. To distinguish between them, we used a novel set of ‘generalised Gaussian’ images, in which the sharp ramp-plateau junction of the Mach ramp was replaced by smoother transitions. The images ranged from a slightly blurred Mach ramp to a Gaussian edge and beyond, and also included a sine-wave edge. The probability of seeing Mach Bands increased with the (relative) sharpness of the junction, but was largely independent of absolute spatial scale. These data did not fit the predictions of MIRAGE, nor 2nd derivative computation at a single fine scale. In experiment 2, observers used a cursor to mark features on the same set of images. Data on perceived position of Mach bands did not support the local energy model. Perceived width of Mach bands was poorly explained by a single-scale edge detection model, despite its previous success with Mach edges (Wallis & Georgeson, 2009, Vision Research, 49, 1886-1893). A more successful model used separate (odd and even) scale-space filtering for edges and bars, local peak detection to find candidate features, and the MAX operator to compare odd- and even-filter response maps (Georgeson, VSS 2006, Journal of Vision 6(6), 191a). Mach bands are seen when there is a local peak in the even-filter (bar) response map, AND that peak value exceeds corresponding responses in the odd-filter (edge) maps.
Resumo:
Adapting to blurred or sharpened images alters perceived blur of a focused image (M. A. Webster, M. A. Georgeson, & S. M. Webster, 2002). We asked whether blur adaptation results in (a) renormalization of perceived focus or (b) a repulsion aftereffect. Images were checkerboards or 2-D Gaussian noise, whose amplitude spectra had (log-log) slopes from -2 (strongly blurred) to 0 (strongly sharpened). Observers adjusted the spectral slope of a comparison image to match different test slopes after adaptation to blurred or sharpened images. Results did not show repulsion effects but were consistent with some renormalization. Test blur levels at and near a blurred or sharpened adaptation level were matched by more focused slopes (closer to 1/f) but with little or no change in appearance after adaptation to focused (1/f) images. A model of contrast adaptation and blur coding by multiple-scale spatial filters predicts these blur aftereffects and those of Webster et al. (2002). A key proposal is that observers are pre-adapted to natural spectra, and blurred or sharpened spectra induce changes in the state of adaptation. The model illustrates how norms might be encoded and recalibrated in the visual system even when they are represented only implicitly by the distribution of responses across multiple channels.
Resumo:
Projection of a high-dimensional dataset onto a two-dimensional space is a useful tool to visualise structures and relationships in the dataset. However, a single two-dimensional visualisation may not display all the intrinsic structure. Therefore, hierarchical/multi-level visualisation methods have been used to extract more detailed understanding of the data. Here we propose a multi-level Gaussian process latent variable model (MLGPLVM). MLGPLVM works by segmenting data (with e.g. K-means, Gaussian mixture model or interactive clustering) in the visualisation space and then fitting a visualisation model to each subset. To measure the quality of multi-level visualisation (with respect to parent and child models), metrics such as trustworthiness, continuity, mean relative rank errors, visualisation distance distortion and the negative log-likelihood per point are used. We evaluate the MLGPLVM approach on the ‘Oil Flow’ dataset and a dataset of protein electrostatic potentials for the ‘Major Histocompatibility Complex (MHC) class I’ of humans. In both cases, visual observation and the quantitative quality measures have shown better visualisation at lower levels.
Resumo:
Rotation invariance is important for an iris recognition system since changes of head orientation and binocular vergence may cause eye rotation. The conventional methods of iris recognition cannot achieve true rotation invariance. They only achieve approximate rotation invariance by rotating the feature vector before matching or unwrapping the iris ring at different initial angles. In these methods, the complexity of the method is increased, and when the rotation scale is beyond the certain scope, the error rates of these methods may substantially increase. In order to solve this problem, a new rotation invariant approach for iris feature extraction based on the non-separable wavelet is proposed in this paper. Firstly, a bank of non-separable orthogonal wavelet filters is used to capture characteristics of the iris. Secondly, a method of Markov random fields is used to capture rotation invariant iris feature. Finally, two-class kernel Fisher classifiers are adopted for classification. Experimental results on public iris databases show that the proposed approach has a low error rate and achieves true rotation invariance. © 2010.
Resumo:
In nonlinear and stochastic control problems, learning an efficient feed-forward controller is not amenable to conventional neurocontrol methods. For these approaches, estimating and then incorporating uncertainty in the controller and feed-forward models can produce more robust control results. Here, we introduce a novel inversion-based neurocontroller for solving control problems involving uncertain nonlinear systems which could also compensate for multi-valued systems. The approach uses recent developments in neural networks, especially in the context of modelling statistical distributions, which are applied to forward and inverse plant models. Provided that certain conditions are met, an estimate of the intrinsic uncertainty for the outputs of neural networks can be obtained using the statistical properties of networks. More generally, multicomponent distributions can be modelled by the mixture density network. Based on importance sampling from these distributions a novel robust inverse control approach is obtained. This importance sampling provides a structured and principled approach to constrain the complexity of the search space for the ideal control law. The developed methodology circumvents the dynamic programming problem by using the predicted neural network uncertainty to localise the possible control solutions to consider. A nonlinear multi-variable system with different delays between the input-output pairs is used to demonstrate the successful application of the developed control algorithm. The proposed method is suitable for redundant control systems and allows us to model strongly non-Gaussian distributions of control signal as well as processes with hysteresis. © 2004 Elsevier Ltd. All rights reserved.
Resumo:
Heterogeneous datasets arise naturally in most applications due to the use of a variety of sensors and measuring platforms. Such datasets can be heterogeneous in terms of the error characteristics and sensor models. Treating such data is most naturally accomplished using a Bayesian or model-based geostatistical approach; however, such methods generally scale rather badly with the size of dataset, and require computationally expensive Monte Carlo based inference. Recently within the machine learning and spatial statistics communities many papers have explored the potential of reduced rank representations of the covariance matrix, often referred to as projected or fixed rank approaches. In such methods the covariance function of the posterior process is represented by a reduced rank approximation which is chosen such that there is minimal information loss. In this paper a sequential Bayesian framework for inference in such projected processes is presented. The observations are considered one at a time which avoids the need for high dimensional integrals typically required in a Bayesian approach. A C++ library, gptk, which is part of the INTAMAP web service, is introduced which implements projected, sequential estimation and adds several novel features. In particular the library includes the ability to use a generic observation operator, or sensor model, to permit data fusion. It is also possible to cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the covariance parameters is explored, including the impact of the projected process approximation on likelihood profiles. We illustrate the projected sequential method in application to synthetic and real datasets. Limitations and extensions are discussed. © 2010 Elsevier Ltd.
Resumo:
Some color centers in diamond can serve as quantum bits which can be manipulated with microwave pulses and read out with laser, even at room temperature. However, the photon collection efficiency of bulk diamond is greatly reduced by refraction at the diamond/air interface. To address this issue, we fabricated arrays of diamond nanostructures, differing in both diameter and top end shape, with HSQ and Cr as the etching mask materials, aiming toward large scale fabrication of single-photon sources with enhanced collection efficiency made of nitrogen vacancy (NV) embedded diamond. With a mixture of O2 and CHF3 gas plasma, diamond pillars with diameters down to 45 nm were obtained. The top end shape evolution has been represented with a simple model. The tests of size dependent single-photon properties confirmed an improved single-photon collection efficiency enhancement, larger than tenfold, and a mild decrease of decoherence time with decreasing pillar diameter was observed as expected. These results provide useful information for future applications of nanostructured diamond as a single-photon source.