856 resultados para hierarchical template matching
Resumo:
How do signals from the 2 eyes combine and interact? Our recent work has challenged earlier schemes in which monocular contrast signals are subject to square-law transduction followed by summation across eyes and binocular gain control. Much more successful was a new 'two-stage' model in which the initial transducer was almost linear and contrast gain control occurred both pre- and post-binocular summation. Here we extend that work by: (i) exploring the two-dimensional stimulus space (defined by left- and right-eye contrasts) more thoroughly, and (ii) performing contrast discrimination and contrast matching tasks for the same stimuli. Twenty-five base-stimuli made from 1 c/deg patches of horizontal grating, were defined by the factorial combination of 5 contrasts for the left eye (0.3-32%) with five contrasts for the right eye (0.3-32%). Other than in contrast, the gratings in the two eyes were identical. In a 2IFC discrimination task, the base-stimuli were masks (pedestals), where the contrast increment was presented to one eye only. In a matching task, the base-stimuli were standards to which observers matched the contrast of either a monocular or binocular test grating. In the model, discrimination depends on the local gradient of the observer's internal contrast-response function, while matching equates the magnitude (rather than gradient) of response to the test and standard. With all model parameters fixed by previous work, the two-stage model successfully predicted both the discrimination and the matching data and was much more successful than linear or quadratic binocular summation models. These results show that performance measures and perception (contrast discrimination and contrast matching) can be understood in the same theoretical framework for binocular contrast vision. © 2007 VSP.
Resumo:
The pattern of illumination on an undulating surface can be used to infer its 3-D form (shape from shading). But the recovery of shape would be invalid if the shading actually arose from reflectance variation. When a corrugated surface is painted with an albedo texture, the variation in local mean luminance (LM) due to shading is accompanied by a similar modulation in texture amplitude (AM). This is not so for reflectance variation, nor for roughly textured surfaces. We used a haptic matching technique to show that modulations of texture amplitude play a role in the interpretation of shape from shading. Observers were shown plaid stimuli comprising LM and AM combined in-phase (LM+AM) on one oblique and in anti-phase (LM-AM) on the other. Stimuli were presented via a modified ReachIN workstation allowing the co-registration of visual and haptic stimuli. In the first experiment, observers were asked to adjust the phase of a haptic surface, which had the same orientation as the LM+AM combination, until its peak in depth aligned with the visually perceived peak. The resulting alignments were consistent with the use of a lighting-from-above prior. In the second experiment, observers were asked to adjust the amplitude of the haptic surface to match that of the visually perceived surface. Observers chose relatively large amplitude settings when the haptic surface was oriented and phase-aligned with the LM+AM cue. When the haptic surface was aligned with the LM-AM cue, amplitude settings were close to zero. Thus the LM/AM phase relation is a significant visual depth cue, and is used to discriminate between shading and reflectance variations. [Supported by the Engineering and Physical Sciences Research Council, EPSRC].
Resumo:
We have shown previously that a template model for edge perception successfully predicts perceived blur for a variety of edge profiles (Georgeson, 2001 Journal of Vision 1 438a; Barbieri-Hesse and Georgeson, 2002 Perception 31 Supplement, 54). This study concerns the perceived contrast of edges. Our model spatially differentiates the luminance profile, half-wave rectifies this first derivative, and then differentiates again to create the edge's 'signature'. The spatial scale of the signature is evaluated by filtering it with a set of Gaussian derivative operators. This process finds the correlation between the signature and each operator kernel at each position. These kernels therefore act as templates, and the position and scale of the best-fitting template indicate the position and blur of the edge. Our previous finding, that reducing edge contrast reduces perceived blur, can be explained by replacing the half-wave rectifier with a smooth, biased rectifier function (May and Georgeson, 2003 Perception 32 388; May and Georgeson, 2003 Perception 32 Supplement, 46). With the half-wave rectifier, the peak template response R to a Gaussian edge with contrast C and scale s is given by: R=Cp-1/4s-3/2. Hence, edge contrast can be estimated from response magnitude and blur: C=Rp1/4s3/2. Use of this equation with the modified rectifier predicts that perceived contrast will decrease with increasing blur, particularly at low contrasts. Contrast-matching experiments supported this prediction. In addition, the model correctly predicts the perceived contrast of Gaussian edges modified either by spatial truncation or by the addition of a ramp.
Resumo:
We studied the visual mechanisms that encode edge blur in images. Our previous work suggested that the visual system spatially differentiates the luminance profile twice to create the `signature' of the edge, and then evaluates the spatial scale of this signature profile by applying Gaussian derivative templates of different sizes. The scale of the best-fitting template indicates the blur of the edge. In blur-matching experiments, a staircase procedure was used to adjust the blur of a comparison edge (40% contrast, 0.3 s duration) until it appeared to match the blur of test edges at different contrasts (5% - 40%) and blurs (6 - 32 min of arc). Results showed that lower-contrast edges looked progressively sharper. We also added a linear luminance gradient to blurred test edges. When the added gradient was of opposite polarity to the edge gradient, it made the edge look progressively sharper. Both effects can be explained quantitatively by the action of a half-wave rectifying nonlinearity that sits between the first and second (linear) differentiating stages. This rectifier was introduced to account for a range of other effects on perceived blur (Barbieri-Hesse and Georgeson, 2002 Perception 31 Supplement, 54), but it readily predicts the influence of the negative ramp. The effect of contrast arises because the rectifier has a threshold: it not only suppresses negative values but also small positive values. At low contrasts, more of the gradient profile falls below threshold and its effective spatial scale shrinks in size, leading to perceived sharpening.
Resumo:
We studied the visual mechanisms that encode edge blur in images. Our previous work suggested that the visual system spatially differentiates the luminance profile twice to create the 'signature' of the edge, and then evaluates the spatial scale of this signature profile by applying Gaussian derivative templates of different sizes. The scale of the best-fitting template indicates the blur of the edge. In blur-matching experiments, a staircase procedure was used to adjust the blur of a comparison edge (40% contrast, 0.3 s duration) until it appeared to match the blur of test edges at different contrasts (5% - 40%) and blurs (6 - 32 min of arc). Results showed that lower-contrast edges looked progressively sharper.We also added a linear luminance gradient to blurred test edges. When the added gradient was of opposite polarity to the edge gradient, it made the edge look progressively sharper. Both effects can be explained quantitatively by the action of a half-wave rectifying nonlinearity that sits between the first and second (linear) differentiating stages. This rectifier was introduced to account for a range of other effects on perceived blur (Barbieri-Hesse and Georgeson, 2002 Perception 31 Supplement, 54), but it readily predicts the influence of the negative ramp. The effect of contrast arises because the rectifier has a threshold: it not only suppresses negative values but also small positive values. At low contrasts, more of the gradient profile falls below threshold and its effective spatial scale shrinks in size, leading to perceived sharpening.
Resumo:
Marr's work offered guidelines on how to investigate vision (the theory - algorithm - implementation distinction), as well as specific proposals on how vision is done. Many of the latter have inevitably been superseded, but the approach was inspirational and remains so. Marr saw the computational study of vision as tightly linked to psychophysics and neurophysiology, but the last twenty years have seen some weakening of that integration. Because feature detection is a key stage in early human vision, we have returned to basic questions about representation of edges at coarse and fine scales. We describe an explicit model in the spirit of the primal sketch, but tightly constrained by psychophysical data. Results from two tasks (location-marking and blur-matching) point strongly to the central role played by second-derivative operators, as proposed by Marr and Hildreth. Edge location and blur are evaluated by finding the location and scale of the Gaussian-derivative `template' that best matches the second-derivative profile (`signature') of the edge. The system is scale-invariant, and accurately predicts blur-matching data for a wide variety of 1-D and 2-D images. By finding the best-fitting scale, it implements a form of local scale selection and circumvents the knotty problem of integrating filter outputs across scales. [Supported by BBSRC and the Wellcome Trust]
Resumo:
Edge blur is an important perceptual cue, but how does the visual system encode the degree of blur at edges? Blur could be measured by the width of the luminance gradient profile, peak ^ trough separation in the 2nd derivative profile, or the ratio of 1st-to-3rd derivative magnitudes. In template models, the system would store a set of templates of different sizes and find which one best fits the `signature' of the edge. The signature could be the luminance profile itself, or one of its spatial derivatives. I tested these possibilities in blur-matching experiments. In a 2AFC staircase procedure, observers adjusted the blur of Gaussian edges (30% contrast) to match the perceived blur of various non-Gaussian test edges. In experiment 1, test stimuli were mixtures of 2 Gaussian edges (eg 10 and 30 min of arc blur) at the same location, while in experiment 2, test stimuli were formed from a blurred edge sharpened to different extents by a compressive transformation. Predictions of the various models were tested against the blur-matching data, but only one model was strongly supported. This was the template model, in which the input signature is the 2nd derivative of the luminance profile, and the templates are applied to this signature at the zero-crossings. The templates are Gaussian derivative receptive fields that covary in width and length to form a self-similar set (ie same shape, different sizes). This naturally predicts that shorter edges should look sharper. As edge length gets shorter, responses of longer templates drop more than shorter ones, and so the response distribution shifts towards shorter (smaller) templates, signalling a sharper edge. The data confirmed this, including the scale-invariance implied by self-similarity, and a good fit was obtained from templates with a length-to-width ratio of about 1. The simultaneous analysis of edge blur and edge location may offer a new solution to the multiscale problem in edge detection.
Resumo:
G protein-coupled receptors (GPCRs) play important physiological roles transducing extracellular signals into intracellular responses. Approximately 50% of all marketed drugs target a GPCR. There remains considerable interest in effectively predicting the function of a GPCR from its primary sequence.
Resumo:
It has been argued that a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex data sets, and therefore a hierarchical visualization system is desirable. In this paper we extend an existing locally linear hierarchical visualization system PhiVis (Bishop98a) in several directions: 1. We allow for em non-linear projection manifolds. The basic building block is the Generative Topographic Mapping. 2. We introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree. General training equations are derived, regardless of the position of the model in the tree. 3. Using tools from differential geometry we derive expressions for local directionalcurvatures of the projection manifold. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. It enables the user to interactively highlight those data in the parent visualization plot which are captured by a child model.We also incorporate into our system a hierarchical, locally selective representation of magnification factors and directional curvatures of the projection manifolds. Such information is important for further refinement of the hierarchical visualization plot, as well as for controlling the amount of regularization imposed on the local models. We demonstrate the principle of the approach on a toy data set andapply our system to two more complex 12- and 19-dimensional data sets.
Resumo:
This report presents and evaluates a novel idea for scalable lossy colour image coding with Matching Pursuit (MP) performed in a transform domain. The benefits of the idea of MP performed in the transform domain are analysed in detail. The main contribution of this work is extending MP with wavelets to colour coding and proposing a coding method. We exploit correlations between image subbands after wavelet transformation in RGB colour space. Then, a new and simple quantisation and coding scheme of colour MP decomposition based on Run Length Encoding (RLE), inspired by the idea of coding indexes in relational databases, is applied. As a final coding step arithmetic coding is used assuming uniform distributions of MP atom parameters. The target application is compression at low and medium bit-rates. Coding performance is compared to JPEG 2000 showing the potential to outperform the latter with more sophisticated than uniform data models for arithmetic coder. The results are presented for grayscale and colour coding of 12 standard test images.
Resumo:
Digital image processing is exploited in many diverse applications but the size of digital images places excessive demands on current storage and transmission technology. Image data compression is required to permit further use of digital image processing. Conventional image compression techniques based on statistical analysis have reached a saturation level so it is necessary to explore more radical methods. This thesis is concerned with novel methods, based on the use of fractals, for achieving significant compression of image data within reasonable processing time without introducing excessive distortion. Images are modelled as fractal data and this model is exploited directly by compression schemes. The validity of this is demonstrated by showing that the fractal complexity measure of fractal dimension is an excellent predictor of image compressibility. A method of fractal waveform coding is developed which has low computational demands and performs better than conventional waveform coding methods such as PCM and DPCM. Fractal techniques based on the use of space-filling curves are developed as a mechanism for hierarchical application of conventional techniques. Two particular applications are highlighted: the re-ordering of data during image scanning and the mapping of multi-dimensional data to one dimension. It is shown that there are many possible space-filling curves which may be used to scan images and that selection of an optimum curve leads to significantly improved data compression. The multi-dimensional mapping property of space-filling curves is used to speed up substantially the lookup process in vector quantisation. Iterated function systems are compared with vector quantisers and the computational complexity or iterated function system encoding is also reduced by using the efficient matching algcnithms identified for vector quantisers.
Resumo:
We present a new form of contrast masking in which the target is a patch of low spatial frequency grating (0.46 c/deg) and the mask is a dark thin ring that surrounds the centre of the target patch. In matching and detection experiments we found little or no effect for binocular presentation of mask and test stimuli. But when mask and test were presented briefly (33 or 200 ms) to different eyes (dichoptic presentation), masking was substantial. In a 'half-binocular' condition the test stimulus was presented to one eye, but the mask stimulus was presented to both eyes with zero-disparity. This produced masking effects intermediate to those found in dichoptic and full-binocular conditions. We suggest that interocular feature matching can attenuate the potency of interocular suppression, but unlike in previous work (McKee, S. P., Bravo, M. J., Taylor, D. G., & Legge, G. E. (1994) Stereo matching precedes dichoptic masking. Vision Research, 34, 1047) we do not invoke a special role for depth perception. © 2004 Elsevier Ltd. All rights reserved.
Resumo:
Keyword identification in one of two simultaneous sentences is improved when the sentences differ in F0, particularly when they are almost continuously voiced. Sentences of this kind were recorded, monotonised using PSOLA, and re-synthesised to give a range of harmonic ?F0s (0, 1, 3, and 10 semitones). They were additionally re-synthesised by LPC with the LPC residual frequency shifted by 25% of F0, to give excitation with inharmonic but regularly spaced components. Perceptual identification of frequency-shifted sentences showed a similar large improvement with nominal ?F0 as seen for harmonic sentences, although overall performance was about 10% poorer. We compared performance with that of two autocorrelation-based computational models comprising four stages: (i) peripheral frequency selectivity and half-wave rectification; (ii) within-channel periodicity extraction; (iii) identification of the two major peaks in the summary autocorrelation function (SACF); (iv) a template-based approach to speech recognition using dynamic time warping. One model sampled the correlogram at the target-F0 period and performed spectral matching; the other deselected channels dominated by the interferer and performed matching on the short-lag portion of the residual SACF. Both models reproduced the monotonic increase observed in human performance with increasing ?F0 for the harmonic stimuli, but not for the frequency-shifted stimuli. A revised version of the spectral-matching model, which groups patterns of periodicity that lie on a curve in the frequency-delay plane, showed a closer match to the perceptual data for frequency-shifted sentences. The results extend the range of phenomena originally attributed to harmonic processing to grouping by common spectral pattern.
Resumo:
Computer-Based Learning systems of one sort or another have been in existence for almost 20 years, but they have yet to achieve real credibility within Commerce, Industry or Education. A variety of reasons could be postulated for this, typically: - cost - complexity - inefficiency - inflexibility - tedium Obviously different systems deserve different levels and types of criticism, but it still remains true that Computer-Based Learning (CBL) is falling significantly short of its potential. Experience of a small, but highly successful CBL system within a large, geographically distributed industry (the National Coal Board) prompted an investigation into currently available packages, the original intention being to purchase the most suitable software and run it on existing computer hardware, alongside existing software systems. It became apparent that none of the available CBL packages were suitable, and a decision was taken to develop an in-house Computer-Assisted Instruction system according to the following criteria: - cheap to run; - easy to author course material; - easy to use; - requires no computing knowledge to use (as either an author or student) ; - efficient in the use of computer resources; - has a comprehensive range of facilities at all levels. This thesis describes the initial investigation, resultant observations and the design, development and implementation of the SCHOOL system. One of the principal characteristics c£ SCHOOL is that it uses a hierarchical database structure for the storage of course material - thereby providing inherently a great deal of the power, flexibility and efficiency originally required. Trials using the SCHOOL system on IBM 303X series equipment are also detailed, along with proposed and current development work on what is essentially an operational CBL system within a large-scale Industrial environment.