13 results for Scale invariant feature transform (SIFT)

in Aston University Research Archive


Relevance: 100.00%

Abstract:

Marr's work offered guidelines on how to investigate vision (the theory - algorithm - implementation distinction), as well as specific proposals on how vision is done. Many of the latter have inevitably been superseded, but the approach was inspirational and remains so. Marr saw the computational study of vision as tightly linked to psychophysics and neurophysiology, but the last twenty years have seen some weakening of that integration. Because feature detection is a key stage in early human vision, we have returned to basic questions about representation of edges at coarse and fine scales. We describe an explicit model in the spirit of the primal sketch, but tightly constrained by psychophysical data. Results from two tasks (location-marking and blur-matching) point strongly to the central role played by second-derivative operators, as proposed by Marr and Hildreth. Edge location and blur are evaluated by finding the location and scale of the Gaussian-derivative 'template' that best matches the second-derivative profile ('signature') of the edge. The system is scale-invariant, and accurately predicts blur-matching data for a wide variety of 1-D and 2-D images. By finding the best-fitting scale, it implements a form of local scale selection and circumvents the knotty problem of integrating filter outputs across scales. [Supported by BBSRC and the Wellcome Trust]
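
The template-matching idea in this abstract can be illustrated with a minimal 1-D sketch. It is not the authors' model: the function names, the exhaustive grid search, and the use of a plain numerical second derivative (with no preceding filter stages) are assumptions made only for illustration. The sketch takes the second-derivative profile of a Gaussian-blurred edge and looks for the Gaussian-derivative template, over candidate locations and scales, that correlates best with it.

```python
import math
import numpy as np

def blurred_edge(x, location, blur):
    """Step edge blurred by a Gaussian of standard deviation `blur`."""
    return np.array([
        0.5 * (1.0 + math.erf((xi - location) / (blur * math.sqrt(2.0))))
        for xi in x
    ])

def gaussian_derivative_template(x, location, scale):
    """First derivative of a Gaussian (sign flipped), scaled to unit energy."""
    t = -(x - location) * np.exp(-((x - location) ** 2) / (2.0 * scale ** 2))
    return t / np.linalg.norm(t)

def best_template(x, luminance, scales):
    """Return (location, scale) of the best-matching template.

    The 'signature' is the numerical second derivative of the luminance
    profile. Both signature and templates have unit energy, so the dot
    product is a normalized correlation that is largest when the template
    has the same position and width as the signature.
    """
    dx = x[1] - x[0]
    signature = np.gradient(np.gradient(luminance, dx), dx)
    signature = signature / np.linalg.norm(signature)
    best_match, best_location, best_scale = -np.inf, None, None
    for scale in scales:            # hypothetical exhaustive search grid
        for location in x:
            template = gaussian_derivative_template(x, location, scale)
            match = float(np.dot(signature, template))
            if match > best_match:
                best_match, best_location, best_scale = match, float(location), float(scale)
    return best_location, best_scale
```

Under these assumptions, calling `best_template` on an edge made with `blurred_edge` should recover the edge position as the best location and the edge blur as the best scale, up to the resolution of the search grid. Selecting the scale by best fit is what gives the sketch its local scale selection.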

Relevance: 100.00%

Abstract:

In human (D. H. Baker, T. S. Meese, & R. J. Summers, 2007b) and in cat (B. Li, M. R. Peterson, J. K. Thompson, T. Duong, & R. D. Freeman, 2005; F. Sengpiel & V. Vorobyov, 2005) there are at least two routes to cross-orientation suppression (XOS): a broadband, non-adaptable, monocular (within-eye) pathway and a more narrowband, adaptable interocular (between the eyes) pathway. We further characterized these two routes psychophysically by measuring the weight of suppression across spatio-temporal frequency for cross-oriented pairs of superimposed flickering Gabor patches. Masking functions were normalized to unmasked detection thresholds and fitted by a two-stage model of contrast gain control (T. S. Meese, M. A. Georgeson, & D. H. Baker, 2006) that was developed to accommodate XOS. The weight of monocular suppression was a power function of the scalar quantity ‘speed’ (temporal-frequency/spatial-frequency). This weight can be expressed as the ratio of non-oriented magno- and parvo-like mechanisms, permitting a fast-acting, early locus, as befits the urgency for action associated with high retinal speeds. In contrast, dichoptic-masking functions superimposed. Overall, this (i) provides further evidence for dissociation between the two forms of XOS in humans, and (ii) indicates that the monocular and interocular varieties of XOS are space/time scale-dependent and scale-invariant, respectively. This suggests an image-processing role for interocular XOS that is tailored to natural image statistics—very different from that of the scale-dependent (speed-dependent) monocular variety.

Relevance: 100.00%

Abstract:

Ernst Mach observed that light or dark bands could be seen at abrupt changes of luminance gradient in the absence of peaks or troughs in luminance. Many models of feature detection share the idea that bars, lines, and Mach bands are found at peaks and troughs in the output of even-symmetric spatial filters. Our experiments assessed the appearance of Mach bands (position and width) and the probability of seeing them on a novel set of generalized Gaussian edges. Mach band probability was mainly determined by the shape of the luminance profile and increased with the sharpness of its corners, controlled by a single parameter (n). Doubling or halving the size of the images had no significant effect. Variations in contrast (20%-80%) and duration (50-300 ms) had relatively minor effects. These results rule out the idea that Mach bands depend simply on the amplitude of the second derivative, but a multiscale model, based on Gaussian-smoothed first- and second-derivative filtering, can account accurately for the probability and perceived spatial layout of the bands. A key idea is that Mach band visibility depends on the ratio of second- to first-derivative responses at peaks in the second-derivative scale-space map. This ratio is approximately scale-invariant and increases with the sharpness of the corners of the luminance ramp, as observed. The edges of Mach bands pose a surprisingly difficult challenge for models of edge detection, but a nonlinear third-derivative operation is shown to predict the locations of Mach band edges strikingly well. Mach bands thus shed new light on the role of multiscale filtering systems in feature coding. © 2012 ARVO.

Relevance: 100.00%

Abstract:

The human visual system combines contrast information from the two eyes to produce a single cyclopean representation of the external world. This task requires both summation of congruent images and inhibition of incongruent images across the eyes. These processes were explored psychophysically using narrowband sinusoidal grating stimuli. Initial experiments focussed on binocular interactions within a single detecting mechanism, using contrast discrimination and contrast matching tasks. Consistent with previous findings, dichoptic presentation produced greater masking than monocular or binocular presentation. Four computational models were compared, two of which performed well on all data sets. Suppression between mechanisms was then investigated, using orthogonal and oblique stimuli. Two distinct suppressive pathways were identified, corresponding to monocular and dichoptic presentation. Both pathways impact prior to binocular summation of signals, and differ in their strengths, tuning, and response to adaptation, consistent with recent single-cell findings in cat. Strikingly, the magnitude of dichoptic masking was found to be spatiotemporally scale invariant, whereas monocular masking was dependent on stimulus speed. Interocular suppression was further explored using a novel manipulation, whereby stimuli were presented in dichoptic antiphase. Consistent with the predictions of a computational model, this produced weaker masking than in-phase presentation. This allowed the bandwidths of suppression to be measured without the complicating factor of additive combination of mask and test. Finally, contrast vision in strabismic amblyopia was investigated. Although amblyopes are generally believed to have impaired binocular vision, binocular summation was shown to be intact when stimuli were normalized for interocular sensitivity differences. 
An alternative account of amblyopia was developed, in which signals in the affected eye are subject to attenuation and additive noise prior to binocular combination.

Relevance: 100.00%

Abstract:

Over the last ten years our understanding of early spatial vision has improved enormously. The long-standing model of probability summation amongst multiple independent mechanisms with static output nonlinearities responsible for masking is obsolete. It has been replaced by a much more complex network of additive, suppressive, and facilitatory interactions and nonlinearities across eyes, area, spatial frequency, and orientation that extend well beyond the classical receptive field (CRF). A review of a substantial body of psychophysical work performed by ourselves (20 papers), and others, leads us to the following tentative account of the processing path for signal contrast. The first suppression stage is monocular, isotropic, non-adaptable, accelerates with RMS contrast, most potent for low spatial and high temporal frequencies, and extends slightly beyond the CRF. Second and third stages of suppression are difficult to disentangle but are possibly pre- and post-binocular summation, and involve components that are scale invariant, isotropic, anisotropic, chromatic, achromatic, adaptable, interocular, substantially larger than the CRF, and saturated by contrast. The monocular excitatory pathways begin with half-wave rectification, followed by a preliminary stage of half-binocular summation, a square-law transducer, full binocular summation, pooling over phase, cross-mechanism facilitatory interactions, additive noise, linear summation over area, and a slightly uncertain decision-maker. The purpose of each of these interactions is far from clear, but the system benefits from area and binocular summation of weak contrast signals as well as area and ocularity invariances above threshold (a herd of zebras doesn't change its contrast when it increases in number or when you close one eye). One of many remaining challenges is to determine the stage or stages of spatial tuning in the excitatory pathway.

Relevance: 100.00%

Abstract:

Over the full visual field, contrast sensitivity is fairly well described by a linear decline in log sensitivity as a function of eccentricity (expressed in grating cycles). However, many psychophysical studies of spatial visual function concentrate on the central ±4.5 deg (or so) of the visual field. As the details of the variation in sensitivity have not been well documented in this region we did so for small patches of target contrast at several spatial frequencies (0.7–4 c/deg), meridians (horizontal, vertical, and oblique), orientations (horizontal, vertical, and oblique), and eccentricities (0–18 cycles). To reduce the potential effects of stimulus uncertainty, circular markers surrounded the targets. Our analysis shows that the decline in binocular log sensitivity within the central visual field is bilinear: The initial decline is steep, whereas the later decline is shallow and much closer to the classical results. The bilinear decline was approximately symmetrical in the horizontal meridian and declined most steeply in the superior visual field. Further analyses showed our results to be scale-invariant and that this property could not be predicted from cone densities. We used the results from the cardinal meridians to radially interpolate an attenuation surface with the shape of a witch's hat that provided good predictions for the results from the oblique meridians. The witch's hat provides a convenient starting point from which to build models of contrast sensitivity, including those designed to investigate signal summation and neuronal convergence of the image contrast signal. Finally, we provide Matlab code for constructing the witch's hat.
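
As a rough illustration of the bilinear decline and the interpolated attenuation surface described in this abstract, the sketch below builds a toy "witch's hat" in Python. It is not the Matlab code the authors provide. The slopes, the knee eccentricity, the sign convention (positive `y` taken as the superior field), and the linear interpolation by polar angle between neighbouring cardinal meridians are all hypothetical placeholders chosen only to show the shape of the computation.

```python
import math

# HYPOTHETICAL parameters per cardinal meridian (degrees of polar angle):
# (steep slope, shallow slope, knee), slopes in log units per grating cycle.
# These are illustrative placeholders, not values from the study.
HYPOTHETICAL_PARAMS = {
    0:   (-0.06, -0.015, 4.0),   # horizontal meridian, one side
    90:  (-0.08, -0.020, 4.0),   # superior field (steepest decline)
    180: (-0.06, -0.015, 4.0),   # horizontal meridian, other side (symmetric)
    270: (-0.07, -0.018, 4.0),   # inferior field
}

def bilinear_decline(ecc, steep, shallow, knee):
    """Log-sensitivity change at eccentricity `ecc` (in cycles): a steep
    initial limb up to `knee`, then a shallower limb beyond it."""
    if ecc <= knee:
        return steep * ecc
    return steep * knee + shallow * (ecc - knee)

def witch_hat(x, y, params=HYPOTHETICAL_PARAMS):
    """Attenuation (log units) at position (x, y), both in grating cycles.

    The two neighbouring cardinal meridians are evaluated at the same
    eccentricity and blended linearly by polar angle (an assumed scheme).
    """
    ecc = math.hypot(x, y)
    theta = math.degrees(math.atan2(y, x)) % 360.0
    lower_raw = int(theta // 90) * 90      # may be 360 through rounding
    weight = (theta - lower_raw) / 90.0    # 0 at lower meridian, 1 at upper
    lower = lower_raw % 360
    upper = (lower + 90) % 360
    return ((1.0 - weight) * bilinear_decline(ecc, *params[lower])
            + weight * bilinear_decline(ecc, *params[upper]))
```

Expressing eccentricity in grating cycles rather than degrees is what lets a single surface serve several spatial frequencies, in keeping with the scale invariance reported in the abstract.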

Relevance: 100.00%

Abstract:

The processing conducted by the visual system requires the combination of signals that are detected at different locations in the visual field. The processes by which these signals are combined are explored here using psychophysical experiments and computer modelling. Most of the work presented in this thesis is concerned with the summation of contrast over space at detection threshold. Previous investigations of this sort have been confounded by the inhomogeneity in contrast sensitivity across the visual field. Experiments performed in this thesis find that the decline in log contrast sensitivity with eccentricity is bilinear, with an initial steep fall-off followed by a shallower decline. This decline is scale-invariant for spatial frequencies of 0.7 to 4 c/deg. A detailed map of the inhomogeneity is developed, and applied to area summation experiments both by incorporating it into models of the visual system and by using it to compensate stimuli in order to factor out the effects of the inhomogeneity. The results of these area summation experiments show that the summation of contrast over area is spatially extensive (occurring over 33 stimulus carrier cycles), and that summation behaviour is the same in the fovea, parafovea, and periphery. Summation occurs according to a fourth-root summation rule, consistent with a “noisy energy” model. This work is extended to investigate the visual deficit in amblyopia, finding that area summation is normal in amblyopic observers. Finally, the methods used to study the summation of threshold contrast over area are adapted to investigate the integration of coherent orientation signals in a texture. The results of this study are described by a two-stage model, with a mandatory local combination stage followed by flexible global pooling of these local outputs. In each study, the results suggest a more extensive combination of signals in vision than has been previously understood.
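
A fourth-root summation rule is often written as Minkowski pooling with an exponent of 4, and the short sketch below shows what that implies numerically. It does not reproduce the thesis's "noisy energy" model or its compensation for field inhomogeneity. The function names and the assumption of equally sensitive elements are illustrative only.

```python
import math

def minkowski_sum(sensitivities, exponent=4.0):
    """Pool element sensitivities with a Minkowski metric.

    With exponent 4 and n equally sensitive elements, the pooled
    sensitivity grows as n ** (1/4), i.e. a fourth-root rule.
    """
    return sum(s ** exponent for s in sensitivities) ** (1.0 / exponent)

def summation_gain_db(n_elements, exponent=4.0):
    """Sensitivity improvement (dB) of n equal elements over one element."""
    ratio = minkowski_sum([1.0] * n_elements, exponent) / minkowski_sum([1.0], exponent)
    return 20.0 * math.log10(ratio)
```

Under this rule, each doubling of the number of equally sensitive elements improves sensitivity by a factor of 2 ** 0.25, which is about 1.5 dB, so sixteen elements give a twofold improvement over one.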

Relevance: 50.00%

Abstract:

Rotation invariance is important for an iris recognition system, since changes of head orientation and binocular vergence may cause eye rotation. Conventional methods of iris recognition cannot achieve true rotation invariance. They achieve only approximate rotation invariance, either by rotating the feature vector before matching or by unwrapping the iris ring at different initial angles. These workarounds increase the complexity of the method, and when the rotation exceeds a certain range the error rates may increase substantially. To solve this problem, a new rotation-invariant approach to iris feature extraction based on the non-separable wavelet is proposed in this paper. Firstly, a bank of non-separable orthogonal wavelet filters is used to capture characteristics of the iris. Secondly, a method based on Markov random fields is used to capture rotation-invariant iris features. Finally, two-class kernel Fisher classifiers are adopted for classification. Experimental results on public iris databases show that the proposed approach has a low error rate and achieves true rotation invariance. © 2010.

Relevance: 30.00%

Abstract:

Influential models of edge detection have generally supposed that an edge is detected at peaks in the 1st derivative of the luminance profile, or at zero-crossings in the 2nd derivative. However, when presented with blurred triangle-wave images, observers consistently marked edges not at these locations, but at peaks in the 3rd derivative. This new phenomenon, termed ‘Mach edges’, persisted when a luminance ramp was added to the blurred triangle-wave. Modelling of these Mach edge detection data required the addition of a physiologically plausible filter, prior to the 3rd derivative computation. A viable alternative model was examined, on the basis of data obtained with short-duration, high spatial-frequency stimuli. Detection and feature-marking methods were used to examine the perception of Mach bands in an image set that spanned a range of Mach band detectabilities. A scale-space model that computed edge and bar features in parallel provided a better fit to the data than 4 competing models that combined information across scale in a different manner, or computed edge or bar features at a single scale. The perception of luminance bars was examined in 2 experiments. Data for one image-set suggested a simple rule for perception of a small Gaussian bar on a larger inverted Gaussian bar background. In previous research, discriminability (d′) has typically been reported to be a power function of contrast, where the exponent (p) is 2 to 3. However, using bar, grating, and Gaussian edge stimuli, with several methodologies, values of p were obtained that ranged from 1 to 1.7 across 6 experiments. This novel finding was explained by appealing to low stimulus uncertainty, or a near-linear transducer.

Relevance: 30.00%

Abstract:

Fluidized bed spray granulators (FBMG) are widely used in the process industry for particle size growth, a desirable feature in many products, such as granulated food and medical tablets. In this paper, the first in a series of four discussing the rate of various microscopic events occurring in FBMG, theoretical analysis coupled with CFD simulations has been used to predict granule–granule and droplet–granule collision time scales. The granule–granule collision time scale was derived from principles of the kinetic theory of granular flow (KTGF). For the droplet–granule collisions, two limiting models were derived: one for the case of fast droplet velocity, where the granule velocity is considerably lower than that of the droplet (ballistic model), and another for the case where the droplet is traveling with a velocity similar to that of the granules. The hydrodynamic parameters used in the solution of the above models were obtained from the CFD predictions for a typical spray fluidized bed system. The granule–granule collision rate within an identified spray zone was found to fall approximately within the range of 10⁻²–10⁻³ s, while the droplet–granule collision was found to be much faster, although it slowed rapidly (exponentially) when moving away from the spray nozzle tip. Such information, together with the time scale analysis of droplet solidification and spreading discussed in parts II and III of this study, is useful for probability analysis of the various events occurring during a granulation process, which then leads to better qualitative and, in part IV, quantitative prediction of the aggregation rate.

Relevance: 30.00%

Abstract:

For the treatment and monitoring of Parkinson's disease (PD) to be scientific, a key requirement is that measurement of disease stages and severity is quantitative, reliable, and repeatable. The last 50 years in PD research have been dominated by qualitative, subjective ratings obtained by human interpretation of the presentation of disease signs and symptoms at clinical visits. More recently, “wearable,” sensor-based, quantitative, objective, and easy-to-use systems for quantifying PD signs for large numbers of participants over extended durations have been developed. This technology has the potential to significantly improve both clinical diagnosis and management in PD and the conduct of clinical studies. However, the large-scale, high-dimensional character of the data captured by these wearable sensors requires sophisticated signal processing and machine-learning algorithms to transform it into scientifically and clinically meaningful information. Such algorithms that “learn” from data have shown remarkable success in making accurate predictions for complex problems in which human skill has been required to date, but they are challenging to evaluate and apply without a basic understanding of the underlying logic on which they are based. This article contains a nontechnical tutorial review of relevant machine-learning algorithms, also describing their limitations and how these can be overcome. It discusses implications of this technology and a practical road map for realizing the full potential of this technology in PD research and practice. © 2016 International Parkinson and Movement Disorder Society.

Relevance: 30.00%

Abstract:

Distributed representations (DR) of cortical channels are pervasive in models of spatio-temporal vision. A central idea that underpins current innovations of DR stems from the extension of 1-D phase into 2-D images. Neurophysiological evidence, however, provides tenuous support for a quadrature representation in the visual cortex, since even phase visual units are associated with broader orientation tuning than odd phase visual units (J. Neurophys., 88, 455–463, 2002). We demonstrate that the application of the steering theorems to a 2-D definition of phase afforded by the Riesz Transform (IEEE Trans. Sig. Proc., 49, 3136–3144), to include a Scale Transform, allows one to smoothly interpolate across 2-D phase and pass from circularly symmetric to orientation tuned visual units, and from more narrowly tuned odd symmetric units to even ones. Steering across 2-D phase and scale can be orthogonalized via a linearizing transformation. Using the tilt after-effect as an example, we argue that effects of visual adaptation can be better explained via an orthogonal rather than a channel-specific representation of visual units. This is because the orthogonal representation explicitly accounts for isotropic and cross-orientation adaptation effects, from which both direct and indirect tilt after-effects can be explained.

Relevance: 30.00%

Abstract:

Aircraft manufacturing industries are looking for solutions to increase their productivity. One solution is to apply metrology systems during the production and assembly processes. The Metrology Process Model (MPM) (Maropoulos et al, 2007) has been introduced, which emphasises metrology applications with assembly planning, manufacturing processes and product designing. Measurability analysis is part of the MPM, and its aim is to check the feasibility of measuring the designed large-scale components. Measurability Analysis has been integrated in order to provide an efficient matching system. The metrology database is structured by developing the Metrology Classification Model. Furthermore, the feature-based selection model is also explained. By combining two classification models, a novel approach and selection processes for an integrated measurability analysis system (MAS) are introduced; such an integrated MAS could provide much more meaningful matching results for the operators. © Springer-Verlag Berlin Heidelberg 2010.