Digital Library

977 results for scale invariant

Wide-angle visual feature matching for outdoor localization

Relevance: 60.00%

Abstract:

Wide-angle images exhibit significant distortion for which existing scale-space detectors such as the scale-invariant feature transform (SIFT) are inappropriate. The required scale-space images for feature detection are correctly obtained through the convolution of the image, mapped to the sphere, with the spherical Gaussian. A new visual key-point detector, based on this principle, is developed and several computational approaches to the convolution are investigated in both the spatial and frequency domains. In particular, a close approximation is developed that has comparable computation time to conventional SIFT but with improved matching performance. Results are presented for monocular wide-angle outdoor image sequences obtained using fisheye and equiangular catadioptric cameras. We evaluate the overall matching performance (recall versus 1-precision) of these methods compared to conventional SIFT. We also demonstrate the use of the technique for variable frame-rate visual odometry and its application to place recognition.
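The abstract above reports matching performance as recall versus 1-precision. As a minimal sketch of how those two numbers are commonly computed, assuming the usual definitions (recall = correct matches / ground-truth correspondences, 1-precision = false matches / all reported matches), the following uses made-up index pairs. The function name and the toy data are illustrative assumptions, not taken from the paper.

```python
def recall_and_one_minus_precision(matches, ground_truth):
    """Return (recall, 1 - precision) for a set of putative keypoint matches.

    matches: iterable of (index_in_image_a, index_in_image_b) pairs reported by a matcher.
    ground_truth: iterable of pairs that are known to be true correspondences.
    """
    matches = set(matches)
    ground_truth = set(ground_truth)
    correct = len(matches & ground_truth)      # reported matches that are true
    false = len(matches) - correct             # reported matches that are wrong
    recall = correct / len(ground_truth) if ground_truth else 0.0
    one_minus_precision = false / len(matches) if matches else 0.0
    return recall, one_minus_precision


# Toy data (hypothetical): four true correspondences, three reported matches.
ground_truth = {(0, 0), (1, 1), (2, 2), (3, 3)}
matches = {(0, 0), (1, 1), (2, 5)}
recall, one_minus_precision = recall_and_one_minus_precision(matches, ground_truth)
```

Sweeping the matcher's acceptance threshold and plotting the resulting pairs would give a curve of the kind the abstract refers to.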

Wide-baseline keypoint detection and matching with wide-angle images for vision based localisation

Relevance: 60.00%

Abstract:

This thesis addresses the problem of detecting and describing the same scene points in different wide-angle images taken by the same camera at different viewpoints. This is a core competency of many vision-based localisation tasks including visual odometry and visual place recognition. Wide-angle cameras have a large field of view that can exceed a full hemisphere, and the images they produce contain severe radial distortion. When compared to traditional narrow field of view perspective cameras, more accurate estimates of camera egomotion can be found using the images obtained with wide-angle cameras. The ability to accurately estimate camera egomotion is a fundamental primitive of visual odometry, and this is one of the reasons for the increased popularity in the use of wide-angle cameras for this task. Their large field of view also enables them to capture images of the same regions in a scene taken at very different viewpoints, and this makes them suited for visual place recognition. However, the ability to estimate the camera egomotion and recognise the same scene in two different images is dependent on the ability to reliably detect and describe the same scene points, or ‘keypoints’, in the images. Most algorithms used for this purpose are designed almost exclusively for perspective images. Applying algorithms designed for perspective images directly to wide-angle images is problematic as no account is made for the image distortion. The primary contribution of this thesis is the development of two novel keypoint detectors, and a method of keypoint description, designed for wide-angle images. Both reformulate the Scale-Invariant Feature Transform (SIFT) as an image processing operation on the sphere. As the image captured by any central projection wide-angle camera can be mapped to the sphere, applying these variants to an image on the sphere enables keypoints to be detected in a manner that is invariant to image distortion.
Each of the variants is required to find the scale-space representation of an image on the sphere, and they differ in the approaches they use to do this. Extensive experiments using real and synthetically generated wide-angle images are used to validate the two new keypoint detectors and the method of keypoint description. The best of these two new keypoint detectors is applied to vision based localisation tasks including visual odometry and visual place recognition using outdoor wide-angle image sequences. As part of this work, the effect of keypoint coordinate selection on the accuracy of egomotion estimates using the Direct Linear Transform (DLT) is investigated, and a simple weighting scheme is proposed which attempts to account for the uncertainty of keypoint positions during detection. A word reliability metric is also developed for use within a visual ‘bag of words’ approach to place recognition.

A biologically inspired object spectral-texture descriptor and its application to vegetation classification in power-line corridors

Relevance: 60.00%

Abstract:

The use of appropriate features to represent an output class or object is critical for all classification problems. In this paper, we propose a biologically inspired object descriptor to represent the spectral-texture patterns of image-objects. The proposed feature descriptor is generated from the pulse spectral frequencies (PSF) of a pulse coupled neural network (PCNN), which is invariant to rotation, translation and small scale changes. The proposed method is first evaluated in rotation- and scale-invariant texture classification using the USC-SIPI texture database. It is further evaluated in an application of vegetation species classification in power line corridor monitoring using airborne multi-spectral aerial imagery. The results from the two experiments demonstrate that the PSF feature is effective at representing spectral-texture patterns of objects and that it shows better results than classic color histogram and texture features.

Pattern recognition using invariants defined from higher order spectra : 2-D image inputs

Relevance: 60.00%

Abstract:

A new algorithm for extracting features from images for object recognition is described. The algorithm uses higher order spectra to provide desirable invariance properties, to provide noise immunity, and to incorporate nonlinearity into the feature extraction procedure thereby allowing the use of simple classifiers. An image can be reduced to a set of 1D functions via the Radon transform, or alternatively, the Fourier transform of each 1D projection can be obtained from a radial slice of the 2D Fourier transform of the image according to the Fourier slice theorem. A triple product of Fourier coefficients, referred to as the deterministic bispectrum, is computed for each 1D function and is integrated along radial lines in bifrequency space. Phases of the integrated bispectra are shown to be translation- and scale-invariant. Rotation invariance is achieved by a regrouping of these invariants at a constant radius followed by a second stage of invariant extraction. Rotation invariance is thus converted to translation invariance in the second step. Results using synthetic and actual images show that isolated, compact clusters are formed in feature space. These clusters are linearly separable, indicating that the nonlinearity required in the mapping from the input space to the classification space is incorporated well into the feature extraction stage. The use of higher order spectra results in good noise immunity, as verified with synthetic and real images. Classification of images using the higher order spectra-based algorithm compares favorably to classification using the method of moment invariants.

Shape discrimination using invariants defined from higher-order spectra

Relevance: 60.00%

Abstract:

An approach to pattern recognition using invariant parameters based on higher-order spectra is presented. In particular, bispectral invariants are used to classify one-dimensional shapes. The bispectrum, which is translation invariant, is integrated along straight lines passing through the origin in bifrequency space. The phase of the integrated bispectrum is shown to be scale- and amplification-invariant. A minimal set of these invariants is selected as the feature vector for pattern classification. Pattern recognition using higher-order spectral invariants is fast, suited for parallel implementation, and works for signals corrupted by Gaussian noise. The classification technique is shown to distinguish two similar but different bolts given their one-dimensional profiles.
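The two abstracts above rely on the bispectrum, a triple product of Fourier coefficients, being translation invariant and on its phase being unaffected by amplification. The sketch below checks both properties numerically on a short made-up profile, assuming the discrete form B(k1, k2) = X(k1) X(k2) conj(X(k1 + k2)) with circular shifts. It does not reproduce the papers' integration along radial lines or their scale-invariance step; the signal values and function names are illustrative assumptions.

```python
import cmath


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bispectrum(x):
    """Deterministic bispectrum B[k1][k2] = X[k1] * X[k2] * conj(X[k1 + k2])."""
    X = dft(x)
    n = len(x)
    return [
        [X[k1] * X[k2] * X[(k1 + k2) % n].conjugate() for k2 in range(n)]
        for k1 in range(n)
    ]


# Hypothetical one-dimensional profile and a circularly shifted copy of it.
signal = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0, 0.0, 0.0]
shifted = signal[-3:] + signal[:-3]

B = bispectrum(signal)
B_shifted = bispectrum(shifted)

# Translation invariance: the linear phase terms cancel in the triple product.
max_diff = max(
    abs(a - b) for row_a, row_b in zip(B, B_shifted) for a, b in zip(row_a, row_b)
)

# Amplification by a positive gain scales B by gain**3 and leaves its phase unchanged.
B_amplified = bispectrum([2.5 * v for v in signal])
phase_change = cmath.phase(B_amplified[1][2] / B[1][2])
```

Both `max_diff` and `phase_change` should be zero up to floating-point rounding.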

Self-organized criticality in dynamics without branching

Relevance: 60.00%

Abstract:

We demonstrate the phenomenon of self-organized criticality (SOC) in a simple random walk model described by a random walk of a myopic ant, i.e., a walker who can see only nearest neighbors. The ant acts on the underlying lattice aiming at uniform digging, i.e., reduction of the height profile of the surface but is unaffected by the underlying lattice. In one, two, and three dimensions we have explored this model and have obtained power laws in the time intervals between consecutive events of "digging." Being a simple random walk, the power laws in space translate to power laws in time. We also study the finite size scaling of asymptotic scale invariant process as well as dynamic scaling in this system. This model differs qualitatively from the cascade models of SOC.

Visual completion in an illusory figure

Relevance: 60.00%

Abstract:

The visual systems of humans and animals represent physical reality in a modified way, depending on the specific demands that the species in question has for survival. The ability to perceive visual illusions is found in independently evolved visual systems, from honeybees to humans. In humans, the ability emerges early, at the age of four months. Thus the perception of illusion is likely to reflect visual processes of fundamental importance for object perception in natural vision. The experiments reported in this thesis employed various modifications of the Kanizsa triangle, a drawn configuration composed of three black disks with missing sectors on a white background. The sectors appear to form the tips of a triangle. The visual system completes the physically empty area between the disks, generally called inducers, giving the perception of an illusory triangle. The illusory triangle consists of an illusory surface bounded by illusory contours; the triangle appears brighter than the background and appears to lie above it. If the sectors are coloured, the colour fills the illusory area, a phenomenon known as neon colour spreading. We investigated spatial limitations on the perception of Kanizsa-type illusions and how other stimuli and viewing parameters affected these limitations. We also studied complex configurations (thick, bent, mobile and chromatic inducers) to determine whether illusions combining several attributes can be perceived. The results suggest that the visual system is highly effective in completing a percept. The perception of an illusory figure is spatially scale invariant when perceived at threshold. The processing time and the number of fixations modify the percept, making the perception of the illusion more probable in various viewing conditions. Furthermore, the fact that the illusion can be perceived when only one inducer is physically present at any given moment indicates the potential of single inducers.
Apparently, modelling illusory figure perception will require a combination of low-level, local processes and higher-level integrative processes. Our studies with stimuli combining several attributes relevant to object perception demonstrate that the perception of an illusory figure is flexible and is maintained also when it contains colour and volume and when shown in movement. All in all, the results confirm the assumed importance of the visual processes related to the perception of illusory figures in everyday viewing. This is indicated by the variety of inducer modifications that can be made without destroying the percept. Furthermore, the illusion can acquire additional attributes from such modifications. Due to individual differences in the perception of illusory figures, universal values for absolute performance are not always meaningful, but stable trends and general relations do exist.

Visual search and eye movements: Studies of perceptual span

Relevance: 60.00%

Abstract:

In visual search one tries to find the currently relevant item among other, irrelevant items. In the present study, visual search performance for complex objects (characters, faces, computer icons and words) was investigated, along with the contribution of different stimulus properties, such as luminance contrast between characters and background, set size, stimulus size, colour contrast, spatial frequency, and stimulus layout. Subjects were required to search for a target object among distracter objects in two-dimensional stimulus arrays. The outcome measure was threshold search time, that is, the presentation duration of the stimulus array required by the subject to find the target with a certain probability. It reflects the time used for visual processing separated from the time used for decision making and manual reactions. The duration of stimulus presentation was controlled by an adaptive staircase method. The number and duration of eye fixations, saccade amplitude, and perceptual span, i.e., the number of items that can be processed during a single fixation, were measured. It was found that search performance was correlated with the number of fixations needed to find the target. Search time and the number of fixations increased with increasing stimulus set size. On the other hand, several complex objects could be processed during a single fixation, i.e., within the perceptual span. Search time and the number of fixations depended on object type as well as luminance contrast. The size of the perceptual span was smaller for more complex objects, and decreased with decreasing luminance contrast within object type, especially for very low contrasts. In addition, the size and shape of perceptual span explained the changes in search performance for different stimulus layouts in word search.
Perceptual span was scale invariant for a 16-fold range of stimulus sizes, i.e., the number of items processed during a single fixation was independent of retinal stimulus size or viewing distance. It is suggested that saccadic visual search consists of both serial (eye movements) and parallel (processing within perceptual span) components, and that the size of the perceptual span may explain the effectiveness of saccadic search in different stimulus conditions. Further, low-level visual factors, such as the anatomical structure of the retina, peripheral stimulus visibility and resolution requirements for the identification of different object types are proposed to constrain the size of the perceptual span, and thus, limit visual search performance. Similar methods were used in a clinical study to characterise the visual search performance and eye movements of neurological patients with chronic solvent-induced encephalopathy (CSE). In addition, the data about the effects of different stimulus properties on visual search in normal subjects were presented as simple practical guidelines, so that the limits of human visual perception could be taken into account in the design of user interfaces.

Spatial correlation in grain misorientation distribution

Relevance: 60.00%

Abstract:

Grain misorientation was studied in relation to the nearest neighbor's mutual distance using electron back-scattered diffraction measurements. The misorientation correlation function was defined as the probability density for the occurrence of a certain misorientation between pairs of grains separated by a certain distance. Scale-invariant spatial correlation between neighbor grains was manifested by a power law dependence of the preferred misorientation vs. inter-granular distance in various materials after diverse strain paths. The obtained negative scaling exponents were in the range of -2 +/- 0.3 for high-angle grain boundaries. The exponent decreased in the presence of low-angle grain boundaries or dynamic recrystallization, indicating faster decay of correlations. The correlations vanished in annealed materials. The results were interpreted in terms of lattice incompatibility and continuity conditions at the interface between neighboring grains. Grain-size effects on texture development, as well as the implications of such spatial correlations on texture modeling, were discussed.
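The abstract above describes a power-law dependence of misorientation on inter-granular distance, summarized by a scaling exponent. As a hedged illustration of how such an exponent is commonly estimated, the sketch below fits a straight line in log-log coordinates by ordinary least squares. The data are synthetic, generated with an exponent of -2 only because that value lies in the range the abstract reports; they are not the paper's measurements, and the function name is an assumption.

```python
import math


def fit_power_law(distances, values):
    """Fit values ~ c * distances**alpha by least squares on log-log data.

    Returns (alpha, c). All inputs must be strictly positive.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                          # slope in log-log space
    c = math.exp(mean_y - alpha * mean_x)      # prefactor from the intercept
    return alpha, c


# Synthetic, noise-free example (hypothetical units).
distances = [1.0, 2.0, 4.0, 8.0, 16.0]
values = [3.0 * d ** -2.0 for d in distances]
alpha, c = fit_power_law(distances, values)
```

With noisy experimental data, the fitted slope would carry an uncertainty, which is presumably what a range such as -2 +/- 0.3 expresses.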

Non-Gaussian Cosmological Perturbations from Hybrid Inflation and Preheating

Relevance: 60.00%

Abstract:

This thesis consists of four research papers and an introduction providing some background. The structure in the universe is generally considered to originate from quantum fluctuations in the very early universe. The standard lore of cosmology states that the primordial perturbations are almost scale-invariant, adiabatic, and Gaussian. A snapshot of the structure from the time when the universe became transparent can be seen in the cosmic microwave background (CMB). For a long time mainly the power spectrum of the CMB temperature fluctuations has been used to obtain observational constraints, especially on deviations from scale-invariance and pure adiabaticity. Non-Gaussian perturbations provide a novel and very promising way to test theoretical predictions. They probe beyond the power spectrum, or two point correlator, since non-Gaussianity involves higher order statistics. The thesis concentrates on the non-Gaussian perturbations arising in several situations involving two scalar fields, namely, hybrid inflation and various forms of preheating. First we go through some basic concepts -- such as the cosmological inflation, reheating and preheating, and the role of scalar fields during inflation -- which are necessary for the understanding of the research papers. We also review the standard linear cosmological perturbation theory. The second order perturbation theory formalism for two scalar fields is developed. We explain what is meant by non-Gaussian perturbations, and discuss some difficulties in parametrisation and observation. In particular, we concentrate on the nonlinearity parameter. The prospects of observing non-Gaussianity are briefly discussed. We apply the formalism and calculate the evolution of the second order curvature perturbation during hybrid inflation. We estimate the amount of non-Gaussianity in the model and find that there is a possibility for an observational effect. The non-Gaussianity arising in preheating is also studied.
We find that the level produced by the simplest model of instant preheating is insignificant, whereas standard preheating with parametric resonance as well as tachyonic preheating are prone to easily saturate and even exceed the observational limits. We also mention other approaches to the study of primordial non-Gaussianities, which differ from the perturbation theory method chosen in the thesis work.

Vision based indoor positioning in a retail environment

Relevance: 60.00%

Abstract:

Modern smart phones often come with a significant amount of computational power and an integrated digital camera, making them an ideal platform for intelligent assistants. This work is restricted to retail environments, where users could be provided with, for example, navigational instructions to desired products or information about special offers within their close proximity. Applications of this kind usually require information about the user's current location in the domain environment, which in our case corresponds to a retail store. We propose a vision based positioning approach that recognizes products the user's mobile phone's camera is currently pointing at. The products are related to locations within the store, which enables us to locate the user by pointing the mobile phone's camera to a group of products. The first step of our method is to extract meaningful features from digital images. We use the Scale-Invariant Feature Transform (SIFT) algorithm, which extracts features that are highly distinctive in the sense that they can be correctly matched against a large database of features from many images. We collect a comprehensive set of images from all meaningful locations within our domain and extract the SIFT features from each of these images. As the SIFT features are of high dimensionality and thus comparing individual features is infeasible, we apply the Bags of Keypoints method, which creates a generic representation, a visual category, from all features extracted from images taken from a specific location. A category for an unseen image can be deduced by extracting the corresponding SIFT features and by choosing the category that best fits the extracted features. We have applied the proposed method within a Finnish supermarket. We consider grocery shelves as categories, which is a sufficient level of accuracy to help users navigate or to provide useful information about nearby products.
We achieve 40% accuracy, which is quite low for commercial applications but significantly outperforms the random-guess baseline. Our results suggest that the accuracy of the classification could be increased with a deeper analysis of the domain and by combining existing positioning methods with ours.
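The pipeline in the abstract above (extract descriptors, summarize each location as a category, pick the category that best fits a new image) can be sketched roughly as follows. This is a toy illustration under several assumptions: the descriptors are made-up 2-D points rather than 128-dimensional SIFT vectors, the codebook of visual words is given rather than learned by clustering, and the classifier is a simple nearest-histogram rule standing in for whatever classifier the authors used. All names, including the shelf labels, are hypothetical.

```python
def nearest_word(descriptor, codebook):
    """Index of the codebook entry (visual word) closest to the descriptor."""
    return min(
        range(len(codebook)),
        key=lambda w: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[w])),
    )


def histogram(descriptors, codebook):
    """Normalized histogram of visual-word occurrences for one set of descriptors."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def classify(descriptors, codebook, category_histograms):
    """Return the category whose histogram is closest to the query histogram."""
    h = histogram(descriptors, codebook)

    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(h, category_histograms[name]))

    return min(category_histograms, key=distance)


# Assumed codebook of three visual words and toy training descriptors per shelf.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
training = {
    "shelf_a": [(0.1, 0.0), (0.0, 0.1), (0.9, 0.1), (0.05, 0.05)],
    "shelf_b": [(0.1, 0.9), (0.0, 1.1), (1.0, 0.1), (0.1, 1.0)],
}
category_histograms = {name: histogram(d, codebook) for name, d in training.items()}

# Descriptors from an unseen image (hypothetical values).
query = [(0.0, 0.95), (0.1, 1.05), (0.95, 0.0)]
predicted = classify(query, codebook, category_histograms)
```

Quantizing descriptors into a small vocabulary is what makes comparison tractable: each image or location is reduced to one short histogram instead of many high-dimensional vectors.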

Chaotic and power law states in the Portevin-Le Chatelier effect

Relevance: 60.00%

Abstract:

Recent studies on the Portevin-Le Chatelier effect report an intriguing crossover phenomenon from a low-dimensional chaotic regime to an infinite-dimensional scale-invariant power law regime in experiments on CuAl single crystals and AlMg polycrystals, as a function of strain rate. We devise a fully dynamical model which reproduces these results. At low and medium strain rates, the model is chaotic, with the structure of the attractor resembling the reconstructed experimental attractor. At high strain rates, power law statistics for the magnitudes and durations of the stress drops emerge as in experiments and, concomitantly, the largest Lyapunov exponent is zero.

Efficient method of moving shadow detection and vehicle classification

Relevance: 60.00%

Abstract:

Moving shadow detection and removal from the extracted foreground regions of video frames aim to limit the risk of mistaking moving shadows for part of moving objects. This operation thus enhances the rate of accuracy in detection and classification of moving objects. With a similar reasoning, the present paper proposes an efficient method for the discrimination of moving object and moving shadow regions in a video sequence, with no human intervention. It also imposes less computational burden and works effectively under dynamic traffic road conditions on highways (with and without marking lines) and street ways (with and without marking lines). Further, we have used scale-invariant feature transform-based features for the classification of moving vehicles (with and without shadow regions), which enhances the effectiveness of the proposed method. The potentiality of the method is tested with various data sets collected from different road traffic scenarios, and its superiority is compared with the existing methods. (C) 2013 Elsevier GmbH. All rights reserved.

Thermodynamical versus log-Poisson distribution in turbulence

Relevance: 60.00%

Abstract:

The thermodynamical model of intermittency in fully developed turbulence due to Castaing (B. Castaing, J. Phys. II France 6 (1996) 105) is investigated and compared with the log-Poisson model (Z.-S. She, E. Leveque, Phys. Rev. Lett. 72 (1994) 336). It is shown that the thermodynamical model obeys general scaling laws and corresponds to the degenerate class of scale-invariant statistics. We also find that its structure function shapes have physical behaviors similar to those of the log-Poisson model. The only difference between them lies in the convergence of the log-Poisson structure functions and the divergence of the thermodynamical ones. As far as the comparison with experiments on intermittency is concerned, they are indifferent.

Vision-based global localization using a visual vocabulary

Relevance: 60.00%

Abstract:

This paper presents a novel coarse-to-fine global localization approach that is inspired by object recognition and text retrieval techniques. Harris-Laplace interest points characterized by SIFT descriptors are used as natural landmarks. These descriptors are indexed into two databases: an inverted index and a location database. The inverted index is built based on a visual vocabulary learned from the feature descriptors. In the location database, each location is directly represented by a set of scale invariant descriptors. The localization process consists of two stages: coarse localization and fine localization. Coarse localization from the inverted index is fast but not accurate enough, whereas localization from the location database using a voting algorithm is relatively slow but more accurate. The combination of coarse and fine stages makes fast and reliable localization possible. In addition, if necessary, the localization result can be verified by epipolar geometry between the representative view in the database and the view to be localized. Experimental results show that our approach is efficient and reliable. ©2005 IEEE.
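The two-stage scheme in the abstract above can be illustrated with a small sketch: an inverted index from visual words to locations gives a fast shortlist, and descriptor-level voting over the shortlist picks the final location. The data structures, the distance threshold, the `top_k` parameter, and the location names below are assumptions made for illustration; the paper's actual scoring and the epipolar verification step are not reproduced.

```python
from collections import Counter, defaultdict


def build_inverted_index(location_words):
    """Map each visual word to the set of locations in which it occurs."""
    index = defaultdict(set)
    for location, words in location_words.items():
        for word in words:
            index[word].add(location)
    return index


def coarse_candidates(query_words, index, top_k=2):
    """Coarse stage: rank locations by how many query words they share."""
    votes = Counter()
    for word in set(query_words):
        for location in index.get(word, ()):
            votes[location] += 1
    return [location for location, _ in votes.most_common(top_k)]


def fine_localize(query_descriptors, candidates, location_descriptors, max_dist=0.1):
    """Fine stage: each query descriptor votes for the candidate holding its nearest match."""
    votes = Counter()
    for q in query_descriptors:
        best_location, best_dist = None, max_dist
        for location in candidates:
            for d in location_descriptors[location]:
                dist = sum((a - b) ** 2 for a, b in zip(q, d)) ** 0.5
                if dist < best_dist:
                    best_location, best_dist = location, dist
        if best_location is not None:
            votes[best_location] += 1
    return votes.most_common(1)[0][0] if votes else None


# Toy databases (hypothetical): word ids per location and 2-D stand-in descriptors.
location_words = {"lobby": [1, 2, 3], "corridor": [2, 3, 4], "lab": [7, 8, 9]}
location_descriptors = {
    "lobby": [(0.0, 0.0), (1.0, 0.0)],
    "corridor": [(0.0, 1.0), (1.0, 1.0)],
    "lab": [(5.0, 5.0)],
}

index = build_inverted_index(location_words)
candidates = coarse_candidates([2, 3, 4], index)
location = fine_localize([(0.02, 0.98), (0.97, 1.01)], candidates, location_descriptors)
```

The coarse stage only touches the index entries for the query's words, which is why it is cheap; the slower descriptor comparison is then limited to the shortlist.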
