885 resultados para SIFT,Computer Vision,Python,Object Recognition,Feature Detection,Descriptor Computation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

A new algorithm for extracting features from images for object recognition is described. The algorithm uses higher order spectra to provide desirable invariance properties, to provide noise immunity, and to incorporate nonlinearity into the feature extraction procedure thereby allowing the use of simple classifiers. An image can be reduced to a set of 1D functions via the Radon transform, or alternatively, the Fourier transform of each 1D projection can be obtained from a radial slice of the 2D Fourier transform of the image according to the Fourier slice theorem. A triple product of Fourier coefficients, referred to as the deterministic bispectrum, is computed for each 1D function and is integrated along radial lines in bifrequency space. Phases of the integrated bispectra are shown to be translation- and scale-invariant. Rotation invariance is achieved by a regrouping of these invariants at a constant radius followed by a second stage of invariant extraction. Rotation invariance is thus converted to translation invariance in the second step. Results using synthetic and actual images show that isolated, compact clusters are formed in feature space. These clusters are linearly separable, indicating that the nonlinearity required in the mapping from the input space to the classification space is incorporated well into the feature extraction stage. The use of higher order spectra results in good noise immunity, as verified with synthetic and real images. Classification of images using the higher order spectra-based algorithm compares favorably to classification using the method of moment invariants

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we propose a new method for face recognition using fractal codes. Fractal codes represent local contractive, affine transformations which when iteratively applied to range-domain pairs in an arbitrary initial image result in a fixed point close to a given image. The transformation parameters such as brightness offset, contrast factor, orientation and the address of the corresponding domain for each range are used directly as features in our method. Features of an unknown face image are compared with those pre-computed for images in a database. There is no need to iterate, use fractal neighbor distances or fractal dimensions for comparison in the proposed method. This method is robust to scale change, frame size change and rotations as well as to some noise, facial expressions and blur distortion in the image

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Gait energy images (GEIs) and its variants form the basis of many recent appearance-based gait recognition systems. The GEI combines good recognition performance with a simple implementation, though it suffers problems inherent to appearance-based approaches, such as being highly view dependent. In this paper, we extend the concept of the GEI to 3D, to create what we call the gait energy volume, or GEV. A basic GEV implementation is tested on the CMU MoBo database, showing improvements over both the GEI baseline and a fused multi-view GEI approach. We also demonstrate the efficacy of this approach on partial volume reconstructions created from frontal depth images, which can be more practically acquired, for example, in biometric portals implemented with stereo cameras, or other depth acquisition systems. Experiments on frontal depth images are evaluated on an in-house developed database captured using the Microsoft Kinect, and demonstrate the validity of the proposed approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A new approach to pattern recognition using invariant parameters based on higher order spectra is presented. In particular, invariant parameters derived from the bispectrum are used to classify one-dimensional shapes. The bispectrum, which is translation invariant, is integrated along straight lines passing through the origin in bifrequency space. The phase of the integrated bispectrum is shown to be scale and amplification invariant, as well. A minimal set of these invariants is selected as the feature vector for pattern classification, and a minimum distance classifier using a statistical distance measure is used to classify test patterns. The classification technique is shown to distinguish two similar, but different bolts given their one-dimensional profiles. Pattern recognition using higher order spectral invariants is fast, suited for parallel implementation, and has high immunity to additive Gaussian noise. Simulation results show very high classification accuracy, even for low signal-to-noise ratios.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This article presents a visual servoing system to follow a 3D moving object by a Micro Unmanned Aerial Vehicle (MUAV). The presented control strategy is based only on the visual information given by an adaptive tracking method based on the colour information. A visual fuzzy system has been developed for servoing the camera situated on a rotary wing MAUV, that also considers its own dynamics. This system is focused on continuously following of an aerial moving target object, maintaining it with a fixed safe distance and centred on the image plane. The algorithm is validated on real flights on outdoors scenarios, showing the robustness of the proposed systems against winds perturbations, illumination and weather changes among others. The obtained results indicate that the proposed algorithms is suitable for complex controls task, such object following and pursuit, flying in formation, as well as their use for indoor navigation

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The conventional manual power line corridor inspection processes that are used by most energy utilities are labor-intensive, time consuming and expensive. Remote sensing technologies represent an attractive and cost-effective alternative approach to these monitoring activities. This paper presents a comprehensive investigation into automated remote sensing based power line corridor monitoring, focusing on recent innovations in the area of increased automation of fixed-wing platforms for aerial data collection, and automated data processing for object recognition using a feature fusion process. Airborne automation is achieved by using a novel approach that provides improved lateral control for tracking corridors and automatic real-time dynamic turning for flying between corridor segments, we call this approach PTAGS. Improved object recognition is achieved by fusing information from multi-sensor (LiDAR and imagery) data and multiple visual feature descriptors (color and texture). The results from our experiments and field survey illustrate the effectiveness of the proposed aircraft control and feature fusion approaches.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thermal-infrared images have superior statistical properties compared with visible-spectrum images in many low-light or no-light scenarios. However, a detailed understanding of feature detector performance in the thermal modality lags behind that of the visible modality. To address this, the first comprehensive study on feature detector performance on thermal-infrared images is conducted. A dataset is presented which explores a total of ten different environments with a range of statistical properties. An investigation is conducted into the effects of several digital and physical image transformations on detector repeatability in these environments. The effect of non-uniformity noise, unique to the thermal modality, is analyzed. The accumulation of sensor non-uniformities beyond the minimum possible level was found to have only a small negative effect. A limiting of feature counts was found to improve the repeatability performance of several detectors. Most other image transformations had predictable effects on feature stability. The best-performing detector varied considerably depending on the nature of the scene and the test.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose an approach to employ eigen light-fields for face recognition across pose on video. Faces of a subject are collected from video frames and combined based on the pose to obtain a set of probe light-fields. These probe data are then projected to the principal subspace of the eigen light-fields within which the classification takes place. We modify the original light-field projection and found that it is more robust in the proposed system. Evaluation on VidTIMIT dataset has demonstrated that the eigen light-fields method is able to take advantage of multiple observations contained in the video.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Toolbox, combined with MATLAB ® and a modern workstation computer, is a useful and convenient environment for investigation of machine vision algorithms. For modest image sizes the processing rate can be sufficiently ``real-time'' to allow for closed-loop control. Focus of attention methods such as dynamic windowing (not provided) can be used to increase the processing rate. With input from a firewire or web camera (support provided) and output to a robot (not provided) it would be possible to implement a visual servo system entirely in MATLAB. Provides many functions that are useful in machine vision and vision-based control. Useful for photometry, photogrammetry, colorimetry. It includes over 100 functions spanning operations such as image file reading and writing, acquisition, display, filtering, blob, point and line feature extraction, mathematical morphology, homographies, visual Jacobians, camera calibration and color space conversion.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In a commercial environment, it is advantageous to know how long it takes customers to move between different regions, how long they spend in each region, and where they are likely to go as they move from one location to another. Presently, these measures can only be determined manually, or through the use of hardware tags (i.e. RFID). Soft biometrics are characteristics that can be used to describe, but not uniquely identify an individual. They include traits such as height, weight, gender, hair, skin and clothing colour. Unlike traditional biometrics, soft biometrics can be acquired by surveillance cameras at range without any user cooperation. While these traits cannot provide robust authentication, they can be used to provide identification at long range, and aid in object tracking and detection in disjoint camera networks. In this chapter we propose using colour, height and luggage soft biometrics to determine operational statistics relating to how people move through a space. A novel average soft biometric is used to locate people who look distinct, and these people are then detected at various locations within a disjoint camera network to gradually obtain operational statistics

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thermal-infrared imagery is relatively robust to many of the failure conditions of visual and laser-based SLAM systems, such as fog, dust and smoke. The ability to use thermal-infrared video for localization is therefore highly appealing for many applications. However, operating in thermal-infrared is beyond the capacity of existing SLAM implementations. This paper presents the first known monocular SLAM system designed and tested for hand-held use in the thermal-infrared modality. The implementation includes a flexible feature detection layer able to achieve robust feature tracking in high-noise, low-texture thermal images. A novel approach for structure initialization is also presented. The system is robust to irregular motion and capable of handling the unique mechanical shutter interruptions common to thermal-infrared cameras. The evaluation demonstrates promising performance of the algorithm in several environments.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The increasingly widespread use of large-scale 3D virtual environments has translated into an increasing effort required from designers, developers and testers. While considerable research has been conducted into assisting the design of virtual world content and mechanics, to date, only limited contributions have been made regarding the automatic testing of the underpinning graphics software and hardware. In the work presented in this paper, two novel neural network-based approaches are presented to predict the correct visualization of 3D content. Multilayer perceptrons and self-organizing maps are trained to learn the normal geometric and color appearance of objects from validated frames and then used to detect novel or anomalous renderings in new images. Our approach is general, for the appearance of the object is learned rather than explicitly represented. Experiments were conducted on a game engine to determine the applicability and effectiveness of our algorithms. The results show that the neural network technology can be effectively used to address the problem of automatic and reliable visual testing of 3D virtual environments.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we present a fast power line detection and localisation algorithm as well as propose a high-level guidance architecture for active vision-based Unmanned Aerial Vehicle (UAV) guidance. The detection stage is based on steerable filters for edge ridge detection, followed by a line fitting algorithm to refine candidate power lines in images. The guidance architecture assumes an UAV with an onboard Gimbal camera. We first control the position of the Gimbal such that the power line is in the field of view of the camera. Then its pose is used to generate the appropriate control commands such that the aircraft moves and flies above the lines. We present initial experimental results for the detection stage which shows that the proposed algorithm outperforms two state-of-the-art line detection algorithms for power line detection from aerial imagery.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The increasing popularity of video consumption from mobile devices requires an effective video coding strategy. To overcome diverse communication networks, video services often need to maintain sustainable quality when the available bandwidth is limited. One of the strategy for a visually-optimised video adaptation is by implementing a region-of-interest (ROI) based scalability, whereby important regions can be encoded at a higher quality while maintaining sufficient quality for the rest of the frame. The result is an improved perceived quality at the same bit rate as normal encoding, which is particularly obvious at the range of lower bit rate. However, because of the difficulties of predicting region-of-interest (ROI) accurately, there is a limited research and development of ROI-based video coding for general videos. In this paper, the phase spectrum quaternion of Fourier Transform (PQFT) method is adopted to determine the ROI. To improve the results of ROI detection, the saliency map from the PQFT is augmented with maps created from high level knowledge of factors that are known to attract human attention. Hence, maps that locate faces and emphasise the centre of the screen are used in combination with the saliency map to determine the ROI. The contribution of this paper lies on the automatic ROI detection technique for coding a low bit rate videos which include the ROI prioritisation technique to give different level of encoding qualities for multiple ROIs, and the evaluation of the proposed automatic ROI detection that is shown to have a close performance to human ROI, based on the eye fixation data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Modelling video sequences by subspaces has recently shown promise for recognising human actions. Subspaces are able to accommodate the effects of various image variations and can capture the dynamic properties of actions. Subspaces form a non-Euclidean and curved Riemannian manifold known as a Grassmann manifold. Inference on manifold spaces usually is achieved by embedding the manifolds in higher dimensional Euclidean spaces. In this paper, we instead propose to embed the Grassmann manifolds into reproducing kernel Hilbert spaces and then tackle the problem of discriminant analysis on such manifolds. To achieve efficient machinery, we propose graph-based local discriminant analysis that utilises within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, respectively. Experiments on KTH, UCF Sports, and Ballet datasets show that the proposed approach obtains marked improvements in discrimination accuracy in comparison to several state-of-the-art methods, such as the kernel version of affine hull image-set distance, tensor canonical correlation analysis, spatial-temporal words and hierarchy of discriminative space-time neighbourhood features.