274 resultados para SIFT,Computer Vision,Python,Object Recognition,Feature Detection,Descriptor Computation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Texture information in the iris image is not uniform in discriminatory information content for biometric identity verification. The bits in an iris code obtained from the image differ in their consistency from one sample to another for the same identity. In this work, errors in bit strings are systematically analysed in order to investigate the effect of light-induced and drug-induced pupil dilation and constriction on the consistency of iris texture information. The statistics of bit errors are computed for client and impostor distributions as functions of radius and angle. Under normal conditions, a V-shaped radial trend of decreasing bit errors towards the central region of the iris is obtained for client matching, and it is observed that the distribution of errors as a function of angle is uniform. When iris images are affected by pupil dilation or constriction the radial distribution of bit errors is altered. A decreasing trend from the pupil outwards is observed for constriction, whereas a more uniform trend is observed for dilation. The main increase in bit errors occurs closer to the pupil in both cases.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Public buildings and large infrastructure are typically monitored by tens or hundreds of cameras, all capturing different physical spaces and observing different types of interactions and behaviours. However to date, in large part due to limited data availability, crowd monitoring and operational surveillance research has focused on single camera scenarios which are not representative of real-world applications. In this paper we present a new, publicly available database for large scale crowd surveillance. Footage from 12 cameras for a full work day covering the main floor of a busy university campus building, including an internal and external foyer, elevator foyers, and the main external approach are provided; alongside annotation for crowd counting (single or multi-camera) and pedestrian flow analysis for 10 and 6 sites respectively. We describe how this large dataset can be used to perform distributed monitoring of building utilisation, and demonstrate the potential of this dataset to understand and learn the relationship between different areas of a building.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Diffusion weighted magnetic resonance (MR) imaging is a powerful tool that can be employed to study white matter microstructure by examining the 3D displacement profile of water molecules in brain tissue. By applying diffusion-sensitized gradients along a minimum of 6 directions, second-order tensors can be computed to model dominant diffusion processes. However, conventional DTI is not sufficient to resolve crossing fiber tracts. Recently, a number of high-angular resolution schemes with greater than 6 gradient directions have been employed to address this issue. In this paper, we introduce the Tensor Distribution Function (TDF), a probability function defined on the space of symmetric positive definite matrices. Here, fiber crossing is modeled as an ensemble of Gaussian diffusion processes with weights specified by the TDF. Once this optimal TDF is determined, the diffusion orientation distribution function (ODF) can easily be computed by analytic integration of the resulting displacement probability function.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis addresses the problem of detecting and describing the same scene points in different wide-angle images taken by the same camera at different viewpoints. This is a core competency of many vision-based localisation tasks including visual odometry and visual place recognition. Wide-angle cameras have a large field of view that can exceed a full hemisphere, and the images they produce contain severe radial distortion. When compared to traditional narrow field of view perspective cameras, more accurate estimates of camera egomotion can be found using the images obtained with wide-angle cameras. The ability to accurately estimate camera egomotion is a fundamental primitive of visual odometry, and this is one of the reasons for the increased popularity in the use of wide-angle cameras for this task. Their large field of view also enables them to capture images of the same regions in a scene taken at very different viewpoints, and this makes them suited for visual place recognition. However, the ability to estimate the camera egomotion and recognise the same scene in two different images is dependent on the ability to reliably detect and describe the same scene points, or ‘keypoints’, in the images. Most algorithms used for this purpose are designed almost exclusively for perspective images. Applying algorithms designed for perspective images directly to wide-angle images is problematic as no account is made for the image distortion. The primary contribution of this thesis is the development of two novel keypoint detectors, and a method of keypoint description, designed for wide-angle images. Both reformulate the Scale- Invariant Feature Transform (SIFT) as an image processing operation on the sphere. As the image captured by any central projection wide-angle camera can be mapped to the sphere, applying these variants to an image on the sphere enables keypoints to be detected in a manner that is invariant to image distortion. Each of the variants is required to find the scale-space representation of an image on the sphere, and they differ in the approaches they used to do this. Extensive experiments using real and synthetically generated wide-angle images are used to validate the two new keypoint detectors and the method of keypoint description. The best of these two new keypoint detectors is applied to vision based localisation tasks including visual odometry and visual place recognition using outdoor wide-angle image sequences. As part of this work, the effect of keypoint coordinate selection on the accuracy of egomotion estimates using the Direct Linear Transform (DLT) is investigated, and a simple weighting scheme is proposed which attempts to account for the uncertainty of keypoint positions during detection. A word reliability metric is also developed for use within a visual ‘bag of words’ approach to place recognition.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents visual detection and classification of light vehicles and personnel on a mine site.We capitalise on the rapid advances of ConvNet based object recognition but highlight that a naive black box approach results in a significant number of false positives. In particular, the lack of domain specific training data and the unique landscape in a mine site causes a high rate of errors. We exploit the abundance of background-only images to train a k-means classifier to complement the ConvNet. Furthermore, localisation of objects of interest and a reduction in computation is enabled through region proposals. Our system is tested on over 10km of real mine site data and we were able to detect both light vehicles and personnel. We show that the introduction of our background model can reduce the false positive rate by an order of magnitude.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The latest generation of Deep Convolutional Neural Networks (DCNN) have dramatically advanced challenging computer vision tasks, especially in object detection and object classification, achieving state-of-the-art performance in several computer vision tasks including text recognition, sign recognition, face recognition and scene understanding. The depth of these supervised networks has enabled learning deeper and hierarchical representation of features. In parallel, unsupervised deep learning such as Convolutional Deep Belief Network (CDBN) has also achieved state-of-the-art in many computer vision tasks. However, there is very limited research on jointly exploiting the strength of these two approaches. In this paper, we investigate the learning capability of both methods. We compare the output of individual layers and show that many learnt filters and outputs of the corresponding level layer are almost similar for both approaches. Stacking the DCNN on top of unsupervised layers or replacing layers in the DCNN with the corresponding learnt layers in the CDBN can improve the recognition/classification accuracy and training computational expense. We demonstrate the validity of the proposal on ImageNet dataset.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Light Detection and Ranging (LIDAR) has great potential to assist vegetation management in power line corridors by providing more accurate geometric information of the power line assets and vegetation along the corridors. However, the development of algorithms for the automatic processing of LIDAR point cloud data, in particular for feature extraction and classification of raw point cloud data, is in still in its infancy. In this paper, we take advantage of LIDAR intensity and try to classify ground and non-ground points by statistically analyzing the skewness and kurtosis of the intensity data. Moreover, the Hough transform is employed to detected power lines from the filtered object points. The experimental results show the effectiveness of our methods and indicate that better results were obtained by using LIDAR intensity data than elevation data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Surveillance networks are typically monitored by a few people, viewing several monitors displaying the camera feeds. It is then very difficult for a human operator to effectively detect events as they happen. Recently, computer vision research has begun to address ways to automatically process some of this data, to assist human operators. Object tracking, event recognition, crowd analysis and human identification at a distance are being pursued as a means to aid human operators and improve the security of areas such as transport hubs. The task of object tracking is key to the effective use of more advanced technologies. To recognize an event people and objects must be tracked. Tracking also enhances the performance of tasks such as crowd analysis or human identification. Before an object can be tracked, it must be detected. Motion segmentation techniques, widely employed in tracking systems, produce a binary image in which objects can be located. However, these techniques are prone to errors caused by shadows and lighting changes. Detection routines often fail, either due to erroneous motion caused by noise and lighting effects, or due to the detection routines being unable to split occluded regions into their component objects. Particle filters can be used as a self contained tracking system, and make it unnecessary for the task of detection to be carried out separately except for an initial (often manual) detection to initialise the filter. Particle filters use one or more extracted features to evaluate the likelihood of an object existing at a given point each frame. Such systems however do not easily allow for multiple objects to be tracked robustly, and do not explicitly maintain the identity of tracked objects. This dissertation investigates improvements to the performance of object tracking algorithms through improved motion segmentation and the use of a particle filter. A novel hybrid motion segmentation / optical flow algorithm, capable of simultaneously extracting multiple layers of foreground and optical flow in surveillance video frames is proposed. The algorithm is shown to perform well in the presence of adverse lighting conditions, and the optical flow is capable of extracting a moving object. The proposed algorithm is integrated within a tracking system and evaluated using the ETISEO (Evaluation du Traitement et de lInterpretation de Sequences vidEO - Evaluation for video understanding) database, and significant improvement in detection and tracking performance is demonstrated when compared to a baseline system. A Scalable Condensation Filter (SCF), a particle filter designed to work within an existing tracking system, is also developed. The creation and deletion of modes and maintenance of identity is handled by the underlying tracking system; and the tracking system is able to benefit from the improved performance in uncertain conditions arising from occlusion and noise provided by a particle filter. The system is evaluated using the ETISEO database. The dissertation then investigates fusion schemes for multi-spectral tracking systems. Four fusion schemes for combining a thermal and visual colour modality are evaluated using the OTCBVS (Object Tracking and Classification in and Beyond the Visible Spectrum) database. It is shown that a middle fusion scheme yields the best results and demonstrates a significant improvement in performance when compared to a system using either mode individually. Findings from the thesis contribute to improve the performance of semi-automated video processing and therefore improve security in areas under surveillance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abandoned object detection (AOD) systems are required to run in high traffic situations, with high levels of occlusion. Systems rely on background segmentation techniques to locate abandoned objects, by detecting areas of motion that have stopped. This is often achieved by using a medium term motion detection routine to detect long term changes in the background. When AOD systems are integrated into person tracking system, this often results in two separate motion detectors being used to handle the different requirements. We propose a motion detection system that is capable of detecting medium term motion as well as regular motion. Multiple layers of medium term (static) motion can be detected and segmented. We demonstrate the performance of this motion detection system and as part of an abandoned object detection system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Object tracking systems require accurate segmentation of the objects from the background for effective tracking. Motion segmentation or optical flow can be used to segment incoming images. Whilst optical flow allows multiple moving targets to be separated based on their individual velocities, optical flow techniques are prone to errors caused by changing lighting and occlusions, both common in a surveillance environment. Motion segmentation techniques are more robust to fluctuating lighting and occlusions, but don't provide information on the direction of the motion. In this paper we propose a combined motion segmentation/optical flow algorithm for use in object tracking. The proposed algorithm uses the motion segmentation results to inform the optical flow calculations and ensure that optical flow is only calculated in regions of motion, and improve the performance of the optical flow around the edge of moving objects. Optical flow is calculated at pixel resolution and tracking of flow vectors is employed to improve performance and detect discontinuities, which can indicate the location of overlaps between objects. The algorithm is evaluated by attempting to extract a moving target within the flow images, given expected horizontal and vertical movement (i.e. the algorithms intended use for object tracking). Results show that the proposed algorithm outperforms other widely used optical flow techniques for this surveillance application.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Identifying an individual from surveillance video is a difficult, time consuming and labour intensive process. The proposed system aims to streamline this process by filtering out unwanted scenes and enhancing an individual's face through super-resolution. An automatic face recognition system is then used to identify the subject or present the human operator with likely matches from a database. A person tracker is used to speed up the subject detection and super-resolution process by tracking moving subjects and cropping a region of interest around the subject's face to reduce the number and size of the image frames to be super-resolved respectively. In this paper, experiments have been conducted to demonstrate how the optical flow super-resolution method used improves surveillance imagery for visual inspection as well as automatic face recognition on an Eigenface and Elastic Bunch Graph Matching system. The optical flow based method has also been benchmarked against the ``hallucination'' algorithm, interpolation methods and the original low-resolution images. Results show that both super-resolution algorithms improved recognition rates significantly. Although the hallucination method resulted in slightly higher recognition rates, the optical flow method produced less artifacts and more visually correct images suitable for human consumption.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a novel technique for performing SLAM along a continuous trajectory of appearance. Derived from components of FastSLAM and FAB-MAP, the new system dubbed Continuous Appearance-based Trajectory SLAM (CAT-SLAM) augments appearancebased place recognition with particle-filter based ‘pose filtering’ within a probabilistic framework, without calculating global feature geometry or performing 3D map construction. For loop closure detection CAT-SLAM updates in constant time regardless of map size. We evaluate the effectiveness of CAT-SLAM on a 16km outdoor road network and determine its loop closure performance relative to FAB-MAP. CAT-SLAM recognizes 3 times the number of loop closures for the case where no false positives occur, demonstrating its potential use for robust loop closure detection in large environments.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Image representations derived from simplified models of the primary visual cortex (V1), such as HOG and SIFT, elicit good performance in a myriad of visual classification tasks including object recognition/detection, pedestrian detection and facial expression classification. A central question in the vision, learning and neuroscience communities regards why these architectures perform so well. In this paper, we offer a unique perspective to this question by subsuming the role of V1-inspired features directly within a linear support vector machine (SVM). We demonstrate that a specific class of such features in conjunction with a linear SVM can be reinterpreted as inducing a weighted margin on the Kronecker basis expansion of an image. This new viewpoint on the role of V1-inspired features allows us to answer fundamental questions on the uniqueness and redundancies of these features, and offer substantial improvements in terms of computational and storage efficiency.