991 results for computer vision


Relevance:

60.00%

Publisher:

Abstract:

Camera calibration information is required in order for multiple-camera networks to deliver more than the sum of many single-camera systems. Methods exist for manually calibrating cameras with high accuracy. Manually calibrating networks with many cameras is, however, time consuming, expensive and impractical for networks that undergo frequent change. For this reason, automatic calibration techniques have been vigorously researched in recent years. Fully automatic calibration methods depend on the ability to automatically find point correspondences between overlapping views. In typical camera networks, cameras are placed far apart to maximise coverage. This is referred to as a wide-baseline scenario. Finding sufficient correspondences for camera calibration in wide-baseline scenarios presents a significant challenge. This thesis focuses on developing more effective and efficient techniques for finding correspondences in uncalibrated, wide-baseline, multiple-camera scenarios. The project consists of two major areas of work. The first is the development of more effective and efficient view-covariant local feature extractors. The second area involves finding methods to extract scene information using the information contained in a limited set of matched affine features. Several novel affine adaptation techniques for salient features have been developed. A method is presented for efficiently computing the discrete scale-space primal sketch of local image features. A scale selection method was implemented that makes use of the primal sketch. The primal sketch-based scale selection method has several advantages over the existing methods. It allows greater freedom in how the scale space is sampled, enables more accurate scale selection, is more effective at combining different functions for spatial position and scale selection, and leads to greater computational efficiency.
Existing affine adaptation methods make use of the second moment matrix to estimate the local affine shape of local image features. In this thesis, it is shown that the Hessian matrix can be used in a similar way to estimate local feature shape. The Hessian matrix is effective for estimating the shape of blob-like structures, but is less effective for corner structures. It is simpler to compute than the second moment matrix, leading to a significant reduction in computational cost. A wide-baseline dense correspondence extraction system, called WiDense, is presented in this thesis. It allows the extraction of large numbers of additional accurate correspondences, given only a few initial putative correspondences. It consists of the following algorithms: an affine region alignment algorithm that ensures accurate alignment between matched features; a method for extracting more matches in the vicinity of a matched pair of affine features, using the alignment information contained in the match; and an algorithm for extracting large numbers of highly accurate point correspondences from an aligned pair of feature regions. Experiments show that the correspondences generated by the WiDense system improve the success rate of computing the epipolar geometry of very widely separated views. This new method is successful in many cases where the features produced by the best wide-baseline matching algorithms are insufficient for computing the scene geometry.

Relevance:

60.00%

Publisher:

Abstract:

Uncooperative iris identification systems at a distance and on the move often suffer from poor resolution and poor focus of the captured iris images. The lack of pixel resolution and well-focused images significantly degrades the iris recognition performance. This paper proposes a new approach to incorporate the focus score into a reconstruction-based super-resolution process to generate a high-resolution iris image from a low-resolution and focus-inconsistent video sequence of an eye. A reconstruction-based technique, which can incorporate middle and high frequency components from multiple low-resolution frames into one desired super-resolved frame without introducing false high frequency components, is used. A new focus assessment approach is proposed for uncooperative iris at a distance and on the move to improve performance for variations in lighting, size and occlusion. A novel fusion scheme is then proposed to incorporate the proposed focus score into the super-resolution process. The experiments conducted on the Multiple Biometric Grand Challenge portal database show that our proposed approach achieves an EER of 2.1%, outperforming the existing state-of-the-art averaging signal-level fusion approach by 19.2% and the robust mean super-resolution approach by 8.7%.

Relevance:

60.00%

Publisher:

Abstract:

Wireless Multi-media Sensor Networks (WMSNs) have become increasingly popular in recent years, driven in part by the increasing commoditization of small, low-cost CMOS sensors. As such, the challenge of automatically calibrating these types of camera nodes has become an important research problem, especially when a large quantity of these devices is deployed. This paper presents a method for automatically calibrating a wireless camera node with the ability to rotate around one axis. The method involves capturing images as the camera is rotated and computing the homographies between the images. The camera parameters, including focal length, principal point and the angle and axis of rotation, can then be recovered from two or more homographies. The homography computation algorithm is designed to deal with the limited resources of the wireless sensor and to minimize energy consumption. In this paper, a modified RANdom SAmple Consensus (RANSAC) algorithm is proposed to effectively increase the efficiency and reliability of the calibration procedure.
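To give a feel for why RANSAC cost matters on a resource-limited node, the sketch below computes the textbook number of RANSAC iterations needed to draw at least one all-inlier sample with a given confidence. It illustrates standard RANSAC only, not the modified algorithm proposed in the paper; the function name `ransac_iterations` and its defaults are assumptions made for this example. A sample size of 4 is used because a homography is determined by four point correspondences.

```python
import math


def ransac_iterations(inlier_ratio, sample_size=4, confidence=0.99):
    """Iterations needed so that, with probability `confidence`, at least
    one random sample of `sample_size` correspondences is outlier-free.

    Hypothetical helper for illustration; standard RANSAC formula
    N = log(1 - p) / log(1 - w**s).
    """
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    # Probability that one random sample contains only inliers.
    p_good_sample = inlier_ratio ** sample_size
    if p_good_sample >= 1.0:
        return 1  # no outliers at all: a single sample suffices
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good_sample))


if __name__ == "__main__":
    for w in (0.8, 0.5):
        print(w, ransac_iterations(w))
```

The iteration count grows quickly as the inlier ratio falls, which is one reason a sensor node with a tight energy budget benefits from any change that raises the chance of drawing good samples.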

Relevance:

60.00%

Publisher:

Abstract:

Occlusions are commonplace in surveillance video, and resolving them accurately is key to tracking objects reliably. Segmenting objects is further complicated by the fact that, in many real-world surveillance environments, the objects appear very similar. For example, footage of pedestrians in a city environment will consist of many people wearing dark suits. In this paper, we propose a novel technique to segment groups and resolve occlusions using optical flow discontinuities. We demonstrate that the ratio of continuous to discontinuous pixels within a region can be used to locate the overlapping edges, and incorporate this into an object tracking framework. Results on a portion of the ETISEO database show that the proposed algorithm results in improved tracking performance overall, and improved tracking within occlusions.
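The ratio of continuous to discontinuous pixels can be illustrated with a small sketch. The abstract does not define how a pixel is classified, so the rule used here is an assumption: a pixel counts as discontinuous when its flow vector differs from its right or lower neighbour by more than a threshold. The function name `continuity_ratio` and the `threshold` parameter are likewise hypothetical.

```python
import math


def continuity_ratio(flow, threshold=1.0):
    """Ratio of continuous to discontinuous pixels in a flow region.

    `flow` is a 2D list of (u, v) flow vectors. A pixel is treated as
    discontinuous (an assumption for this sketch) when its flow differs
    from the right or lower neighbour by more than `threshold`.
    """
    rows, cols = len(flow), len(flow[0])
    continuous = discontinuous = 0
    for r in range(rows):
        for c in range(cols):
            u, v = flow[r][c]
            jump = False
            # Compare with the right and lower neighbours, if they exist.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    nu, nv = flow[nr][nc]
                    if math.hypot(u - nu, v - nv) > threshold:
                        jump = True
            if jump:
                discontinuous += 1
            else:
                continuous += 1
    if discontinuous == 0:
        return math.inf  # perfectly smooth flow: no overlapping edge
    return continuous / discontinuous


if __name__ == "__main__":
    # Two people crossing: the left half moves right, the right half moves left.
    region = [[(1, 0), (1, 0), (-1, 0), (-1, 0)] for _ in range(2)]
    print(continuity_ratio(region))  # 3.0
```

Under this rule, a low ratio suggests that a region contains a boundary between objects moving differently, whereas a region covering a single object yields a high or infinite ratio.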

Relevance:

60.00%

Publisher:

Abstract:

We present a novel method for integrating GPS position estimates with position and attitude estimates derived from visual odometry using a scheme similar to a classic loosely-coupled GPS/INS integration. Under such an arrangement, we derive the error dynamics of the system and develop a Kalman Filter for estimating the errors in position and attitude. Using a control-based approach to observability, we show that the errors in both position and attitude (including yaw) are fully observable when there is a component of acceleration perpendicular to the velocity vector in the navigation frame. Numerical simulations are performed to confirm the observability analysis.

Relevance:

60.00%

Publisher:

Abstract:

In this paper, a method has been developed for estimating pitch angle, roll angle and aircraft body rates based on horizon detection and temporal tracking using a forward-looking camera, without assistance from other sensors. Using an image processing front-end, we select several lines in an image that may or may not correspond to the true horizon. The optical flow at each candidate line is calculated, which may be used to measure the body rates of the aircraft. Using an Extended Kalman Filter (EKF), the aircraft state is propagated using a motion model and a candidate horizon line is associated using a statistical test based on the optical flow measurements and the location of the horizon. Once associated, the selected horizon line, along with the associated optical flow, is used as a measurement to the EKF. To test the accuracy of the algorithm, two flights were conducted, one using a highly dynamic Uninhabited Airborne Vehicle (UAV) in clear flight conditions and the other in a human-piloted Cessna 172 in conditions where the horizon was partially obscured by terrain, haze and smoke. The UAV flight resulted in pitch and roll error standard deviations of 0.42° and 0.71° respectively when compared with a truth attitude source. The Cessna flight resulted in pitch and roll error standard deviations of 1.79° and 1.75° respectively. The benefits of selecting and tracking the horizon using a motion model and optical flow, rather than naively relying on the image processing front-end, are also demonstrated.

Relevance:

60.00%

Publisher:

Abstract:

We consider the problem of object tracking in a wireless multimedia sensor network (we mainly focus on the camera component in this work). The vast majority of current object tracking techniques, either centralised or distributed, assume unlimited energy, meaning these techniques don't translate well when applied within the constraints of low-power distributed systems. In this paper we develop and analyse a highly scalable, distributed strategy for object tracking in wireless camera networks with limited resources. In the proposed system, cameras transmit descriptions of objects to a subset of neighbours, determined using a predictive forwarding strategy. The received descriptions are then matched at the next camera on the object's path using a probability maximisation process with locally generated descriptions. We show, via simulation, that our predictive forwarding and probabilistic matching strategy can significantly reduce the number of object-misses, ID-switches and ID-losses; it can also reduce the number of required transmissions over a simple broadcast scenario by up to 67%. We show that our system performs well under realistic assumptions about matching object appearance using colour.

Relevance:

60.00%

Publisher:

Abstract:

In automatic facial expression detection, very accurate registration is desired. This can be achieved via a deformable model approach in which a dense mesh of 60-70 points on the face is used, such as an active appearance model (AAM). However, for applications where manually labeling frames is prohibitive, AAMs do not work well as they do not generalize well to unseen subjects. As such, a coarser approach is taken for person-independent facial expression detection, where just a couple of key features (such as face and eyes) are tracked using a Viola-Jones type approach. The tracked image is normally post-processed to encode for shift and illumination invariance using a linear bank of filters. Recently, it was shown that this filtering step is of no benefit when close to ideal registration has been obtained. In this paper, we present a system based on the Constrained Local Model (CLM), a generic or person-independent face alignment algorithm that achieves high accuracy. We show these results against the LBP feature extraction on the CK+ and GEMEP datasets.

Relevance:

60.00%

Publisher:

Abstract:

Segmentation of novel or dynamic objects in a scene, often referred to as background subtraction or foreground segmentation, is critical for robust high-level computer vision applications such as object tracking, object classification and recognition. However, automatic real-time segmentation for robotics still poses challenges including global illumination changes, shadows, inter-reflections, colour similarity of foreground to background, and cluttered backgrounds. This paper introduces depth cues provided by structure from motion (SFM) for interactive segmentation to alleviate some of these challenges. In this paper, two prevailing interactive segmentation algorithms are compared: Lazysnapping [Li et al., 2004] and Grabcut [Rother et al., 2004], both based on graph-cut optimisation [Boykov and Jolly, 2001]. The algorithms are extended to include depth cues rather than colour only as in the original papers. Results show interactive segmentation based on colour and depth cues enhances the performance of segmentation with a lower error with respect to ground truth.

Relevance:

60.00%

Publisher:

Abstract:

In this paper we present a real-time foreground–background segmentation algorithm that exploits the following observation (very often satisfied by a static camera positioned high in its environment). If a blob moves onto a pixel p that had not changed its colour significantly for a few frames, then p was probably part of the background when its colour was static. With this information we are able to differentially update pixels believed to be background. This work is relevant to autonomous minirobots, as they often navigate in buildings where smart surveillance cameras could communicate wirelessly with them. A by-product of the proposed system is a mask of the image regions which are demonstrably background. Statistically significant tests show that the proposed method has better precision and recall rates than the state-of-the-art foreground/background segmentation algorithm of the OpenCV computer vision library.
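The observation above can be sketched for a single greyscale pixel as follows. This is only an illustration under stated assumptions, not the paper's algorithm: any significant colour change is treated as a blob arriving, and the class name `PixelHistory`, the parameters `stable_frames` and `tol`, and the promotion rule are all hypothetical.

```python
class PixelHistory:
    """Tracks one pixel and promotes a long-static colour to background
    at the moment something moves onto the pixel (illustrative only)."""

    def __init__(self, stable_frames=3, tol=10):
        self.stable_frames = stable_frames  # frames the colour must stay static
        self.tol = tol                      # max change still counted as static
        self.ref = None                     # colour of the current static run
        self.run = 0                        # length of the current static run
        self.background = None              # last colour confirmed as background

    def update(self, value):
        """Feed one frame's pixel value; return True if the previous static
        colour has just been confirmed as background."""
        if self.ref is not None and abs(value - self.ref) <= self.tol:
            self.run += 1
            return False
        # Significant change: assume a blob moved onto the pixel.
        promoted = self.ref is not None and self.run >= self.stable_frames
        if promoted:
            self.background = self.ref
        self.ref = value
        self.run = 1
        return promoted


if __name__ == "__main__":
    pixel = PixelHistory()
    for value in [50, 51, 49, 50, 200, 201]:
        pixel.update(value)
    print(pixel.background)  # 50
```

Collecting the pixels whose `background` is set would give a mask of regions believed to be background, in the spirit of the by-product mask mentioned in the abstract.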

Relevance:

60.00%

Publisher:

Abstract:

It is possible for the visual attention characteristics of a person to be exploited as a biometric for authentication or identification of individual viewers. The visual attention characteristics of a person can be easily monitored by tracking the gaze of a viewer during the presentation of a known or unknown visual scene. The positions and sequences of gaze locations during viewing may be determined by overt (conscious) or covert (sub-conscious) viewing behaviour. This paper presents a method to authenticate individuals using their covert viewing behaviour, thus yielding a unique behavioural biometric. A method to quantify the spatial and temporal patterns established by the viewer for their covert behaviour is proposed, utilising a principal component analysis technique called 'eigenGaze'. Experimental results suggest that it is possible to capture the unique visual attention characteristics of a person to provide a simple behavioural biometric.

Relevance:

60.00%

Publisher:

Abstract:

In a clinical setting, pain is reported either through patient self-report or via an observer. Such measures are problematic as they 1) are subjective and 2) give no specific timing information. Coding pain as a series of facial action units (AUs) can avoid these issues as it can be used to gain an objective measure of pain on a frame-by-frame basis. Using video data from patients with shoulder injuries, in this paper, we describe an active appearance model (AAM)-based system that can automatically detect the frames in video in which a patient is in pain. This pain data set highlights the many challenges associated with spontaneous emotion detection, particularly that of expression and head movement due to the patient's reaction to pain. In this paper, we show that the AAM can deal with these movements and can achieve significant improvements in both the AU and pain detection performance compared to the current state-of-the-art approaches which utilize similarity-normalized appearance features only.

Relevance:

60.00%

Publisher:

Abstract:

Detection of regions of interest (ROIs) in a video leads to more efficient utilization of bandwidth. This is because any ROIs in a given frame can be encoded in higher quality than the rest of that frame, with little or no degradation of quality from the perception of the viewers. Consequently, it is not necessary to uniformly encode the whole video in high quality. One approach to determining ROIs is to use saliency detectors to locate salient regions. This paper proposes a methodology for obtaining ground truth saliency maps to measure the effectiveness of ROI detection by considering the role of user experience during the labelling process of such maps. User perceptions can be captured and incorporated into the definition of salience in a particular video, taking advantage of human visual recall within a given context. Experiments with two state-of-the-art saliency detectors demonstrate the effectiveness of this approach to validating visual saliency in video. This paper will provide the relevant datasets associated with the experiments.

Relevance:

60.00%

Publisher:

Abstract:

Less cooperative iris identification systems at a distance and on the move often suffer from poor resolution. The lack of pixel resolution significantly degrades the iris recognition performance. Super-resolution has been considered to enhance the resolution of iris images. This paper proposes a pixelwise super-resolution technique to reconstruct a high-resolution iris image from a video sequence of an eye. A novel fusion approach is proposed to incorporate information details from multiple frames using robust mean. Experiments on the MBGC NIR portal database show the validity of the proposed approach in comparison with other resolution enhancement techniques.
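Pixelwise fusion with a robust mean can be sketched as below. The abstract does not say which robust estimator is used, so this example assumes one common choice: average only the values lying within a few scaled median absolute deviations (MAD) of the median. The names `robust_mean` and `fuse_frames`, the parameter `k`, and the assumption that frames are already registered and of equal size are all illustrative.

```python
from statistics import median


def robust_mean(values, k=2.0):
    """Mean of the values within k scaled MADs of the median.

    One possible robust mean, assumed for illustration; 1.4826 scales
    the MAD to match the standard deviation of Gaussian data.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return float(med)  # more than half the values agree exactly
    limit = k * 1.4826 * mad
    inliers = [v for v in values if abs(v - med) <= limit]
    return sum(inliers) / len(inliers)


def fuse_frames(frames, k=2.0):
    """Fuse registered, equally sized greyscale frames pixel by pixel."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [robust_mean([f[r][c] for f in frames], k) for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    # Four 1x2 frames; the last one has a specular highlight at pixel 0.
    frames = [[[10, 20]], [[12, 22]], [[11, 21]], [[200, 19]]]
    print(fuse_frames(frames))  # [[11.0, 20.5]]
```

Compared with plain averaging, the outlying value 200 is excluded from the first pixel, so a single corrupted frame does not pull the fused intensity away from the consensus of the other frames.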

Relevance:

60.00%

Publisher:

Abstract:

Eigen-based techniques and other monolithic approaches to face recognition have long been a cornerstone in the face recognition community due to the high dimensionality of face images. Eigen-face techniques provide minimal reconstruction error and limit high-frequency content while linear discriminant-based techniques (fisher-faces) allow the construction of subspaces which preserve discriminatory information. This paper presents a frequency decomposition approach for improved face recognition performance utilising three well-known techniques: Wavelets; Gabor / Log-Gabor; and the Discrete Cosine Transform. Experimentation illustrates that frequency domain partitioning prior to dimensionality reduction increases the information available for classification and greatly increases face recognition performance for both eigen-face and fisher-face approaches.