884 resultados para computer vision, facial expression recognition, swig, red5, actionscript, ruby on rails, html5
Resumo:
In a commercial environment, it is advantageous to know how long it takes customers to move between different regions, how long they spend in each region, and where they are likely to go as they move from one location to another. Presently, these measures can only be determined manually, or through the use of hardware tags (i.e. RFID). Soft biometrics are characteristics that can be used to describe, but not uniquely identify an individual. They include traits such as height, weight, gender, hair, skin and clothing colour. Unlike traditional biometrics, soft biometrics can be acquired by surveillance cameras at range without any user cooperation. While these traits cannot provide robust authentication, they can be used to provide identification at long range, and aid in object tracking and detection in disjoint camera networks. In this chapter we propose using colour, height and luggage soft biometrics to determine operational statistics relating to how people move through a space. A novel average soft biometric is used to locate people who look distinct, and these people are then detected at various locations within a disjoint camera network to gradually obtain operational statistics
Resumo:
Learning and then recognizing a route, whether travelled during the day or at night, in clear or inclement weather, and in summer or winter is a challenging task for state of the art algorithms in computer vision and robotics. In this paper, we present a new approach to visual navigation under changing conditions dubbed SeqSLAM. Instead of calculating the single location most likely given a current image, our approach calculates the best candidate matching location within every local navigation sequence. Localization is then achieved by recognizing coherent sequences of these “local best matches”. This approach removes the need for global matching performance by the vision front-end - instead it must only pick the best match within any short sequence of images. The approach is applicable over environment changes that render traditional feature-based techniques ineffective. Using two car-mounted camera datasets we demonstrate the effectiveness of the algorithm and compare it to one of the most successful feature-based SLAM algorithms, FAB-MAP. The perceptual change in the datasets is extreme; repeated traverses through environments during the day and then in the middle of the night, at times separated by months or years and in opposite seasons, and in clear weather and extremely heavy rain. While the feature-based method fails, the sequence-based algorithm is able to match trajectory segments at 100% precision with recall rates of up to 60%.
Resumo:
This paper presents an approach for the automatic calibration of low-cost cameras which are assumed to be restricted in their freedom of movement to either pan or tilt movements. Camera parameters, including focal length, principal point, lens distortion parameter and the angle and axis of rotation, can be recovered from a minimum set of two images of the camera, provided that the axis of rotation between the two images goes through the camera’s optical center and is parallel to either the vertical (panning) or horizontal (tilting) axis of the image. Previous methods for auto-calibration of cameras based on pure rotations fail to work in these two degenerate cases. In addition, our approach includes a modified RANdom SAmple Consensus (RANSAC) algorithm, as well as improved integration of the radial distortion coefficient in the computation of inter-image homographies. We show that these modifications are able to increase the overall efficiency, reliability and accuracy of the homography computation and calibration procedure using both synthetic and real image sequences
Resumo:
From a law enforcement standpoint, the ability to search for a person matching a semantic description (i.e. 1.8m tall, red shirt, jeans) is highly desirable. While a significant research effort has focused on person re-detection (the task of identifying a previously observed individual in surveillance video), these techniques require descriptors to be built from existing image or video observations. As such, person re-detection techniques are not suited to situations where footage of the person of interest is not readily available, such as a witness reporting a recent crime. In this paper, we present a novel framework that is able to search for a person based on a semantic description. The proposed approach uses size and colour cues, and does not require a person detection routine to locate people in the scene, improving utility in crowded conditions. The proposed approach is demonstrated with a new database that will be made available to the research community, and we show that the proposed technique is able to correctly localise a person in a video based on a simple semantic description.
Resumo:
Thermal-infrared imagery is relatively robust to many of the failure conditions of visual and laser-based SLAM systems, such as fog, dust and smoke. The ability to use thermal-infrared video for localization is therefore highly appealing for many applications. However, operating in thermal-infrared is beyond the capacity of existing SLAM implementations. This paper presents the first known monocular SLAM system designed and tested for hand-held use in the thermal-infrared modality. The implementation includes a flexible feature detection layer able to achieve robust feature tracking in high-noise, low-texture thermal images. A novel approach for structure initialization is also presented. The system is robust to irregular motion and capable of handling the unique mechanical shutter interruptions common to thermal-infrared cameras. The evaluation demonstrates promising performance of the algorithm in several environments.
Resumo:
In this paper we present a fast power line detection and localisation algorithm as well as propose a high-level guidance architecture for active vision-based Unmanned Aerial Vehicle (UAV) guidance. The detection stage is based on steerable filters for edge ridge detection, followed by a line fitting algorithm to refine candidate power lines in images. The guidance architecture assumes an UAV with an onboard Gimbal camera. We first control the position of the Gimbal such that the power line is in the field of view of the camera. Then its pose is used to generate the appropriate control commands such that the aircraft moves and flies above the lines. We present initial experimental results for the detection stage which shows that the proposed algorithm outperforms two state-of-the-art line detection algorithms for power line detection from aerial imagery.
Resumo:
The increasing popularity of video consumption from mobile devices requires an effective video coding strategy. To overcome diverse communication networks, video services often need to maintain sustainable quality when the available bandwidth is limited. One of the strategy for a visually-optimised video adaptation is by implementing a region-of-interest (ROI) based scalability, whereby important regions can be encoded at a higher quality while maintaining sufficient quality for the rest of the frame. The result is an improved perceived quality at the same bit rate as normal encoding, which is particularly obvious at the range of lower bit rate. However, because of the difficulties of predicting region-of-interest (ROI) accurately, there is a limited research and development of ROI-based video coding for general videos. In this paper, the phase spectrum quaternion of Fourier Transform (PQFT) method is adopted to determine the ROI. To improve the results of ROI detection, the saliency map from the PQFT is augmented with maps created from high level knowledge of factors that are known to attract human attention. Hence, maps that locate faces and emphasise the centre of the screen are used in combination with the saliency map to determine the ROI. The contribution of this paper lies on the automatic ROI detection technique for coding a low bit rate videos which include the ROI prioritisation technique to give different level of encoding qualities for multiple ROIs, and the evaluation of the proposed automatic ROI detection that is shown to have a close performance to human ROI, based on the eye fixation data.
Resumo:
This paper presents a reactive Sense and Avoid approach using spherical image-based visual servoing. Avoidance of point targets in the lateral or vertical plane is achieved without requiring an estimate of range. Simulated results for static and dynamic targets are provided using a realistic model of a small fixed wing unmanned aircraft.
Rotorcraft collision avoidance using spherical image-based visual servoing and single point features
Resumo:
This paper presents a reactive collision avoidance method for small unmanned rotorcraft using spherical image-based visual servoing. Only a single point feature is used to guide the aircraft in a safe spiral like trajectory around the target, whilst a spherical camera model ensures the target always remains visible. A decision strategy to stop the avoidance control is derived based on the properties of spiral like motion, and the effect of accurate range measurements on the control scheme is discussed. We show that using a poor range estimate does not significantly degrade the collision avoidance performance, thus relaxing the need for accurate range measurements. We present simulated and experimental results using a small quad rotor to validate the approach.
Resumo:
Treatment plans for conformal radiotherapy are based on an initial CT scan. The aim is to deliver the prescribed dose to the tumour, while minimising exposure to nearby organs. Recent advances make it possible to also obtain a Cone-Beam CT (CBCT) scan, once the patient has been positioned for treatment. A statistical model will be developed to compare these CBCT scans with the initial CT scan. Changes in the size, shape and position of the tumour and organs will be detected and quantified. Some progress has already been made in segmentation of prostate CBCT scans [1],[2],[3]. However, none of the existing approaches have taken full advantage of the prior information that is available. The planning CT scan is expertly annotated with contours of the tumour and nearby sensitive objects. This data is specific to the individual patient and can be viewed as a snapshot of spatial information at a point in time. There is an abundance of studies in the radiotherapy literature that describe the amount of variation in the relevant organs between treatments. The findings from these studies can form a basis for estimating the degree of uncertainty. All of this information can be incorporated as an informative prior into a Bayesian statistical model. This model will be developed using scans of CT phantoms, which are objects with known geometry. Thus, the accuracy of the model can be evaluated objectively. This will also enable comparison between alternative models.
Resumo:
The rank transform is one non-parametric transform which has been applied to the stereo matching problem The advantages of this transform include its invariance to radio metric distortion and its amenability to hardware implementation. This paper describes the derivation of the rank constraint for matching using the rank transform Previous work has shown that this constraint was capable of resolving ambiguous matches thereby improving match reliability A new matching algorithm incorporating this constraint was also proposed. This paper extends on this previous work by proposing a matching algorithm which uses a dimensional match surface in which the match score is computed for every possible template and match window combination. The principal advantage of this algorithm is that the use of the match surface enforces the left�right consistency and uniqueness constraints thus improving the algorithms ability to remove invalid matches Experimental results for a number of test stereo pairs show that the new algorithm is capable of identifying and removing a large number of in incorrect matches particularly in the case of occlusions
Resumo:
A fundamental problem faced by stereo vision algorithms is that of determining correspondences between two images which comprise a stereo pair. This paper presents work towards the development of a new matching algorithm, based on the rank transform. This algorithm makes use of both area-based and edge-based information, and is therefore referred to as a hybrid algorithm. In addition, this algorithm uses a number of matching constraints,including the novel rank constraint. Results obtained using a number of test pairs show that the matching algorithm is capable of removing a significant proportion of invalid matches. The accuracy of matching in the vicinity of edges is also improved.
Resumo:
A fundamental problem faced by stereo vision algorithms is that of determining correspondences between two images which comprise a stereo pair. This paper presents work towards the development of a new matching algorithm, based on the rank transform. This algorithm makes use of both area-based and edge-based information, and is therefore referred to as a hybrid algorithm. In addition, this algorithm uses a number of matching constraints, including the novel rank constraint. Results obtained using a number of test pairs show that the matching algorithm is capable of removing most invalid matches. The accuracy of matching in the vicinity of edges is also improved.
Resumo:
This paper outlines existing matching diagnostics, which may be used for identifying invalid matches and estimating the probability of a correct match. In addition, it proposes a new diagnostic for error prediction which can be used with the rank and census transforms. Both the existing and the new diagnostics have been evaluated and compared for a number of test images. In each case, a confidence estimate was computed for every location of the disparity map, and disparities having a low confidence estimate removed from the disparity map. Collectively, these confidence estimates may be termed a confidence map. Such information would be useful for potential applications of stereo vision such as automation and navigation.
Resumo:
The mining environment, being complex, irregular, and time-varying, presents a challenging prospect for stereo vision. For this application, speed, reliability, and the ability to produce a dense depth map are of foremost importance. This paper evaluates a number of matching techniques for possible use in a stereo vision sensor for mining automation applications. Area-based techniques have been investigated because they have the potential to yield dense maps, are amenable to fast hardware implementation, and are suited to textured scenes. In addition, two nonparametric transforms, namely, rank and census, have been investigated. Matching algorithms using these transforms were found to have a number of clear advantages, including reliability in the presence of radiometric distortion, low computational complexity, and amenability to hardware implementation.