913 resultados para Computer vision industry


Relevância:

80.00% 80.00%

Publicador:

Resumo:

The chief challenge facing persistent robotic navigation using vision sensors is the recognition of previously visited locations under different lighting and illumination conditions. The majority of successful approaches to outdoor robot navigation use active sensors such as LIDAR, but the associated weight and power draw of these systems makes them unsuitable for widespread deployment on mobile robots. In this paper we investigate methods to combine representations for visible and long-wave infrared (LWIR) thermal images with time information to combat the time-of-day-based limitations of each sensing modality. We calculate appearance-based match likelihoods using the state-of-the-art FAB-MAP [1] algorithm to analyse loop closure detection reliability across different times of day. We present preliminary results on a dataset of 10 successive traverses of a combined urban-parkland environment, recorded in 2-hour intervals from before dawn to after dusk. Improved location recognition throughout an entire day is demonstrated using the combined system compared with methods which use visible or thermal sensing alone.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper presents an approach for the automatic calibration of low-cost cameras which are assumed to be restricted in their freedom of movement to either pan or tilt movements. Camera parameters, including focal length, principal point, lens distortion parameter and the angle and axis of rotation, can be recovered from a minimum set of two images of the camera, provided that the axis of rotation between the two images goes through the camera’s optical center and is parallel to either the vertical (panning) or horizontal (tilting) axis of the image. Previous methods for auto-calibration of cameras based on pure rotations fail to work in these two degenerate cases. In addition, our approach includes a modified RANdom SAmple Consensus (RANSAC) algorithm, as well as improved integration of the radial distortion coefficient in the computation of inter-image homographies. We show that these modifications are able to increase the overall efficiency, reliability and accuracy of the homography computation and calibration procedure using both synthetic and real image sequences

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Audio-visualspeechrecognition, or the combination of visual lip-reading with traditional acoustic speechrecognition, has been previously shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that present in an automotive cabin. The research presented in this paper will extend upon the established audio-visualspeechrecognition literature to show that further improvements in speechrecognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visualspeechrecognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) are conducted on the four-camera AVICAR automotiveaudio-visualspeech database. We study the relative contribution between the side and central orientated cameras in improving visualspeechrecognition accuracy. Finally combination of the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy when compared to the acoustic-only approach in the noisiest conditions of the AVICAR database.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

From a law enforcement standpoint, the ability to search for a person matching a semantic description (i.e. 1.8m tall, red shirt, jeans) is highly desirable. While a significant research effort has focused on person re-detection (the task of identifying a previously observed individual in surveillance video), these techniques require descriptors to be built from existing image or video observations. As such, person re-detection techniques are not suited to situations where footage of the person of interest is not readily available, such as a witness reporting a recent crime. In this paper, we present a novel framework that is able to search for a person based on a semantic description. The proposed approach uses size and colour cues, and does not require a person detection routine to locate people in the scene, improving utility in crowded conditions. The proposed approach is demonstrated with a new database that will be made available to the research community, and we show that the proposed technique is able to correctly localise a person in a video based on a simple semantic description.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Thermal-infrared imagery is relatively robust to many of the failure conditions of visual and laser-based SLAM systems, such as fog, dust and smoke. The ability to use thermal-infrared video for localization is therefore highly appealing for many applications. However, operating in thermal-infrared is beyond the capacity of existing SLAM implementations. This paper presents the first known monocular SLAM system designed and tested for hand-held use in the thermal-infrared modality. The implementation includes a flexible feature detection layer able to achieve robust feature tracking in high-noise, low-texture thermal images. A novel approach for structure initialization is also presented. The system is robust to irregular motion and capable of handling the unique mechanical shutter interruptions common to thermal-infrared cameras. The evaluation demonstrates promising performance of the algorithm in several environments.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper we present a fast power line detection and localisation algorithm as well as propose a high-level guidance architecture for active vision-based Unmanned Aerial Vehicle (UAV) guidance. The detection stage is based on steerable filters for edge ridge detection, followed by a line fitting algorithm to refine candidate power lines in images. The guidance architecture assumes an UAV with an onboard Gimbal camera. We first control the position of the Gimbal such that the power line is in the field of view of the camera. Then its pose is used to generate the appropriate control commands such that the aircraft moves and flies above the lines. We present initial experimental results for the detection stage which shows that the proposed algorithm outperforms two state-of-the-art line detection algorithms for power line detection from aerial imagery.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The increasing popularity of video consumption from mobile devices requires an effective video coding strategy. To overcome diverse communication networks, video services often need to maintain sustainable quality when the available bandwidth is limited. One of the strategy for a visually-optimised video adaptation is by implementing a region-of-interest (ROI) based scalability, whereby important regions can be encoded at a higher quality while maintaining sufficient quality for the rest of the frame. The result is an improved perceived quality at the same bit rate as normal encoding, which is particularly obvious at the range of lower bit rate. However, because of the difficulties of predicting region-of-interest (ROI) accurately, there is a limited research and development of ROI-based video coding for general videos. In this paper, the phase spectrum quaternion of Fourier Transform (PQFT) method is adopted to determine the ROI. To improve the results of ROI detection, the saliency map from the PQFT is augmented with maps created from high level knowledge of factors that are known to attract human attention. Hence, maps that locate faces and emphasise the centre of the screen are used in combination with the saliency map to determine the ROI. The contribution of this paper lies on the automatic ROI detection technique for coding a low bit rate videos which include the ROI prioritisation technique to give different level of encoding qualities for multiple ROIs, and the evaluation of the proposed automatic ROI detection that is shown to have a close performance to human ROI, based on the eye fixation data.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper presents a reactive Sense and Avoid approach using spherical image-based visual servoing. Avoidance of point targets in the lateral or vertical plane is achieved without requiring an estimate of range. Simulated results for static and dynamic targets are provided using a realistic model of a small fixed wing unmanned aircraft.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Quality based frame selection is a crucial task in video face recognition, to both improve the recognition rate and to reduce the computational cost. In this paper we present a framework that uses a variety of cues (face symmetry, sharpness, contrast, closeness of mouth, brightness and openness of the eye) to select the highest quality facial images available in a video sequence for recognition. Normalized feature scores are fused using a neural network and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face recognition system. Experiments on the Honda/UCSD database shows that the proposed method selects the best quality face images in the video sequence, resulting in improved recognition performance.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper presents a reactive collision avoidance method for small unmanned rotorcraft using spherical image-based visual servoing. Only a single point feature is used to guide the aircraft in a safe spiral like trajectory around the target, whilst a spherical camera model ensures the target always remains visible. A decision strategy to stop the avoidance control is derived based on the properties of spiral like motion, and the effect of accurate range measurements on the control scheme is discussed. We show that using a poor range estimate does not significantly degrade the collision avoidance performance, thus relaxing the need for accurate range measurements. We present simulated and experimental results using a small quad rotor to validate the approach.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Treatment plans for conformal radiotherapy are based on an initial CT scan. The aim is to deliver the prescribed dose to the tumour, while minimising exposure to nearby organs. Recent advances make it possible to also obtain a Cone-Beam CT (CBCT) scan, once the patient has been positioned for treatment. A statistical model will be developed to compare these CBCT scans with the initial CT scan. Changes in the size, shape and position of the tumour and organs will be detected and quantified. Some progress has already been made in segmentation of prostate CBCT scans [1],[2],[3]. However, none of the existing approaches have taken full advantage of the prior information that is available. The planning CT scan is expertly annotated with contours of the tumour and nearby sensitive objects. This data is specific to the individual patient and can be viewed as a snapshot of spatial information at a point in time. There is an abundance of studies in the radiotherapy literature that describe the amount of variation in the relevant organs between treatments. The findings from these studies can form a basis for estimating the degree of uncertainty. All of this information can be incorporated as an informative prior into a Bayesian statistical model. This model will be developed using scans of CT phantoms, which are objects with known geometry. Thus, the accuracy of the model can be evaluated objectively. This will also enable comparison between alternative models.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The rank transform is one non-parametric transform which has been applied to the stereo matching problem The advantages of this transform include its invariance to radio metric distortion and its amenability to hardware implementation. This paper describes the derivation of the rank constraint for matching using the rank transform Previous work has shown that this constraint was capable of resolving ambiguous matches thereby improving match reliability A new matching algorithm incorporating this constraint was also proposed. This paper extends on this previous work by proposing a matching algorithm which uses a dimensional match surface in which the match score is computed for every possible template and match window combination. The principal advantage of this algorithm is that the use of the match surface enforces the left�right consistency and uniqueness constraints thus improving the algorithms ability to remove invalid matches Experimental results for a number of test stereo pairs show that the new algorithm is capable of identifying and removing a large number of in incorrect matches particularly in the case of occlusions

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A fundamental problem faced by stereo vision algorithms is that of determining correspondences between two images which comprise a stereo pair. This paper presents work towards the development of a new matching algorithm, based on the rank transform. This algorithm makes use of both area-based and edge-based information, and is therefore referred to as a hybrid algorithm. In addition, this algorithm uses a number of matching constraints,including the novel rank constraint. Results obtained using a number of test pairs show that the matching algorithm is capable of removing a significant proportion of invalid matches. The accuracy of matching in the vicinity of edges is also improved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A fundamental problem faced by stereo vision algorithms is that of determining correspondences between two images which comprise a stereo pair. This paper presents work towards the development of a new matching algorithm, based on the rank transform. This algorithm makes use of both area-based and edge-based information, and is therefore referred to as a hybrid algorithm. In addition, this algorithm uses a number of matching constraints, including the novel rank constraint. Results obtained using a number of test pairs show that the matching algorithm is capable of removing most invalid matches. The accuracy of matching in the vicinity of edges is also improved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The rank transform is a non-parametric technique which has been recently proposed for the stereo matching problem. The motivation behind its application to the matching problem is its invariance to certain types of image distortion and noise, as well as its amenability to real-time implementation. This paper derives an analytic expression for the process of matching using the rank transform, and then goes on to derive one constraint which must be satisfied for a correct match. This has been dubbed the rank order constraint or simply the rank constraint. Experimental work has shown that this constraint is capable of resolving ambiguous matches, thereby improving matching reliability. This constraint was incorporated into a new algorithm for matching using the rank transform. This modified algorithm resulted in an increased proportion of correct matches, for all test imagery used.