226 resultados para visual object detection

em Queensland University of Technology - ePrints Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents visual detection and classification of light vehicles and personnel on a mine site.We capitalise on the rapid advances of ConvNet based object recognition but highlight that a naive black box approach results in a significant number of false positives. In particular, the lack of domain specific training data and the unique landscape in a mine site causes a high rate of errors. We exploit the abundance of background-only images to train a k-means classifier to complement the ConvNet. Furthermore, localisation of objects of interest and a reduction in computation is enabled through region proposals. Our system is tested on over 10km of real mine site data and we were able to detect both light vehicles and personnel. We show that the introduction of our background model can reduce the false positive rate by an order of magnitude.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose a method for learning specific object representations that can be applied (and reused) in visual detection and identification tasks. A machine learning technique called Cartesian Genetic Programming (CGP) is used to create these models based on a series of images. Our research investigates how manipulation actions might allow for the development of better visual models and therefore better robot vision. This paper describes how visual object representations can be learned and improved by performing object manipulation actions, such as, poke, push and pick-up with a humanoid robot. The improvement can be measured and allows for the robot to select and perform the `right' action, i.e. the action with the best possible improvement of the detector.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abandoned object detection (AOD) systems are required to run in high traffic situations, with high levels of occlusion. Systems rely on background segmentation techniques to locate abandoned objects, by detecting areas of motion that have stopped. This is often achieved by using a medium term motion detection routine to detect long term changes in the background. When AOD systems are integrated into person tracking system, this often results in two separate motion detectors being used to handle the different requirements. We propose a motion detection system that is capable of detecting medium term motion as well as regular motion. Multiple layers of medium term (static) motion can be detected and segmented. We demonstrate the performance of this motion detection system and as part of an abandoned object detection system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The ability to automate forced landings in an emergency such as engine failure is an essential ability to improve the safety of Unmanned Aerial Vehicles operating in General Aviation airspace. By using active vision to detect safe landing zones below the aircraft, the reliability and safety of such systems is vastly improved by gathering up-to-the-minute information about the ground environment. This paper presents the Site Detection System, a methodology utilising a downward facing camera to analyse the ground environment in both 2D and 3D, detect safe landing sites and characterise them according to size, shape, slope and nearby obstacles. A methodology is presented showing the fusion of landing site detection from 2D imagery with a coarse Digital Elevation Map and dense 3D reconstructions using INS-aided Structure-from-Motion to improve accuracy. Results are presented from an experimental flight showing the precision/recall of landing sites in comparison to a hand-classified ground truth, and improved performance with the integration of 3D analysis from visual Structure-from-Motion.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes a novel obstacle detection system for autonomous robots in agricultural field environments that uses a novelty detector to inform stereo matching. Stereo vision alone erroneously detects obstacles in environments with ambiguous appearance and ground plane such as in broad-acre crop fields with harvested crop residue. The novelty detector estimates the probability density in image descriptor space and incorporates image-space positional understanding to identify potential regions for obstacle detection using dense stereo matching. The results demonstrate that the system is able to detect obstacles typical to a farm at day and night. This system was successfully used as the sole means of obstacle detection for an autonomous robot performing a long term two hour coverage task travelling 8.5 km.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a novel approach for multi-object detection in aerial videos based on tracking. The proposed method mainly involves three steps. Firstly, the spatial-temporal saliency is employed to detect moving objects. Secondly, the detected objects are tracked by mean shift in the subsequent frames. Finally, the saliency results are fused with the weight map generated by tracking to get refined detection results, and in turn the modified detection results are used to update the tracking models. The proposed algorithm is evaluated on VIVID aerial videos, and the results show that our approach can reliably detect moving objects even in challenging situations. Meanwhile, the proposed method can process videos in real time, without the effect of time delay.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, the problem of moving object detection in aerial video is addressed. While motion cues have been extensively exploited in the literature, how to use spatial information is still an open problem. To deal with this issue, we propose a novel hierarchical moving target detection method based on spatiotemporal saliency. Temporal saliency is used to get a coarse segmentation, and spatial saliency is extracted to obtain the object’s appearance details in candidate motion regions. Finally, by combining temporal and spatial saliency information, we can get refined detection results. Additionally, in order to give a full description of the object distribution, spatial saliency is detected in both pixel and region levels based on local contrast. Experiments conducted on the VIVID dataset show that the proposed method is efficient and accurate.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper investigates how neuronal activation for naming photographs of objects is influenced by the addition of appropriate colour or sound. Behaviourally, both colour and sound are known to facilitate object recognition from visual form. However, previous functional imaging studies have shown inconsistent effects. For example, the addition of appropriate colour has been shown to reduce antero-medial temporal activation whereas the addition of sound has been shown to increase posterior superior temporal activation. Here we compared the effect of adding colour or sound cues in the same experiment. We found that the addition of either the appropriate colour or sound increased activation for naming photographs of objects in bilateral occipital regions and the right anterior fusiform. Moreover, the addition of colour reduced left antero-medial temporal activation but this effect was not observed for the addition of object sound. We propose that activation in bilateral occipital and right fusiform areas precedes the integration of visual form with either its colour or associated sound. In contrast, left antero-medial temporal activation is reduced because object recognition is facilitated after colour and form have been integrated.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Object detection is a fundamental task in many computer vision applications, therefore the importance of evaluating the quality of object detection is well acknowledged in this domain. This process gives insight into the capabilities of methods in handling environmental changes. In this paper, a new method for object detection is introduced that combines the Selective Search and EdgeBoxes. We tested these three methods under environmental variations. Our experiments demonstrate the outperformance of the combination method under illumination and view point variations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

There is an increased interest in the use of Unmanned Aerial Vehicles for load transportation from environmental remote sensing to construction and parcel delivery. One of the main challenges is accurate control of the load position and trajectory. This paper presents an assessment of real flight trials for the control of an autonomous multi-rotor with a suspended slung load using only visual feedback to determine the load position. This method uses an onboard camera to take advantage of a common visual marker detection algorithm to robustly detect the load location. The load position is calculated using an onboard processor, and transmitted over a wireless network to a ground station integrating MATLAB/SIMULINK and Robotic Operating System (ROS) and a Model Predictive Controller (MPC) to control both the load and the UAV. To evaluate the system performance, the position of the load determined by the visual detection system in real flight is compared with data received by a motion tracking system. The multi-rotor position tracking performance is also analyzed by conducting flight trials using perfect load position data and data obtained only from the visual system. Results show very accurate estimation of the load position (~5% Offset) using only the visual system and demonstrate that the need for an external motion tracking system is not needed for this task.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Stationary processes are random variables whose value is a signal and whose distribution is invariant to translation in the domain of the signal. They are intimately connected to convolution, and therefore to the Fourier transform, since the covariance matrix of a stationary process is a Toeplitz matrix, and Toeplitz matrices are the expression of convolution as a linear operator. This thesis utilises this connection in the study of i) efficient training algorithms for object detection and ii) trajectory-based non-rigid structure-from-motion.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The vision sense of standalone robots is limited by line of sight and onboard camera capabilities, but processing video from remote cameras puts a high computational burden on robots. This paper describes the Distributed Robotic Vision Service, DRVS, which implements an on-demand distributed visual object detection service. Robots specify visual information requirements in terms of regions of interest and object detection algorithms. DRVS dynamically distributes the object detection computation to remote vision systems with processing capabilities, and the robots receive high-level object detection information. DRVS relieves robots of managing sensor discovery and reduces data transmission compared to image sharing models of distributed vision. Navigating a sensorless robot from remote vision systems is demonstrated in simulation as a proof of concept.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Robotic vision is limited by line of sight and onboard camera capabilities. Robots can acquire video or images from remote cameras, but processing additional data has a computational burden. This paper applies the Distributed Robotic Vision Service, DRVS, to robot path planning using data outside line-of-sight of the robot. DRVS implements a distributed visual object detection service to distributes the computation to remote camera nodes with processing capabilities. Robots request task-specific object detection from DRVS by specifying a geographic region of interest and object type. The remote camera nodes perform the visual processing and send the high-level object information to the robot. Additionally, DRVS relieves robots of sensor discovery by dynamically distributing object detection requests to remote camera nodes. Tested over two different indoor path planning tasks DRVS showed dramatic reduction in mobile robot compute load and wireless network utilization.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Intelligent surveillance systems typically use a single visual spectrum modality for their input. These systems work well in controlled conditions, but often fail when lighting is poor, or environmental effects such as shadows, dust or smoke are present. Thermal spectrum imagery is not as susceptible to environmental effects, however thermal imaging sensors are more sensitive to noise and they are only gray scale, making distinguishing between objects difficult. Several approaches to combining the visual and thermal modalities have been proposed, however they are limited by assuming that both modalities are perfuming equally well. When one modality fails, existing approaches are unable to detect the drop in performance and disregard the under performing modality. In this paper, a novel middle fusion approach for combining visual and thermal spectrum images for object tracking is proposed. Motion and object detection is performed on each modality and the object detection results for each modality are fused base on the current performance of each modality. Modality performance is determined by comparing the number of objects tracked by the system with the number detected by each mode, with a small allowance made for objects entering and exiting the scene. The tracking performance of the proposed fusion scheme is compared with performance of the visual and thermal modes individually, and a baseline middle fusion scheme. Improvement in tracking performance using the proposed fusion approach is demonstrated. The proposed approach is also shown to be able to detect the failure of an individual modality and disregard its results, ensuring performance is not degraded in such situations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Visual activity detection of lip movements can be used to overcome the poor performance of voice activity detection based solely in the audio domain, particularly in noisy acoustic conditions. However, most of the research conducted in visual voice activity detection (VVAD) has neglected addressing variabilities in the visual domain such as viewpoint variation. In this paper we investigate the effectiveness of the visual information from the speaker’s frontal and profile views (i.e left and right side views) for the task of VVAD. As far as we are aware, our work constitutes the first real attempt to study this problem. We describe our visual front end approach and the Gaussian mixture model (GMM) based VVAD framework, and report the experimental results using the freely available CUAVE database. The experimental results show that VVAD is indeed possible from profile views and we give a quantitative comparison of VVAD based on frontal and profile views The results presented are useful in the development of multi-modal Human Machine Interaction (HMI) using a single camera, where the speaker’s face may not always be frontal.