938 resultados para computer vision, facial expression recognition, swig, red5, actionscript, ruby on rails, html5


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Surveillance systems such as object tracking and abandoned object detection systems typically rely on a single modality of colour video for their input. These systems work well in controlled conditions but often fail when low lighting, shadowing, smoke, dust or unstable backgrounds are present, or when the objects of interest are a similar colour to the background. Thermal images are not affected by lighting changes or shadowing, and are not overtly affected by smoke, dust or unstable backgrounds. However, thermal images lack colour information which makes distinguishing between different people or objects of interest within the same scene difficult. ----- By using modalities from both the visible and thermal infrared spectra, we are able to obtain more information from a scene and overcome the problems associated with using either modality individually. We evaluate four approaches for fusing visual and thermal images for use in a person tracking system (two early fusion methods, one mid fusion and one late fusion method), in order to determine the most appropriate method for fusing multiple modalities. We also evaluate two of these approaches for use in abandoned object detection, and propose an abandoned object detection routine that utilises multiple modalities. To aid in the tracking and fusion of the modalities we propose a modified condensation filter that can dynamically change the particle count and features used according to the needs of the system. ----- We compare tracking and abandoned object detection performance for the proposed fusion schemes and the visual and thermal domains on their own. Testing is conducted using the OTCBVS database to evaluate object tracking, and data captured in-house to evaluate the abandoned object detection. Our results show that significant improvement can be achieved, and that a middle fusion scheme is most effective.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents an object tracking system that utilises a hybrid multi-layer motion segmentation and optical flow algorithm. While many tracking systems seek to combine multiple modalities such as motion and depth or multiple inputs within a fusion system to improve tracking robustness, current systems have avoided the combination of motion and optical flow. This combination allows the use of multiple modes within the object detection stage. Consequently, different categories of objects, within motion or stationary, can be effectively detected utilising either optical flow, static foreground or active foreground information. The proposed system is evaluated using the ETISEO database and evaluation metrics and compared to a baseline system utilising a single mode foreground segmentation technique. Results demonstrate a significant improvement in tracking results can be made through the incorporation of the additional motion information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Automated crowd counting allows excessive crowding to be detected immediately, without the need for constant human surveillance. Current crowd counting systems are location specific, and for these systems to function properly they must be trained on a large amount of data specific to the target location. As such, configuring multiple systems to use is a tedious and time consuming exercise. We propose a scene invariant crowd counting system which can easily be deployed at a different location to where it was trained. This is achieved using a global scaling factor to relate crowd sizes from one scene to another. We demonstrate that a crowd counting system trained at one viewpoint can achieve a correct classification rate of 90% at a different viewpoint.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Performance evaluation of object tracking systems is typically performed after the data has been processed, by comparing tracking results to ground truth. Whilst this approach is fine when performing offline testing, it does not allow for real-time analysis of the systems performance, which may be of use for live systems to either automatically tune the system or report reliability. In this paper, we propose three metrics that can be used to dynamically asses the performance of an object tracking system. Outputs and results from various stages in the tracking system are used to obtain measures that indicate the performance of motion segmentation, object detection and object matching. The proposed dynamic metrics are shown to accurately indicate tracking errors when visually comparing metric results to tracking output, and are shown to display similar trends to the ETISEO metrics when comparing different tracking configurations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Object tracking systems require accurate segmentation of the objects from the background for effective tracking. Motion segmentation or optical flow can be used to segment incoming images. Whilst optical flow allows multiple moving targets to be separated based on their individual velocities, optical flow techniques are prone to errors caused by changing lighting and occlusions, both common in a surveillance environment. Motion segmentation techniques are more robust to fluctuating lighting and occlusions, but don't provide information on the direction of the motion. In this paper we propose a combined motion segmentation/optical flow algorithm for use in object tracking. The proposed algorithm uses the motion segmentation results to inform the optical flow calculations and ensure that optical flow is only calculated in regions of motion, and improve the performance of the optical flow around the edge of moving objects. Optical flow is calculated at pixel resolution and tracking of flow vectors is employed to improve performance and detect discontinuities, which can indicate the location of overlaps between objects. The algorithm is evaluated by attempting to extract a moving target within the flow images, given expected horizontal and vertical movement (i.e. the algorithms intended use for object tracking). Results show that the proposed algorithm outperforms other widely used optical flow techniques for this surveillance application.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Soft biometrics are characteristics that can be used to describe, but not uniquely identify an individual. These include traits such as height, weight, gender, hair, skin and clothing colour. Unlike traditional biometrics (i.e. face, voice) which require cooperation from the subject, soft biometrics can be acquired by surveillance cameras at range without any user cooperation. Whilst these traits cannot provide robust authentication, they can be used to provide coarse authentication or identification at long range, locate a subject who has been previously seen or who matches a description, as well as aid in object tracking. In this paper we propose three part (head, torso, legs) height and colour soft biometric models, and demonstrate their verification performance on a subset of the PETS 2006 database. We show that these models, whilst not as accurate as traditional biometrics, can still achieve acceptable rates of accuracy in situations where traditional biometrics cannot be applied.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In an automotive environment, the performance of a speech recognition system is affected by environmental noise if the speech signal is acquired directly from a microphone. Speech enhancement techniques are therefore necessary to improve the speech recognition performance. In this paper, a field-programmable gate array (FPGA) implementation of dual-microphone delay-and-sum beamforming (DASB) for speech enhancement is presented. As the first step towards a cost-effective solution, the implementation described in this paper uses a relatively high-end FPGA device to facilitate the verification of various design strategies and parameters. Experimental results show that the proposed design can produce output waveforms close to those generated by a theoretical (floating-point) model with modest usage of FPGA resources. Speech recognition experiments are also conducted on enhanced in-car speech waveforms produced by the FPGA in order to compare recognition performance with the floating-point representation running on a PC.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents an implementation of an aircraft pose and motion estimator using visual systems as the principal sensor for controlling an Unmanned Aerial Vehicle (UAV) or as a redundant system for an Inertial Measure Unit (IMU) and gyros sensors. First, we explore the applications of the unified theory for central catadioptric cameras for attitude and heading estimation, explaining how the skyline is projected on the catadioptric image and how it is segmented and used to calculate the UAV’s attitude. Then we use appearance images to obtain a visual compass, and we calculate the relative rotation and heading of the aerial vehicle. Additionally, we show the use of a stereo system to calculate the aircraft height and to measure the UAV’s motion. Finally, we present a visual tracking system based on Fuzzy controllers working in both a UAV and a camera pan and tilt platform. Every part is tested using the UAV COLIBRI platform to validate the different approaches, which include comparison of the estimated data with the inertial values measured onboard the helicopter platform and the validation of the tracking schemes on real flights.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Acquiring accurate silhouettes has many applications in computer vision. This is usually done through motion detection, or a simple background subtraction under highly controlled environments (i.e. chroma-key backgrounds). Lighting and contrast issues in typical outdoor or office environments make accurate segmentation very difficult in these scenes. In this paper, gradients are used in conjunction with intensity and colour to provide a robust segmentation of motion, after which graph cuts are utilised to refine the segmentation. The results presented using the ETISEO database demonstrate that an improved segmentation is achieved through the combined use of motion detection and graph cuts, particularly in complex scenes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a new method of using foreground silhouette images for human pose estimation. Labels are introduced to the silhouette images, providing an extra layer of information that can be used in the model fitting process. The pixels in the silhouettes are labelled according to the corresponding body part in the model of the current fit, with the labels propagated into the silhouette of the next frame to be used in the fitting for the next frame. Both single and multi-view implementations are detailed, with results showing performance improvements over only using standard unlabelled silhouettes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper discusses the use of models in automatic computer forensic analysis, and proposes and elaborates on a novel model for use in computer profiling, the computer profiling object model. The computer profiling object model is an information model which models a computer as objects with various attributes and inter-relationships. These together provide the information necessary for a human investigator or an automated reasoning engine to make judgements as to the probable usage and evidentiary value of a computer system. The computer profiling object model can be implemented so as to support automated analysis to provide an investigator with the information needed to decide whether manual analysis is required.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This document outlines the system submitted by the Speech and Audio Research Laboratory at the Queensland University of Technology (QUT) for the Speaker Identity Verication: Application task of EVALITA 2009. This submission consisted of a score-level fusion of three component systems, a joint-factor GMM system and two SVM systems using GLDS and GMM supervector kernels. Development and evaluation results are presented, demonstrating the effectiveness of this fused system approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Visual localization systems that are practical for autonomous vehicles in outdoor industrial applications must perform reliably in a wide range of conditions. Changing outdoor conditions cause difficulty by drastically altering the information available in the camera images. To confront the problem, we have developed a visual localization system that uses a surveyed three-dimensional (3D)-edge map of permanent structures in the environment. The map has the invariant properties necessary to achieve long-term robust operation. Previous 3D-edge map localization systems usually maintain a single pose hypothesis, making it difficult to initialize without an accurate prior pose estimate and also making them susceptible to misalignment with unmapped edges detected in the camera image. A multihypothesis particle filter is employed here to perform the initialization procedure with significant uncertainty in the vehicle's initial pose. A novel observation function for the particle filter is developed and evaluated against two existing functions. The new function is shown to further improve the abilities of the particle filter to converge given a very coarse estimate of the vehicle's initial pose. An intelligent exposure control algorithm is also developed that improves the quality of the pertinent information in the image. Results gathered over an entire sunny day and also during rainy weather illustrate that the localization system can operate in a wide range of outdoor conditions. The conclusion is that an invariant map, a robust multihypothesis localization algorithm, and an intelligent exposure control algorithm all combine to enable reliable visual localization through challenging outdoor conditions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Silhouettes are common features used by many applications in computer vision. For many of these algorithms to perform optimally, accurately segmenting the objects of interest from the background to extract the silhouettes is essential. Motion segmentation is a popular technique to segment moving objects from the background, however such algorithms can be prone to poor segmentation, particularly in noisy or low contrast conditions. In this paper, the work of [3] combining motion detection with graph cuts, is extended into two novel implementations that aim to allow greater uncertainty in the output of the motion segmentation, providing a less restricted input to the graph cut algorithm. The proposed algorithms are evaluated on a portion of the ETISEO dataset using hand segmented ground truth data, and an improvement in performance over the motion segmentation alone and the baseline system of [3] is shown.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Intelligent surveillance systems typically use a single visual spectrum modality for their input. These systems work well in controlled conditions, but often fail when lighting is poor, or environmental effects such as shadows, dust or smoke are present. Thermal spectrum imagery is not as susceptible to environmental effects, however thermal imaging sensors are more sensitive to noise and they are only gray scale, making distinguishing between objects difficult. Several approaches to combining the visual and thermal modalities have been proposed, however they are limited by assuming that both modalities are perfuming equally well. When one modality fails, existing approaches are unable to detect the drop in performance and disregard the under performing modality. In this paper, a novel middle fusion approach for combining visual and thermal spectrum images for object tracking is proposed. Motion and object detection is performed on each modality and the object detection results for each modality are fused base on the current performance of each modality. Modality performance is determined by comparing the number of objects tracked by the system with the number detected by each mode, with a small allowance made for objects entering and exiting the scene. The tracking performance of the proposed fusion scheme is compared with performance of the visual and thermal modes individually, and a baseline middle fusion scheme. Improvement in tracking performance using the proposed fusion approach is demonstrated. The proposed approach is also shown to be able to detect the failure of an individual modality and disregard its results, ensuring performance is not degraded in such situations.