500 resultados para Camera networks

em Queensland University of Technology - ePrints Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

We describe the design and evaluation of a platform for networks of cameras in low-bandwidth, low-power sensor networks. In our work to date we have investigated two different DSP hardware/software platforms for undertaking the tasks of compression and object detection and tracking. We compare the relative merits of each of the hardware and software platforms in terms of both performance and energy consumption. Finally we discuss what we believe are the ongoing research questions for image processing in WSNs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We describe a novel two stage approach to object localization and tracking using a network of wireless cameras and a mobile robot. In the first stage, a robot travels through the camera network while updating its position in a global coordinate frame which it broadcasts to the cameras. The cameras use this information, along with image plane location of the robot, to compute a mapping from their image planes to the global coordinate frame. This is combined with an occupancy map generated by the robot during the mapping process to track the objects. We present results with a nine node indoor camera network to demonstrate that this approach is feasible and offers acceptable level of accuracy in terms of object locations.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The selection of optimal camera configurations (camera locations, orientations etc.) for multi-camera networks remains an unsolved problem. Previous approaches largely focus on proposing various objective functions to achieve different tasks. Most of them, however, do not generalize well to large scale networks. To tackle this, we introduce a statistical formulation of the optimal selection of camera configurations as well as propose a Trans-Dimensional Simulated Annealing (TDSA) algorithm to effectively solve the problem. We compare our approach with a state-of-the-art method based on Binary Integer Programming (BIP) and show that our approach offers similar performance on small scale problems. However, we also demonstrate the capability of our approach in dealing with large scale problems and show that our approach produces better results than 2 alternative heuristics designed to deal with the scalability issue of BIP.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Camera calibration information is required in order for multiple camera networks to deliver more than the sum of many single camera systems. Methods exist for manually calibrating cameras with high accuracy. Manually calibrating networks with many cameras is, however, time consuming, expensive and impractical for networks that undergo frequent change. For this reason, automatic calibration techniques have been vigorously researched in recent years. Fully automatic calibration methods depend on the ability to automatically find point correspondences between overlapping views. In typical camera networks, cameras are placed far apart to maximise coverage. This is referred to as a wide base-line scenario. Finding sufficient correspondences for camera calibration in wide base-line scenarios presents a significant challenge. This thesis focuses on developing more effective and efficient techniques for finding correspondences in uncalibrated, wide baseline, multiple-camera scenarios. The project consists of two major areas of work. The first is the development of more effective and efficient view covariant local feature extractors. The second area involves finding methods to extract scene information using the information contained in a limited set of matched affine features. Several novel affine adaptation techniques for salient features have been developed. A method is presented for efficiently computing the discrete scale space primal sketch of local image features. A scale selection method was implemented that makes use of the primal sketch. The primal sketch-based scale selection method has several advantages over the existing methods. It allows greater freedom in how the scale space is sampled, enables more accurate scale selection, is more effective at combining different functions for spatial position and scale selection, and leads to greater computational efficiency. Existing affine adaptation methods make use of the second moment matrix to estimate the local affine shape of local image features. In this thesis, it is shown that the Hessian matrix can be used in a similar way to estimate local feature shape. The Hessian matrix is effective for estimating the shape of blob-like structures, but is less effective for corner structures. It is simpler to compute than the second moment matrix, leading to a significant reduction in computational cost. A wide baseline dense correspondence extraction system, called WiDense, is presented in this thesis. It allows the extraction of large numbers of additional accurate correspondences, given only a few initial putative correspondences. It consists of the following algorithms: An affine region alignment algorithm that ensures accurate alignment between matched features; A method for extracting more matches in the vicinity of a matched pair of affine features, using the alignment information contained in the match; An algorithm for extracting large numbers of highly accurate point correspondences from an aligned pair of feature regions. Experiments show that the correspondences generated by the WiDense system improves the success rate of computing the epipolar geometry of very widely separated views. This new method is successful in many cases where the features produced by the best wide baseline matching algorithms are insufficient for computing the scene geometry.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

We consider the problem of object tracking in a wireless multimedia sensor network (we mainly focus on the camera component in this work). The vast majority of current object tracking techniques, either centralised or distributed, assume unlimited energy, meaning these techniques don't translate well when applied within the constraints of low-power distributed systems. In this paper we develop and analyse a highly-scalable, distributed strategy to object tracking in wireless camera networks with limited resources. In the proposed system, cameras transmit descriptions of objects to a subset of neighbours, determined using a predictive forwarding strategy. The received descriptions are then matched at the next camera on the objects path using a probability maximisation process with locally generated descriptions. We show, via simulation, that our predictive forwarding and probabilistic matching strategy can significantly reduce the number of object-misses, ID-switches and ID-losses; it can also reduce the number of required transmissions over a simple broadcast scenario by up to 67%. We show that our system performs well under realistic assumptions about matching objects appearance using colour.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Automated crowd counting has become an active field of computer vision research in recent years. Existing approaches are scene-specific, as they are designed to operate in the single camera viewpoint that was used to train the system. Real world camera networks often span multiple viewpoints within a facility, including many regions of overlap. This paper proposes a novel scene invariant crowd counting algorithm that is designed to operate across multiple cameras. The approach uses camera calibration to normalise features between viewpoints and to compensate for regions of overlap. This compensation is performed by constructing an 'overlap map' which provides a measure of how much an object at one location is visible within other viewpoints. An investigation into the suitability of various feature types and regression models for scene invariant crowd counting is also conducted. The features investigated include object size, shape, edges and keypoints. The regression models evaluated include neural networks, K-nearest neighbours, linear and Gaussian process regresion. Our experiments demonstrate that accurate crowd counting was achieved across seven benchmark datasets, with optimal performance observed when all features were used and when Gaussian process regression was used. The combination of scene invariance and multi camera crowd counting is evaluated by training the system on footage obtained from the QUT camera network and testing it on three cameras from the PETS 2009 database. Highly accurate crowd counting was observed with a mean relative error of less than 10%. Our approach enables a pre-trained system to be deployed on a new environment without any additional training, bringing the field one step closer toward a 'plug and play' system.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The selection of optimal camera configurations (camera locations, orientations, etc.) for multi-camera networks remains an unsolved problem. Previous approaches largely focus on proposing various objective functions to achieve different tasks. Most of them, however, do not generalize well to large scale networks. To tackle this, we propose a statistical framework of the problem as well as propose a trans-dimensional simulated annealing algorithm to effectively deal with it. We compare our approach with a state-of-the-art method based on binary integer programming (BIP) and show that our approach offers similar performance on small scale problems. However, we also demonstrate the capability of our approach in dealing with large scale problems and show that our approach produces better results than two alternative heuristics designed to deal with the scalability issue of BIP. Last, we show the versatility of our approach using a number of specific scenarios.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The relationship between multiple cameras viewing the same scene may be discovered automatically by finding corresponding points in the two views and then solving for the camera geometry. In camera networks with sparsely placed cameras, low resolution cameras or in scenes with few distinguishable features it may be difficult to find a sufficient number of reliable correspondences from which to compute geometry. This paper presents a method for extracting a larger number of correspondences from an initial set of putative correspondences without any knowledge of the scene or camera geometry. The method may be used to increase the number of correspondences and make geometry computations possible in cases where existing methods have produced insufficient correspondences.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper proposes a semi-supervised intelligent visual surveillance system to exploit the information from multi-camera networks for the monitoring of people and vehicles. Modules are proposed to perform critical surveillance tasks including: the management and calibration of cameras within a multi-camera network; tracking of objects across multiple views; recognition of people utilising biometrics and in particular soft-biometrics; the monitoring of crowds; and activity recognition. Recent advances in these computer vision modules and capability gaps in surveillance technology are also highlighted.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In a commercial environment, it is advantageous to know how long it takes customers to move between different regions, how long they spend in each region, and where they are likely to go as they move from one location to another. Presently, these measures can only be determined manually, or through the use of hardware tags (i.e. RFID). Soft biometrics are characteristics that can be used to describe, but not uniquely identify an individual. They include traits such as height, weight, gender, hair, skin and clothing colour. Unlike traditional biometrics, soft biometrics can be acquired by surveillance cameras at range without any user cooperation. While these traits cannot provide robust authentication, they can be used to provide identification at long range, and aid in object tracking and detection in disjoint camera networks. In this chapter we propose using colour, height and luggage soft biometrics to determine operational statistics relating to how people move through a space. A novel average soft biometric is used to locate people who look distinct, and these people are then detected at various locations within a disjoint camera network to gradually obtain operational statistics

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents an approach for the automatic calibration of low-cost cameras which are assumed to be restricted in their freedom of movement to either pan or tilt movements. Camera parameters, including focal length, principal point, lens distortion parameter and the angle and axis of rotation, can be recovered from a minimum set of two images of the camera, provided that the axis of rotation between the two images goes through the camera’s optical center and is parallel to either the vertical (panning) or horizontal (tilting) axis of the image. Previous methods for auto-calibration of cameras based on pure rotations fail to work in these two degenerate cases. In addition, our approach includes a modified RANdom SAmple Consensus (RANSAC) algorithm, as well as improved integration of the radial distortion coefficient in the computation of inter-image homographies. We show that these modifications are able to increase the overall efficiency, reliability and accuracy of the homography computation and calibration procedure using both synthetic and real image sequences

Relevância:

60.00% 60.00%

Publicador:

Resumo:

After first observing a person, the task of person re-identification involves recognising an individual at different locations across a network of cameras at a later time. Traditionally, this task has been performed by first extracting appearance features of an individual and then matching these features to the previous observation. However, identifying an individual based solely on appearance can be ambiguous, particularly when people wear similar clothing (i.e. people dressed in uniforms in sporting and school settings). This task is made more difficult when the resolution of the input image is small as is typically the case in multi-camera networks. To circumvent these issues, we need to use other contextual cues. In this paper, we use "group" information as our contextual feature to aid in the re-identification of a person, which is heavily motivated by the fact that people generally move together as a collective group. To encode group context, we learn a linear mapping function to assign each person to a "role" or position within the group structure. We then combine the appearance and group context cues using a weighted summation. We demonstrate how this improves performance of person re-identification in a sports environment over appearance based-features.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Disjoint top-view networked cameras are among the most commonly utilized networks in many applications. One of the open questions for these cameras' study is the computation of extrinsic parameters (positions and orientations), named extrinsic calibration or localization of cameras. Current approaches either rely on strict assumptions of the object motion for accurate results or fail to provide results of high accuracy without the requirement of the object motion. To address these shortcomings, we present a location-constrained maximum a posteriori (LMAP) approach by applying known locations in the surveillance area, some of which would be passed by the object opportunistically. The LMAP approach formulates the problem as a joint inference of the extrinsic parameters and object trajectory based on the cameras' observations and the known locations. In addition, a new task-oriented evaluation metric, named MABR (the Maximum value of All image points' Back-projected localization errors' L2 norms Relative to the area of field of view), is presented to assess the quality of the calibration results in an indoor object tracking context. Finally, results herein demonstrate the superior performance of the proposed method over the state-of-the-art algorithm based on the presented MABR and classical evaluation metric in simulations and real experiments.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The location of previously unseen and unregistered individuals in complex camera networks from semantic descriptions is a time consuming and often inaccurate process carried out by human operators, or security staff on the ground. To promote the development and evaluation of automated semantic description based localisation systems, we present a new, publicly available, unconstrained 110 sequence database, collected from 6 stationary cameras. Each sequence contains detailed semantic information for a single search subject who appears in the clip (gender, age, height, build, hair and skin colour, clothing type, texture and colour), and between 21 and 290 frames for each clip are annotated with the target subject location (over 11,000 frames are annotated in total). A novel approach for localising a person given a semantic query is also proposed and demonstrated on this database. The proposed approach incorporates clothing colour and type (for clothing worn below the waist), as well as height and build to detect people. A method to assess the quality of candidate regions, as well as a symmetry driven approach to aid in modelling clothing on the lower half of the body, is proposed within this approach. An evaluation on the proposed dataset shows that a relative improvement in localisation accuracy of up to 21 is achieved over the baseline technique.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Novel computer vision techniques have been developed to automatically detect unusual events in crowded scenes from video feeds of surveillance cameras. The research is useful in the design of the next generation intelligent video surveillance systems. Two major contributions are the construction of a novel machine learning model for multiple instance learning through compressive sensing, and the design of novel feature descriptors in the compressed video domain.