8 results for occlusions
in the Cambridge University Engineering Department Publications Database
Abstract:
In this paper, we describe a video tracking application using the dual-tree polar matching algorithm. The models are specified in a probabilistic setting, and a particle filter is used to perform the sequential inference. Computer simulations demonstrate the ability of the algorithm to track a moving target in simulated video of an urban environment with complete and partial occlusions. © The Institution of Engineering and Technology.
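To illustrate the sequential inference step, here is a minimal bootstrap particle filter sketch in 1D. The Gaussian observation model below is an assumption for illustration; in the paper the likelihood would come from the dual-tree polar matching score, and the state would be a 2D image position rather than a scalar.

```python
import math
import random

random.seed(0)

def particle_filter_step(particles, weights, observation,
                         obs_noise=1.0, proc_noise=1.0):
    """One predict-update-resample cycle of a bootstrap particle filter.

    Generic 1D sketch of sequential Monte Carlo inference; the Gaussian
    likelihood here is a stand-in for a real image-matching score.
    """
    # Predict: propagate each particle through a random-walk motion model.
    particles = [p + random.gauss(0.0, proc_noise) for p in particles]
    # Update: reweight by the Gaussian likelihood of the observation.
    weights = [w * math.exp(-0.5 * ((observation - p) / obs_noise) ** 2)
               for p, w in zip(particles, weights)]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample: draw a new particle set proportionally to the weights.
    particles = random.choices(particles, weights=weights, k=len(particles))
    weights = [1.0 / len(particles)] * len(particles)
    return particles, weights

# Track a target moving one unit per frame for 20 frames.
n = 500
particles = [random.gauss(0.0, 2.0) for _ in range(n)]
weights = [1.0 / n] * n
for t in range(1, 21):
    particles, weights = particle_filter_step(particles, weights,
                                              observation=float(t))
estimate = sum(p * w for p, w in zip(particles, weights))
print(round(estimate, 1))
```

With a random-walk motion model the estimate lags a moving target slightly, which is one reason real trackers add a velocity component to the state.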
Abstract:
We propose a system that can reliably track multiple cars in congested traffic environments. The key basis of our system is a sequential Monte Carlo algorithm, which introduces robustness against problems arising from the proximity between vehicles. By directly modelling occlusions and collisions between cars we obtain promising results on an urban traffic dataset. Extensions to this initial framework are also suggested. © 2010 IEEE.
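A toy sketch of what "directly modelling collisions" can mean in a sequential Monte Carlo tracker: each joint particle hypothesises the positions of both cars, and physically implausible (overlapping) configurations are down-weighted before normalisation. The 1D state, parameter names and penalty value are all illustrative assumptions, not the paper's formulation.

```python
def reweight_with_collision_model(particles, weights,
                                  min_gap=2.0, penalty=1e-3):
    """Down-weight joint hypotheses in which two tracked cars overlap.

    Each particle is a pair of 1D car positions; hypotheses closer than
    min_gap are penalised because two cars cannot occupy the same space.
    All parameters here are illustrative, not the paper's values.
    """
    new_weights = []
    for (car_a, car_b), w in zip(particles, weights):
        if abs(car_a - car_b) < min_gap:
            w *= penalty  # implausible overlap: heavily penalise
        new_weights.append(w)
    total = sum(new_weights)
    return [w / total for w in new_weights]

# Two hypotheses: well-separated cars vs. an implausible overlap.
particles = [(0.0, 5.0), (0.0, 0.5)]
weights = reweight_with_collision_model(particles, [0.5, 0.5])
print(weights[0] > weights[1])
```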
Abstract:
On-site tracking in open construction sites is often difficult because of the large number of items that are present and need to be tracked. In addition, the many occlusions and obstructions present create a highly complex tracking environment. Existing tracking methods are based mainly on radio-frequency technologies, including Global Positioning Systems (GPS), Radio Frequency Identification (RFID), Bluetooth, Wireless Fidelity (Wi-Fi), Ultra-Wideband, etc. These methods require considerable pre-processing time, since tags must be manually deployed and a record kept of the items they are placed on. In construction sites with numerous entities, tag installation, maintenance and decommissioning become an issue, since they increase the cost and time needed to implement these tracking methods. This paper presents a novel method for open-site tracking with construction cameras based on machine vision. According to this method, video feed is collected from on-site video cameras, and the user selects the entity to track. The entity is tracked in each video using 2D vision tracking. Epipolar geometry is then used to calculate the depth of the marked area and thus provide the 3D location of the entity. This method addresses the limitations of radio-frequency methods by being unobtrusive and using inexpensive, easy-to-deploy equipment. The method has been implemented in a C++ prototype, and preliminary results indicate its effectiveness.
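The depth-from-two-views step can be sketched with the simplest special case of epipolar geometry: a rectified stereo pair, where depth follows from horizontal disparity as Z = f·B / (x_left − x_right). This is a simplification of the paper's setup, where arbitrary construction-camera pairs would first need rectification; the focal length and baseline below are assumed values.

```python
def stereo_depth(x_left, x_right, focal_px, baseline_m):
    """Depth of a point from its disparity in a rectified stereo pair.

    Z = f * B / (x_left - x_right), with f in pixels and B in metres.
    A simplified stand-in for the general epipolar-geometry step; real
    camera pairs must be rectified before this formula applies.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("point must have positive disparity")
    return focal_px * baseline_m / disparity

# Tracked entity at column 640 in the left view and 600 in the right,
# with a 1000 px focal length and 0.5 m baseline (assumed values).
z = stereo_depth(640.0, 600.0, focal_px=1000.0, baseline_m=0.5)
print(z)  # 12.5 (metres)
```

Note the inverse relation: halving the disparity doubles the estimated depth, which is why depth estimates for distant entities are the least precise.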
Abstract:
The lack of viable methods to map and label existing infrastructure is one of the engineering grand challenges for the 21st century. For instance, over two thirds of the effort needed to geometrically model even simple infrastructure is spent on manually converting a cloud of points to a 3D model. The result is that few facilities today have a complete record of as-built information and that as-built models are not produced for the vast majority of new construction and retrofit projects. This leads to rework and design changes that can cost up to 10% of the installed costs. Automatically detecting building components could address this challenge. However, existing methods for detecting building components are not view- and scale-invariant, or have only been validated in restricted scenarios that require a priori knowledge without considering occlusions. This constrains their applicability in complex civil infrastructure scenes. In this paper, we test a pose-invariant method of labeling existing infrastructure. This method simultaneously detects objects and estimates their poses. It takes advantage of a recent novel formulation for object detection and customizes it to generic civil infrastructure scenes. Our preliminary experiments demonstrate that this method achieves convincing recognition results.
Abstract:
Optical motion capture systems suffer from marker occlusions, resulting in loss of useful information. This paper addresses the problem of real-time joint localisation of legged skeletons in the presence of such missing data. The data is assumed to be labelled 3D marker positions from a motion capture system. An integrated framework is presented which predicts the occluded marker positions using a Variable Turn Model within an Unscented Kalman filter. Inferred information from neighbouring markers is used as observation states; these constraints are efficient, simple, and implementable in real time. This work also takes advantage of the common case that missing markers are still visible to a single camera, by combining predictions with under-determined positions, resulting in more accurate predictions. An Inverse Kinematics technique is then applied to ensure that the bone lengths remain constant over time; the system can thereby maintain a continuous data-flow. The marker and Centre of Rotation (CoR) positions can be calculated with high accuracy even in cases where markers are occluded for a long period of time. Our methodology is tested against some of the most popular methods for marker prediction, and the results confirm that our approach outperforms these methods in estimating both marker and CoR positions. © 2012 Springer-Verlag.
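The constant-bone-length constraint can be sketched in isolation: a predicted (possibly drifted) marker position is projected back onto the sphere of fixed bone length around its parent joint. This is only the geometric correction step, under assumed names; the full system also fuses UKF predictions with single-camera observations, which is omitted here.

```python
import math

def constrain_bone_length(joint, predicted_marker, bone_length):
    """Project a predicted marker onto the constant-bone-length sphere.

    Rescales the joint-to-marker vector so its length equals the known
    bone length, preserving its direction. A minimal sketch of the
    constraint the paper enforces via Inverse Kinematics.
    """
    offset = [p - j for p, j in zip(predicted_marker, joint)]
    dist = math.sqrt(sum(d * d for d in offset))
    if dist == 0:
        raise ValueError("predicted marker coincides with the joint")
    scale = bone_length / dist
    return [j + d * scale for j, d in zip(joint, offset)]

# A drifted prediction 5.0 units from the joint; true bone length 4.0.
corrected = constrain_bone_length([0.0, 0.0, 0.0], [3.0, 4.0, 0.0], 4.0)
print(corrected)  # direction kept, length rescaled to 4.0
```

Keeping the direction and correcting only the length is what lets a long occlusion degrade gracefully: the prediction may drift, but the skeleton never stretches.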
Abstract:
This paper is about detecting bipedal motion in video sequences by using point trajectories in a classification framework. Given a number of point trajectories, we find a subset of points arising from feet in bipedal motion by analysing their spatio-temporal correlation in a pairwise fashion. To this end, we introduce probabilistic trajectories as our new features, which associate each point over a sufficiently long time period in the presence of noise. They are extracted from directed acyclic graphs whose edges represent temporal point correspondences and are weighted with their matching probability in terms of appearance and location. The benefit of the new representation is that in practice it tolerates inherent ambiguity, for example due to occlusions. We then learn the correlation between the motion of two feet using the probabilistic trajectories in a decision forest classifier. The effectiveness of the algorithm is demonstrated in experiments on image sequences captured with a static camera, and extensions to deal with a moving camera are discussed. © 2013 Elsevier B.V. All rights reserved.
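Extracting a trajectory from a layered DAG of candidate correspondences can be sketched as Viterbi-style dynamic programming: each layer holds candidate detections for one frame, edges carry matching probabilities, and the path maximising the product of edge probabilities is traced back. The distance-based `match_prob` below is a toy assumption; the paper weights edges by both appearance and location.

```python
import math

def most_probable_trajectory(layers, match_prob):
    """Most probable point trajectory through a layered DAG.

    layers[t] holds candidate point positions at frame t;
    match_prob(a, b) scores a temporal correspondence. Dynamic
    programming maximises the sum of log edge probabilities.
    A sketch of the idea, not the paper's exact formulation.
    """
    score = [0.0] * len(layers[0])   # best log-prob ending at each candidate
    back = [[None] * len(layer) for layer in layers]
    for t in range(1, len(layers)):
        new_score = []
        for j, cand in enumerate(layers[t]):
            s, i = max((score[i] + math.log(match_prob(prev, cand)), i)
                       for i, prev in enumerate(layers[t - 1]))
            new_score.append(s)
            back[t][j] = i           # remember the best predecessor
        score = new_score
    # Trace back from the best final candidate.
    j = max(range(len(score)), key=lambda k: score[k])
    path = [j]
    for t in range(len(layers) - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return [layers[t][i] for t, i in enumerate(path)]

# Two nearby tracks; the DP keeps the temporally consistent one.
layers = [[0.0], [0.1, 5.0], [0.2, 5.1]]
traj = most_probable_trajectory(layers, lambda a, b: math.exp(-abs(a - b)))
print(traj)  # [0.0, 0.1, 0.2]
```

Working in log-probabilities keeps long trajectories numerically stable, since products of many small edge probabilities would underflow.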
Abstract:
Temporal synchronization of multiple video recordings of the same dynamic event is a critical task in many computer vision applications, e.g. novel view synthesis and 3D reconstruction. Typically this information comes from the time-stamps embedded in the video streams. User-generated videos shot with consumer-grade equipment do not contain this information; hence, there is a need to temporally synchronize signals using the visual information itself. Previous work in this area has either assumed good-quality data with relatively simple dynamic content or the availability of precise camera geometry. Our first contribution is a synchronization technique which establishes correspondence between feature trajectories across views in a novel way, and specifically targets the kind of complex content found in consumer-generated sports recordings, without assuming precise knowledge of fundamental matrices or homographies. We evaluate performance on a number of real video recordings and show that our method is able to synchronize to within 1 second, which is significantly better than previous approaches. Our second contribution is a robust and unsupervised view-invariant activity recognition descriptor that exploits recurrence plot theory on spatial tiles. The descriptor alone is shown to characterize activities across views under occlusions better than state-of-the-art approaches. We combine this descriptor with our proposed synchronization method and show that it can further refine the synchronization index. © 2013 ACM.
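The recurrence plot at the core of the descriptor is simple to state: a binary matrix marking which pairs of time samples of a signal are close. Below is a minimal 1D version; the paper applies the idea to spatial tiles of video frames rather than a scalar series, and the threshold `eps` is an illustrative parameter.

```python
def recurrence_plot(signal, eps):
    """Binary recurrence matrix: R[i][j] = 1 iff |x_i - x_j| <= eps.

    Minimal 1D sketch of recurrence plot theory; periodic behaviour
    shows up as diagonal lines parallel to the main diagonal.
    """
    n = len(signal)
    return [[1 if abs(signal[i] - signal[j]) <= eps else 0
             for j in range(n)]
            for i in range(n)]

# A period-2 signal recurs two steps apart but not one step apart.
rp = recurrence_plot([0.0, 1.0, 0.0, 1.0], eps=0.1)
print(rp[0][2], rp[0][1])  # 1 0
```

Because the matrix depends only on pairwise similarity of the signal with itself, statistics computed from it are insensitive to when the activity starts, which is what makes it attractive for view-invariant recognition.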