971 results for Video tracking
Abstract:
Moving cameras are needed for a wide range of applications in robotics, vehicle systems, surveillance, etc. However, many foreground object segmentation methods reported in the literature are unsuitable for such settings; these methods assume that the camera is fixed and the background changes slowly, and are inadequate for segmenting objects in video if there is significant motion of the camera or background. To address this shortcoming, a new method for segmenting foreground objects is proposed that utilizes binocular video. The method is demonstrated in the application of tracking and segmenting people in video who are approximately facing the binocular camera rig. Given a stereo image pair, the system first tries to find faces. Starting at each face, the region containing the person is grown by merging regions from an over-segmented color image. The disparity map is used to guide this merging process. The system has been implemented on a consumer-grade PC, and tested on video sequences of people indoors obtained from a moving camera rig. As can be expected, the proposed method works well in situations where other foreground-background segmentation methods typically fail. We believe that this superior performance is partly due to the use of object detection to guide region merging in disparity/color foreground segmentation, and partly due to the use of disparity information available with a binocular rig, in contrast with most previous methods that assumed monocular sequences.
Abstract:
Particle filtering is a popular method used in systems for tracking human body pose in video. One key difficulty in using particle filtering is caused by the curse of dimensionality: generally, a very large number of particles is required to adequately approximate the underlying pose distribution in a high-dimensional state space. Although the number of degrees of freedom in the human body is quite large, in reality the subset of allowable configurations in state space is generally restricted by human biomechanics, and the trajectories in this allowable subspace tend to be smooth. Therefore, a framework is proposed to learn a low-dimensional representation of the high-dimensional human pose state space. This mapping can be learned using a Gaussian Process Latent Variable Model (GPLVM) framework. One important advantage of the GPLVM framework is that both the mapping to and the mapping from the embedded space are smooth; this facilitates sampling in the low-dimensional space, and samples generated in the low-dimensional embedded space are easily mapped back into the original high-dimensional space. Moreover, human body poses that are similar in the original space tend to be mapped close to each other in the embedded space; this property can be exploited when sampling in the embedded space. The proposed framework is tested in tracking 2D human body pose using a Scaled Prismatic Model. Experiments on real-life video sequences demonstrate the strength of the approach. In comparison with Multiple Hypothesis Tracking and the standard Condensation algorithm, the proposed algorithm is able to maintain tracking reliably throughout the long test sequences. It also handles singularities and self-occlusion robustly.
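To make the sampling step concrete, the following is a minimal sketch of one resample-predict-weight cycle of a Condensation-style (bootstrap) particle filter operating in a low-dimensional space. It does not implement the GPLVM mapping or the Scaled Prismatic Model; the function name `condensation_step`, the random-walk motion model, the Gaussian likelihood, and the noise parameters are all assumptions chosen for illustration.

```python
import numpy as np

def condensation_step(particles, weights, observation, rng,
                      process_std=0.1, obs_std=0.2):
    """One cycle of a bootstrap particle filter (illustrative assumptions only).

    particles:   (n, d) array of states in the low-dimensional space
    weights:     (n,) normalized weights from the previous cycle
    observation: (d,) observed state (a stand-in for an image measurement)
    """
    n = len(particles)
    # Resample: draw particle indices in proportion to the previous weights.
    idx = rng.choice(n, size=n, p=weights)
    # Predict: diffuse the survivors with a random-walk motion model (assumed).
    predicted = particles[idx] + rng.normal(0.0, process_std, size=particles.shape)
    # Weight: Gaussian likelihood of the observation given each particle (assumed).
    log_w = -0.5 * np.sum((predicted - observation) ** 2, axis=1) / obs_std ** 2
    log_w -= log_w.max()  # subtract the maximum for numerical stability
    new_weights = np.exp(log_w)
    new_weights /= new_weights.sum()
    return predicted, new_weights
```

The state estimate at each frame can be taken as the weighted mean `weights @ particles`. The number of particles needed for a given accuracy grows quickly with the state dimension `d`, which is the motivation for sampling in a learned low-dimensional space.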
Abstract:
In professional sports there are, in general, three steps required to improve performance, namely task definition, training, and performance assessment. This process is repeated iteratively, and feedback generated from quantitative performance measurement is in turn used for task redefinition. Task definition can be achieved in a number of ways, including via video streaming or, as is more common, by listening to coaching staff. However, non-subjective performance evaluation is difficult due to the complexity of the movements involved. When considering the subset of sports where precision, accuracy, and repeatability are a necessity, this problem becomes inherently more difficult to solve. Until recently, sports such as martial arts, fencing, and darts, where the smallest deviation from a prescribed movement goal can result in a large outcome error, were deemed too difficult to characterise fully. Advances in technology, as illustrated by this study, now make this type of physiometry possible.
Abstract:
A novel, fast automatic motion segmentation approach is presented. It differs from conventional pixel or edge based motion segmentation approaches in that the proposed method uses labelled regions (facets) to segment various video objects from the background. Facets are clustered into objects based on their motion and proximity details using Bayesian logic. Because the number of facets is usually much lower than the number of edges and points, using facets can greatly reduce the computational complexity of motion segmentation. The proposed method can tackle efficiently the complexity of video object motion tracking, and offers potential for real-time content-based video annotation.
Abstract:
We present a multimodal detection and tracking algorithm for sensors composed of a camera mounted between two microphones. Target localization is performed on color-based change detection in the video modality and on time difference of arrival (TDOA) estimation between the two microphones in the audio modality. The TDOA is computed by multiband generalized cross correlation (GCC) analysis. The estimated directions of arrival are then postprocessed using a Riccati Kalman filter. The visual and audio estimates are finally integrated, at the likelihood level, into a particle filter (PF) that uses a zero-order motion model, and a weighted probabilistic data association (WPDA) scheme. We demonstrate that the Kalman filtering (KF) improves the accuracy of the audio source localization and that the WPDA helps to enhance the tracking performance of sensor fusion in reverberant scenarios. The combination of multiband GCC, KF, and WPDA within the particle filtering framework improves the performance of the algorithm in noisy scenarios. We also show how the proposed audiovisual tracker summarizes the observed scene by generating metadata that can be transmitted to other network nodes instead of transmitting the raw images and can be used for very low bit rate communication. Moreover, the generated metadata can also be used to detect and monitor events of interest.
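As an illustration of TDOA estimation by generalized cross correlation, the sketch below uses the common single-band GCC-PHAT variant. It is not the multiband GCC analysis described in the abstract, and the function name `gcc_phat_tdoa` and its parameters are assumptions made for this example.

```python
import numpy as np

def gcc_phat_tdoa(sig, ref, fs):
    """Estimate the delay of `sig` relative to `ref`, in seconds, with GCC-PHAT.

    A positive result means `sig` arrives later than `ref`.
    """
    n = len(sig) + len(ref)            # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)         # cross-power spectrum
    cross /= np.abs(cross) + 1e-12     # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)      # generalized cross correlation
    max_shift = n // 2
    # Reorder so that index max_shift corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs
```

For two microphones a distance `d` apart and a speed of sound `c`, a far-field direction of arrival can then be approximated as `arcsin(c * tdoa / d)`, with the argument clipped to [-1, 1].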
Abstract:
We introduce an application for the detection of aberrant behaviour within home-based environments, with a focus on repetitive actions, which may be present in persons suffering from dementia. Video-based analysis has been used to detect the motion of a person within a given scene, in addition to tracking them over time. Detection of repetitive actions has been based on the analysis of a person's trajectory using the principles of signal correlation. Along with the ability to detect repetitive motion, the developed approach can also measure the amount of activity/inactivity within the scene during a given period of time. Our results showed that the developed approach was able to detect all patterns in the data set examined, with an average accuracy of 96.67%. This work has therefore validated the proposed concept of video-based analysis for the detection of repetitive activities.
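One way to apply signal correlation to a trajectory is to look for a strong peak in the autocorrelation of a position coordinate. The sketch below is only an assumed illustration of that idea, not the authors' method: the function name `dominant_period` and the rule of searching after the first negative autocorrelation value are choices made here.

```python
import numpy as np

def dominant_period(signal):
    """Return (lag, strength) of the strongest repetition in a 1-D signal.

    `strength` is the normalized autocorrelation at `lag` (1.0 at lag 0).
    Returns (None, 0.0) when no repetition can be identified.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    # Autocorrelation for non-negative lags.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    if ac[0] == 0.0:                   # constant signal: no motion at all
        return None, 0.0
    ac = ac / ac[0]
    # Skip the main lobe around lag 0 by starting at the first negative value.
    negative = np.flatnonzero(ac < 0)
    if negative.size == 0:
        return None, 0.0
    start = int(negative[0])
    lag = start + int(np.argmax(ac[start:]))
    return lag, float(ac[lag])
```

For example, the x-coordinate of a person pacing back and forth every 20 frames would yield a lag near 20 with a strength close to 1, whereas an irregular trajectory would yield a low strength. A threshold on the strength (to be tuned on data) would then flag a repetitive action.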
Abstract:
http://bjo.bmj.com/content/suppl/2001/06/20/85.7.DC1 Leukocyte-endothelial cell interactions play an important role in the pathogenesis of various types of retinal vascular diseases, including diabetes, uveitis, and ischemic lesions. Over the last few years, several methods have been devised in which the scanning laser ophthalmoscope (SLO) is used to study leukocyte-endothelial interactions in vivo [1,2]. Previously we reported a noninvasive in vivo leukocyte tracking method using the SLO in the rat. In this method, a nontoxic fluorescent agent (6-carboxyfluorescein diacetate, CFDA) was used to label leukocytes in vitro. Leukocyte velocities within the retinal and choroidal circulations were quantified simultaneously [3]. None of the previous methods has been developed for imaging the murine fundus, mainly due to problems arising from the small size of the mouse eye. However, there are many advantages to using a murine model to study retinal vascular diseases, such as enhanced genetic definition, an increased range of reagents available for immunological studies, and cost reduction. We have developed our SLO method such that we can track leukocytes in the mouse retinal and choroidal circulations.
Abstract:
In this paper, we introduce an efficient method for particle selection in tracking objects in complex scenes. Firstly, we improve the proposal distribution function of the tracking algorithm by including the current observation, which reduces the cost of evaluating particles with a very low likelihood. In addition, we use a partitioned sampling approach to decompose the dynamic state into several stages. This makes it possible to deal with high-dimensional states without excessive computational cost. To represent the color distribution, the appearance of the tracked object is modelled by sampled pixels. Based on this representation, the probability of any observation is estimated using non-parametric techniques in color space. As a result, we obtain a Probability color Density Image (PDI) in which each pixel indicates its membership of the target color model. In this way, the evaluation of all particles is accelerated by computing the likelihood p(z|x) using the Integral Image of the PDI.
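The speed-up from an integral image comes from reducing the sum over any rectangle to four table lookups. The sketch below shows this under the assumption that a particle is an axis-aligned rectangle and that its score is the sum (or mean) of PDI values inside it; the abstract does not specify the exact form of p(z|x), and the function names are hypothetical.

```python
import numpy as np

def integral_image(pdi):
    """Summed-area table with a zero first row and column.

    ii[r, c] holds the sum of pdi[:r, :c].
    """
    ii = np.zeros((pdi.shape[0] + 1, pdi.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = pdi.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of the PDI inside a rectangle, using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def particle_score(ii, top, left, height, width):
    """Mean PDI value inside the particle's rectangle (assumed score)."""
    return box_sum(ii, top, left, height, width) / (height * width)
```

The table is built once per frame, after which every particle is scored in constant time regardless of the size of its rectangle.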
Abstract:
We address the problem of multi-target tracking in realistic crowded conditions by introducing a novel dual-stage online tracking algorithm. The problem of data association between tracks and detections, based on appearance, is often complicated by partial occlusion. In the first stage, we address the issue of occlusion with a novel method of robust data association that can be used to compute the appearance similarity between tracks and detections without the need for explicit knowledge of the occluded regions. In the second stage, broken tracks are linked based on motion and appearance, using an online-learned linking model. The online-learned motion model for track linking uses the confident tracks from the first-stage tracker as training examples. The new approach has been tested on the town centre dataset and has performance comparable with the present state of the art.
Abstract:
Object tracking is an active research area due to its importance in human-computer interfaces, teleconferencing, and video surveillance. However, reliable tracking of objects in the presence of occlusions and of pose and illumination changes is still a challenging topic. In this paper, we introduce a novel tracking approach that fuses two cues, namely colour and spatio-temporal motion energy, within a particle-filter-based framework. We compute a measure of coherent motion over two image frames, which reveals the spatio-temporal dynamics of the target. At the same time, the importance of both the colour and motion energy cues is determined in the reliability evaluation stage. This determination helps maintain the performance of the tracking system against abrupt appearance changes. Experimental results demonstrate that the proposed method outperforms other state-of-the-art techniques on the test datasets used.
Abstract:
In this paper, we propose a novel visual tracking framework based on a decision-theoretic online learning algorithm, namely NormalHedge. To make NormalHedge more robust against noise, we propose an adaptive NormalHedge algorithm, which exploits the historic information of each expert to perform more accurate prediction than the standard NormalHedge. Technically, we use a set of weighted experts to predict the state of the target to be tracked over time. The weight of each expert is learned online by pushing the cumulative regret of the learner towards that of the expert. Our simulation experiments demonstrate the effectiveness of the proposed adaptive NormalHedge compared to the standard NormalHedge method. Furthermore, the experimental results on several challenging video sequences show that the proposed tracking method outperforms several state-of-the-art methods.
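For orientation, the following is a minimal sketch of the weight update in the standard NormalHedge algorithm, as commonly described: each expert's weight depends on the positive part of its cumulative regret, with a scale `c` chosen so that the average of exp(R_+^2 / (2c)) over experts equals e. It does not include the adaptive extension proposed in the abstract, and the function names and the bisection search for `c` are choices made for this example.

```python
import numpy as np

def normalhedge_weights(regrets, iters=100):
    """Expert weights from cumulative regrets (standard NormalHedge)."""
    r_pos = np.maximum(np.asarray(regrets, dtype=float), 0.0)
    n = r_pos.size
    r_max_sq = r_pos.max() ** 2
    if r_max_sq == 0.0:
        # No expert is ahead of the learner: fall back to uniform weights.
        return np.full(n, 1.0 / n)

    def potential(c):
        return np.mean(np.exp(r_pos ** 2 / (2.0 * c)))

    # Bracket the solution of potential(c) = e, then bisect.
    lo = r_max_sq / (2.0 * (1.0 + np.log(n)))   # potential(lo) >= e
    hi = r_max_sq / 2.0                          # potential(hi) <= e
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if potential(mid) > np.e:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    w = (r_pos / c) * np.exp(r_pos ** 2 / (2.0 * c))
    return w / w.sum()

def normalhedge_round(regrets, losses):
    """Play one round: weight the experts, then update cumulative regrets."""
    p = normalhedge_weights(regrets)
    learner_loss = float(p @ losses)
    return regrets + (learner_loss - losses), p
```

In a tracking setting, each expert would correspond to a candidate target state and its loss to a measure of appearance mismatch; that mapping is an assumption here and is not taken from the abstract.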
Abstract:
Data registration refers to a series of techniques for matching or bringing similar objects or datasets together into alignment. These techniques enjoy widespread use in a diverse variety of applications, such as video coding, tracking, object and face detection and recognition, surveillance and satellite imaging, medical image analysis and structure from motion. Registration methods are as numerous as their manifold uses, from pixel level and block or feature based methods to Fourier domain methods.
This book is focused on providing algorithms and image and video techniques for registration and quality performance metrics. The authors provide various assessment metrics for measuring registration quality alongside analyses of registration techniques, introducing and explaining both familiar and state-of-the-art registration methodologies used in a variety of targeted applications.
Key features:
- Provides a state-of-the-art review of image and video registration techniques, allowing readers to develop an understanding of how well the techniques perform by using specific quality assessment criteria
- Addresses a range of applications from familiar image and video processing domains to satellite and medical imaging among others, enabling readers to discover novel methodologies with utility in their own research
- Discusses quality evaluation metrics for each application domain with an interdisciplinary approach from different research perspectives
Abstract:
Sparse-representation-based visual tracking approaches have attracted increasing interest in the community in recent years. The main idea is to linearly represent each target candidate using a set of target and trivial templates while imposing a sparsity constraint on the representation coefficients. After the coefficients are obtained using L1-norm minimization methods, the candidate with the lowest error, when reconstructed using only the target templates and the associated coefficients, is taken as the tracking result. In spite of the promising system performance widely reported, it is unclear whether the performance of these trackers can be maximised. In addition, the computational complexity caused by the dimensionality of the feature space limits the use of these algorithms in real-time applications. In this paper, we propose a real-time visual tracking method based on structurally random projection and weighted least squares techniques. In particular, to enhance the discriminative capability of the tracker, we introduce background templates into the linear representation framework. To handle appearance variations over time, we relax the sparsity constraint using a weighted least squares (WLS) method to obtain the representation coefficients. To further reduce the computational complexity, structurally random projection is used to reduce the dimensionality of the feature space while preserving the pairwise distances between the data points in the feature space. Experimental results show that the proposed approach outperforms several state-of-the-art tracking methods.
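The two ingredients named above can be sketched as follows. Note the substitutions: a plain Gaussian random projection stands in for the structurally random projection of the abstract, and the per-feature weights passed to the least squares solver are left to the caller because the abstract does not say how they are chosen. Function names and shapes are assumptions for illustration.

```python
import numpy as np

def gaussian_projection(d_in, d_out, rng):
    """Random matrix that approximately preserves pairwise distances."""
    return rng.standard_normal((d_out, d_in)) / np.sqrt(d_out)

def weighted_least_squares(templates, candidate, weights):
    """Solve min_c sum_i weights[i] * (candidate[i] - (templates @ c)[i])**2.

    templates: (d, k) matrix whose columns are target/background templates
    candidate: (d,) feature vector of a target candidate
    weights:   (d,) non-negative per-feature weights
    """
    sw = np.sqrt(weights)
    # Scaling rows by sqrt(weights) turns WLS into ordinary least squares.
    coeffs, *_ = np.linalg.lstsq(templates * sw[:, None], candidate * sw,
                                 rcond=None)
    return coeffs

def reconstruction_error(templates, candidate, coeffs):
    """Squared residual of the candidate under the given coefficients."""
    return float(np.sum((candidate - templates @ coeffs) ** 2))
```

A tracker built on these pieces would project the templates and every candidate once with the same matrix, solve the small WLS problem per candidate, and keep the candidate with the lowest reconstruction error on the target templates.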