118 results for Vision-based row tracking algorithm

in Cambridge University Engineering Department Publications Database


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This chapter presents a vision-based system for touch-free interaction with a display at a distance. A single camera is fixed on top of the screen and points towards the user. An attention mechanism allows the user to start the interaction and control a screen pointer by moving their hand in a fist pose directed at the camera. On-screen items can be chosen by a selection mechanism. Current sample applications include browsing video collections as well as viewing a gallery of 3D objects, which the user can rotate with their hand motion. We have included an up-to-date review of hand tracking methods, and comment on the merits and shortcomings of previous approaches. The proposed tracker uses multiple cues (appearance, color, and motion) for robustness. As the space of possible observation models is generally too large for exhaustive online search, we select models that are suitable for the particular tracking task at hand. During a training stage, various off-the-shelf trackers are evaluated. From this data, different methods of fusing them online are investigated, including parallel and cascaded tracker evaluation. For the case of fist tracking, combining a small number of observers in a cascade results in an efficient algorithm that is used in our gesture interface. The system has been on public display at conferences where over a hundred users have engaged with it. © 2010 Springer-Verlag Berlin Heidelberg.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a novel coarse-to-fine global localization approach inspired by object recognition and text retrieval techniques. Harris-Laplace interest points characterized by scale-invariant feature transform (SIFT) descriptors are used as natural landmarks. They are indexed into two databases: a location vector space model (LVSM) and a location database. The localization process consists of two stages: coarse localization and fine localization. Coarse localization from the LVSM is fast but not accurate enough, whereas localization from the location database using a voting algorithm is relatively slow but more accurate. The integration of coarse and fine stages makes fast and reliable localization possible. If necessary, the localization result can be verified by epipolar geometry between the representative view in the database and the view to be localized. In addition, the localization system recovers the position of the camera by essential matrix decomposition. The localization system has been tested in indoor and outdoor environments. The results show that our approach is efficient and reliable. © 2006 IEEE.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a novel coarse-to-fine global localization approach that is inspired by object recognition and text retrieval techniques. Harris-Laplace interest points characterized by SIFT descriptors are used as natural landmarks. These descriptors are indexed into two databases: an inverted index and a location database. The inverted index is built from a visual vocabulary learned from the feature descriptors. In the location database, each location is directly represented by a set of scale-invariant descriptors. The localization process consists of two stages: coarse localization and fine localization. Coarse localization from the inverted index is fast but not accurate enough, whereas localization from the location database using a voting algorithm is relatively slow but more accurate. The combination of coarse and fine stages makes fast and reliable localization possible. In addition, if necessary, the localization result can be verified by epipolar geometry between the representative view in the database and the view to be localized. Experimental results show that our approach is efficient and reliable. © 2005 IEEE.
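
The two-stage idea in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: visual words are plain integers, descriptors are 2-D tuples rather than SIFT vectors, and all function names, location names, and thresholds are hypothetical. The coarse stage votes through an inverted index to shortlist locations; the fine stage matches raw descriptors against the shortlisted locations only and lets each match vote.

```python
from collections import Counter, defaultdict

def build_inverted_index(location_words):
    """Map each visual word to the set of locations that contain it."""
    index = defaultdict(set)
    for loc, words in location_words.items():
        for w in words:
            index[w].add(loc)
    return index

def coarse_localize(index, query_words, top_k=2):
    """Fast shortlist: one vote per shared visual word."""
    votes = Counter()
    for w in set(query_words):
        for loc in index.get(w, ()):
            votes[loc] += 1
    return [loc for loc, _ in votes.most_common(top_k)]

def fine_localize(location_descriptors, query_descriptors, candidates, max_dist=0.5):
    """Slower check: each query descriptor votes for its nearest candidate."""
    votes = Counter()
    for q in query_descriptors:
        best_loc, best_d = None, max_dist
        for loc in candidates:
            for d in location_descriptors[loc]:
                dist = sum((a - b) ** 2 for a, b in zip(q, d)) ** 0.5
                if dist < best_d:
                    best_loc, best_d = loc, dist
        if best_loc is not None:
            votes[best_loc] += 1
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical toy data.
location_words = {"lab": [1, 2, 3, 4], "corridor": [3, 4, 5, 6], "office": [7, 8, 9]}
location_descriptors = {
    "lab": [(0.0, 0.0), (1.0, 0.0)],
    "corridor": [(5.0, 5.0), (6.0, 5.0)],
    "office": [(9.0, 9.0)],
}
index = build_inverted_index(location_words)
candidates = coarse_localize(index, [2, 3, 4, 5])
best = fine_localize(location_descriptors,
                     [(0.1, 0.0), (0.9, 0.1), (5.1, 5.0)], candidates)
print(candidates, best)
```

In this toy data the coarse stage cannot separate "lab" from "corridor" (both share three words with the query), which is the situation where the slower descriptor-level vote earns its cost.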

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We have developed a novel human facial tracking system that operates in real time at video frame rate without needing any special hardware. The approach is based on the use of Lie algebra, and uses three-dimensional feature points on the targeted human face. It is assumed that a roughly estimated facial model (relative coordinates of the three-dimensional feature points) is known. First, the initial feature positions of the face are determined using a model fitting technique. Then, tracking proceeds in the following sequence: (1) capture the new video frame and render the feature points onto the image plane; (2) search for new positions of the feature points on the image plane; (3) obtain the Euclidean matrix from the motion vectors and the three-dimensional information for the points; and (4) rotate and translate the feature points using the Euclidean matrix, and render the new points on the image plane. The key algorithm of this tracker estimates the Euclidean matrix using a least-squares technique based on Lie algebra. The resulting tracker performed very well on the task of tracking a human face.
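
Steps (3) and (4) can be sketched as follows, assuming that the "Euclidean matrix" is a 4x4 rigid-body transform and that the camera is an ideal pinhole with focal length `f` (both are assumptions, not stated in the abstract). The sketch linearizes the image motion of each 3-D point with respect to a twist (v, w) in se(3), solves for the twist by least squares, and applies its exponential. All function names are hypothetical.

```python
import numpy as np

def skew(w):
    """Cross-product matrix: skew(w) @ p == np.cross(w, p)."""
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0.0]])

def se3_exp(xi):
    """Exponential map from a twist (v, w) to a 4x4 rigid transform."""
    v, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    K = skew(w)
    if theta < 1e-9:                      # first-order limit for tiny rotations
        R, V = np.eye(3) + K, np.eye(3) + 0.5 * K
    else:
        a = np.sin(theta) / theta
        b = (1 - np.cos(theta)) / theta**2
        c = (theta - np.sin(theta)) / theta**3
        R = np.eye(3) + a * K + b * K @ K
        V = np.eye(3) + b * K + c * K @ K
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, V @ v
    return T

def project(P, f):
    """Pinhole projection of Nx3 points to Nx2 image coordinates."""
    return f * P[:, :2] / P[:, 2:3]

def estimate_twist(P, uv_observed, f):
    """One linear least-squares step for the twist that moves P onto uv_observed."""
    rows, resid = [], []
    for p, uv in zip(P, uv_observed):
        X, Y, Z = p
        J_proj = (f / Z) * np.array([[1, 0, -X / Z], [0, 1, -Y / Z]])
        J_motion = np.hstack([np.eye(3), -skew(p)])   # dp = v + w x p
        rows.append(J_proj @ J_motion)
        resid.append(uv - f * p[:2] / Z)
    xi, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(resid), rcond=None)
    return xi

def track(P_model, uv_observed, f, iterations=5):
    """Repeat: move the model, re-project, re-solve for the remaining twist."""
    T = np.eye(4)
    for _ in range(iterations):
        P = P_model @ T[:3, :3].T + T[:3, 3]
        T = se3_exp(estimate_twist(P, uv_observed, f)) @ T
    return T

# Synthetic check with a known small motion (hypothetical values).
rng = np.random.default_rng(1)
P_model = rng.uniform(-1, 1, (12, 3)) + np.array([0, 0, 5.0])
T_true = se3_exp(np.array([0.02, 0.01, -0.03, 0.01, -0.02, 0.005]))
uv = project(P_model @ T_true[:3, :3].T + T_true[:3, 3], f=500.0)
T_est = track(P_model, uv, f=500.0)
```

Working in the Lie algebra keeps the unknown at six parameters, so every iterate is a valid rotation plus translation with no separate orthogonality constraint to enforce.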

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present a gradient-based motion capture system that robustly tracks a human hand, based on abstracted visual information - silhouettes. Despite the ambiguity in the visual data and despite the vulnerability of gradient-based methods in the face of such ambiguity, we minimise problems related to misfit by using a model of the hand's physiology, which is entirely non-visual, subject-invariant, and assumed to be known a priori. By modelling seven distinct aspects of the hand's physiology we derive prior densities which are incorporated into the tracking system within a Bayesian framework. We demonstrate how the posterior is formed, and how our formulation leads to the extraction of the maximum a posteriori estimate using a gradient-based search. Our results demonstrate an enormous improvement in tracking precision and reliability, while also achieving near real-time performance. © 2009 IEEE.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Estimating the fundamental matrix (F), to determine the epipolar geometry between a pair of images or video frames, is a basic step for a wide variety of vision-based functions used in construction operations, such as camera-pair calibration, automatic progress monitoring, and 3D reconstruction. Currently, robust methods (e.g., SIFT + normalized eight-point algorithm + RANSAC) are widely used in the construction community for this purpose. Although they can provide acceptable accuracy, the significant amount of required computational time impedes their adoption in real-time applications, especially video data analysis with many frames per second. Aiming to overcome this limitation, this paper presents and evaluates the accuracy of a solution to find F by combining the use of two speedy and consistent methods: SURF for the selection of a robust set of point correspondences and the normalized eight-point algorithm. This solution is tested extensively on construction site image pairs including changes in viewpoint, scale, illumination, rotation, and moving objects. The results demonstrate that this method can be used for real-time applications (5 image pairs per second with the resolution of 640 × 480) involving scenes of the built environment.
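
For readers unfamiliar with the normalized eight-point algorithm named above, here is a minimal NumPy sketch. It covers only the estimation of F from correspondences that are already given. The SURF matching step is not shown, and the function names and synthetic camera values are hypothetical, not taken from the paper.

```python
import numpy as np

def normalize_points(pts):
    """Hartley normalization: zero centroid, mean distance sqrt(2)."""
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0, -s * centroid[0]],
                  [0, s, -s * centroid[1]],
                  [0, 0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def eight_point(pts1, pts2):
    """Estimate F with x2^T F x1 = 0 from at least 8 correspondences."""
    x1, T1 = normalize_points(pts1)
    x2, T2 = normalize_points(pts2)
    # Each correspondence contributes one row of the system A f = 0,
    # with f holding the entries of F in row-major order.
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1)),
    ])
    _, _, vt = np.linalg.svd(A)
    F = vt[-1].reshape(3, 3)          # null vector of A
    u, s, vt = np.linalg.svd(F)
    s[2] = 0.0                        # enforce rank 2
    F = u @ np.diag(s) @ vt
    F = T2.T @ F @ T1                 # undo the normalization
    return F / np.linalg.norm(F)

# Synthetic, noise-free correspondences from two hypothetical cameras.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (20, 3)) + np.array([0, 0, 5.0])
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1.0]])
th = 0.1
R = np.array([[np.cos(th), 0, np.sin(th)], [0, 1, 0], [-np.sin(th), 0, np.cos(th)]])
t = np.array([0.5, 0.1, 0.0])
p1 = (K @ X.T).T
p1 = p1[:, :2] / p1[:, 2:]
p2 = (K @ (R @ X.T + t[:, None])).T
p2 = p2[:, :2] / p2[:, 2:]

F = eight_point(p1, p2)
h1 = np.column_stack([p1, np.ones(len(p1))])
h2 = np.column_stack([p2, np.ones(len(p2))])
residuals = np.abs(np.einsum('ij,jk,ik->i', h2, F, h1))   # epipolar error per pair
```

With real, noisy matches the residuals will not vanish; the robust pipelines mentioned in the abstract wrap a solver like this in RANSAC to reject outliers.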

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Three-dimensional (3-D) spatial data of a transportation infrastructure contain useful information for civil engineering applications, including as-built documentation, on-site safety enhancements, and progress monitoring. Several techniques have been developed for acquiring 3-D point coordinates of infrastructure, such as laser scanning. Although the method yields accurate results, the high device costs and human effort required render the process infeasible for generic applications in the construction industry. A quick and reliable approach, which is based on the principles of stereo vision, is proposed for generating a depth map of an infrastructure. Initially, two images are captured by two similar stereo cameras at the scene of the infrastructure. A Harris feature detector is used to extract feature points from the first view, and an innovative adaptive window-matching technique is used to compute feature point correspondences in the second view. A robust algorithm computes the nonfeature point correspondences. Thus, the correspondences of all the points in the scene are obtained. After all correspondences have been obtained, the geometric principles of stereo vision are used to generate a dense depth map of the scene. The proposed algorithm has been tested on several data sets, and results illustrate its potential for stereo correspondence and depth map generation.
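
The last step, turning correspondences into depth, can be illustrated for the simplest configuration. This is a minimal sketch assuming rectified cameras with parallel optical axes, which the abstract does not state; in that case depth follows Z = f·B/d, where f is the focal length in pixels, B the baseline, and d the disparity. The function name and numbers are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point seen by a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px

# A match at x = 420 px in the left image and x = 400 px in the right image.
disparity = 420 - 400
depth = depth_from_disparity(disparity, focal_px=800.0, baseline_m=0.5)
print(depth)
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant points than for near ones.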

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Vision-based tracking can provide the spatial location of project-related entities such as equipment, workers, and materials on a large-scale, congested construction site. It tracks entities in a video stream by inferring their motion. To initiate the process, the pixel areas of the entities to be tracked in the subsequent video frames must first be determined. To fully automate the process, this paper presents an automated way of initializing trackers using the Semantic Texton Forests (STFs) method. The STFs method simultaneously segments the image and classifies the segments based on low-level semantic information and context information. In this paper, the STFs method is tested on wheel loader recognition. In the experiments, wheel loaders are further divided into several parts, such as wheels and body parts, to help learn the context information. The results show 79% accuracy in recognizing the pixel areas of the wheel loader. These results indicate that the STFs method has the potential to automate the initialization process of vision-based tracking.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Displacement estimation is a key step in the evaluation of tissue elasticity by quasistatic strain imaging. An efficient approach may incorporate a tracking strategy whereby each estimate is initially obtained from its neighbours' displacements and then refined through a localized search. This increases the accuracy and reduces the computational expense compared with exhaustive search. However, simple tracking strategies fail when the target displacement map exhibits complex structure. For example, there may be discontinuities and regions of indeterminate displacement caused by decorrelation between the pre- and post-deformation radio frequency (RF) echo signals. This paper introduces a novel displacement tracking algorithm, with a search strategy guided by a data quality indicator. Comparisons with existing methods show that the proposed algorithm is more robust when the displacement distribution is challenging.
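
The general idea of a quality-guided search can be sketched in one dimension. This is not the paper's algorithm: normalized cross-correlation stands in for the data quality indicator, displacements are integer sample shifts, and all names and parameters are hypothetical. Windows are processed in order of match quality, so each new estimate is seeded by its most reliable already-tracked neighbour and refined with a small local search.

```python
import heapq
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-length windows."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def local_search(pre, post, start, win, guess, radius):
    """Best integer shift within guess +/- radius, with its correlation."""
    ref = pre[start:start + win]
    best_shift, best_q = guess, -1.0
    for shift in range(guess - radius, guess + radius + 1):
        lo = start + shift
        if lo < 0 or lo + win > len(post):
            continue                      # window would leave the signal
        q = ncc(ref, post[lo:lo + win])
        if q > best_q:
            best_shift, best_q = shift, q
    return best_shift, best_q

def quality_guided_tracking(pre, post, win=16, radius=1, seed=0, seed_radius=8):
    """Grow outwards from a seed window, always expanding the best match first."""
    n = (len(pre) - win) // win + 1       # non-overlapping windows
    shifts = [None] * n
    d, q = local_search(pre, post, seed * win, win, 0, seed_radius)
    heap = [(-q, seed, d)]                # max-heap on quality via negation
    while heap:
        _, i, d = heapq.heappop(heap)
        if shifts[i] is not None:
            continue                      # already fixed by a better candidate
        shifts[i] = d
        for j in (i - 1, i + 1):
            if 0 <= j < n and shifts[j] is None:
                dj, qj = local_search(pre, post, j * win, win, d, radius)
                heapq.heappush(heap, (-qj, j, dj))
    return shifts

# Synthetic signals: the post-deformation signal is the pre signal shifted by 3 samples.
rng = np.random.default_rng(0)
pre = rng.standard_normal(256)
post = np.roll(pre, 3)
shifts = quality_guided_tracking(pre, post)
print(shifts)
```

A uniform shift is the easy case. The benefit of ordering by quality shows up when some windows decorrelate, because poorly matched windows are then visited last and cannot seed their neighbours with bad estimates.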