810 resultados para Video Teleconferencing
Resumo:
Temporal synchronization of multiple video recordings of the same dynamic event is a critical task in many computer vision applications e.g. novel view synthesis and 3D reconstruction. Typically this information is implied, since recordings are made using the same timebase, or time-stamp information is embedded in the video streams. Recordings using consumer grade equipment do not contain this information; hence, there is a need to temporally synchronize signals using the visual information itself. Previous work in this area has either assumed good quality data with relatively simple dynamic content or the availability of precise camera geometry. In this paper, we propose a technique which exploits feature trajectories across views in a novel way, and specifically targets the kind of complex content found in consumer generated sports recordings, without assuming precise knowledge of fundamental matrices or homographies. Our method automatically selects the moving feature points in the two unsynchronized videos whose 2D trajectories can be best related, thereby helping to infer the synchronization index. We evaluate performance using a number of real recordings and show that synchronization can be achieved to within 1 sec, which is better than previous approaches. Copyright 2013 ACM.
Resumo:
We present a novel mixture of trees (MoT) graphical model for video segmentation. Each component in this mixture represents a tree structured temporal linkage between super-pixels from the first to the last frame of a video sequence. Our time-series model explicitly captures the uncertainty in temporal linkage between adjacent frames which improves segmentation accuracy. We provide a variational inference scheme for this model to estimate super-pixel labels and their confidences in nearly realtime. The efficacy of our approach is demonstrated via quantitative comparisons on the challenging SegTrack joint segmentation and tracking dataset [23].
Resumo:
In this paper a new half-flash architecture for high speed video ADC is presented. Based on a high speed single-way analog switch circuit, this architecture effectively reduces the number of elements. At the same lime no sacrifice of speed is needed compared with the normal half-flash structure.
Resumo:
The broadcast soccer video is usually recorded by one main camera, which is constantly gazing somewhere of playfield where a highlight event is happening. So the camera parameters and their variety have close relationship with semantic information of soccer video, and much interest has been caught in camera calibration for soccer video. The previous calibration methods either deal with goal scene, or have strict calibration conditions and high complexity. So, it does not properly handle the non-goal scene such as midfield or center-forward scene. In this paper, based on a new soccer field model, a field symbol extraction algorithm is proposed to extract the calibration information. Then a two-stage calibration approach is developed which can calibrate camera not only for goal scene but also for non-goal scene. The preliminary experimental results demonstrate its robustness and accuracy. (c) 2010 Elsevier B.V. All rights reserved.
Resumo:
It is important for practical application to design an effective and efficient metric for video quality. The most reliable way is by subjective evaluation. Thus, to design an objective metric by simulating human visual system (HVS) is quite reasonable and available. In this paper, the video quality assessment metric based on visual perception is proposed. Three-dimensional wavelet is utilized to decompose video and then extract features to mimic the multichannel structure of HVS. Spatio-temporal contrast sensitivity function (S-T CSF) is employed to weight coefficient obtained by three-dimensional wavelet to simulate nonlinearity feature of the human eyes. Perceptual threshold is exploited to obtain visual sensitive coefficients after S-T CSF filtered. Visual sensitive coefficients are normalized representation and then visual sensitive errors are calculated between reference and distorted video. Finally, temporal perceptual mechanism is applied to count values of video quality for reducing computational cost. Experimental results prove the proposed method outperforms the most existing methods and is comparable to LHS and PVQM.
Resumo:
Inspired by human visual cognition mechanism, this paper first presents a scene classification method based on an improved standard model feature. Compared with state-of-the-art efforts in scene classification, the newly proposed method is more robust, more selective, and of lower complexity. These advantages are demonstrated by two sets of experiments on both our own database and standard public ones. Furthermore, occlusion and disorder problems in scene classification in video surveillance are also first studied in this paper.
Resumo:
This paper consists of two major parts. First, we present the outline of a simple approach to very-low bandwidth video-conferencing system relying on an example-based hierarchical image compression scheme. In particular, we discuss the use of example images as a model, the number of required examples, faces as a class of semi-rigid objects, a hierarchical model based on decomposition into different time-scales, and the decomposition of face images into patches of interest. In the second part, we present several algorithms for image processing and animation as well as experimental evaluations. Among the original contributions of this paper is an automatic algorithm for pose estimation and normalization. We also review and compare different algorithms for finding the nearest neighbors in a database for a new input as well as a generalized algorithm for blending patches of interest in order to synthesize new images. Finally, we outline the possible integration of several algorithms to illustrate a simple model-based video-conference system.
Resumo:
Passive monitoring of large sites typically requires coordination between multiple cameras, which in turn requires methods for automatically relating events between distributed cameras. This paper tackles the problem of self-calibration of multiple cameras which are very far apart, using feature correspondences to determine the camera geometry. The key problem is finding such correspondences. Since the camera geometry and photometric characteristics vary considerably between images, one cannot use brightness and/or proximity constraints. Instead we apply planar geometric constraints to moving objects in the scene in order to align the scene"s ground plane across multiple views. We do not assume synchronized cameras, and we show that enforcing geometric constraints enables us to align the tracking data in time. Once we have recovered the homography which aligns the planar structure in the scene, we can compute from the homography matrix the 3D position of the plane and the relative camera positions. This in turn enables us to recover a homography matrix which maps the images to an overhead view. We demonstrate this technique in two settings: a controlled lab setting where we test the effects of errors in internal camera calibration, and an uncontrolled, outdoor setting in which the full procedure is applied to external camera calibration and ground plane recovery. In spite of noise in the internal camera parameters and image data, the system successfully recovers both planar structure and relative camera positions in both settings.