996 results for Multiple sparse cameras


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Automated virtual camera control has been widely used in animation and interactive virtual environments. We have developed a prototype free-view video system based on multiple sparse cameras that allows users to control the position and orientation of a virtual camera, enabling the observation of a real scene in three dimensions (3D) from any desired viewpoint. Automatic camera control can be activated to follow objects selected by the user. Our method combines a simple geometric model of the scene composed of planes (the virtual environment), augmented with visual information from the cameras, and pre-computed tracking information of moving targets to generate novel perspective-corrected 3D views of the virtual camera and moving objects. To achieve real-time rendering performance, view-dependent texture-mapped billboards are used to render the moving objects at their correct locations, and foreground masks are used to remove the moving objects from the projected video streams. The current prototype runs on a PC with a common graphics card and can generate virtual 2D views from three cameras of resolution 768 x 576 with several moving objects at about 11 fps. (C) 2011 Elsevier Ltd. All rights reserved.
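The abstract does not give implementation details for the view-dependent billboards, so the following is only a minimal sketch of the general idea, assuming a flat ground plane with 2D (x, z) coordinates. The function names `billboard_yaw` and `pick_texture_camera` are hypothetical and not taken from the paper: the billboard is rotated about the vertical axis to face the virtual camera, and its texture is taken from the real camera whose direction to the object is angularly closest to the virtual camera's.

```python
import math


def billboard_yaw(obj_xz, cam_xz):
    """Yaw (radians, about the vertical axis) that turns a billboard
    standing at obj_xz so that it faces a camera at cam_xz."""
    dx = cam_xz[0] - obj_xz[0]
    dz = cam_xz[1] - obj_xz[1]
    return math.atan2(dx, dz)


def pick_texture_camera(obj_xz, virtual_cam_xz, real_cams_xz):
    """Index of the real camera that sees the object from the direction
    closest to the virtual camera's direction (assumed selection rule)."""
    target = billboard_yaw(obj_xz, virtual_cam_xz)

    def angular_gap(cam_xz):
        diff = billboard_yaw(obj_xz, cam_xz) - target
        # Wrap the difference into [-pi, pi] before taking its magnitude.
        return abs(math.atan2(math.sin(diff), math.cos(diff)))

    return min(range(len(real_cams_xz)),
               key=lambda i: angular_gap(real_cams_xz[i]))


# Three real cameras around an object at the origin (illustrative values).
cameras = [(5.0, 0.0), (1.0, 5.0), (-5.0, -5.0)]
best = pick_texture_camera((0.0, 0.0), (0.0, 5.0), cameras)
```

Choosing the nearest real view keeps the billboard texture close to what the virtual camera would see; blending between the two nearest views would be a natural refinement, but it is not shown here.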

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Access All was a performance produced following a three-month mentorship in web-based performance that I was commissioned to conduct for the performance company Igneous. This live, triple-site performance event for three performers in three remote venues was specifically designed for presentation at Access Grid Nodes: conference rooms located around the globe equipped with a high-end, open-source computer teleconferencing technology that allowed multiple nodes to cross-connect with each other. Whilst each room was set up somewhat differently, they all deployed the same basic infrastructure of multiple projectors, cameras, and sound as well as a reconfigurable floorspace. At that time these relatively formal setups imposed a clear series of limitations in terms of software capabilities and basic infrastructure, and so there was much interest in understanding how far the technology's capabilities might be pushed. Numerous performance experiments were undertaken between three Access Grid nodes in QUT Brisbane, VISLAB Sydney and Manchester Supercomputing Centre, England, culminating in the public performance staged simultaneously between the sites with local audiences at each venue and others online. Access All was devised in collaboration with interdisciplinary performance company Bonemap, Kelli Dipple (Interarts curator, Tate Modern London) and Mike Stubbs, British curator and Director of FACT (Liverpool). This period of research and development was instigated and shaped by a public lecture I had earlier delivered in Sydney for the ‘Global Access Grid Network, Super Computing Global Conference’ entitled 'Performance Practice across Electronic Networks'. The findings of this work went on to inform numerous future networked and performative works produced from 2002 onwards.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Numerous algorithms have been proposed recently for sparse signal recovery in Compressed Sensing (CS). In practice, the number of measurements can be very limited due to the nature of the problem, and/or the underlying statistical distribution of the non-zero elements of the sparse signal may not be known a priori. It has been observed that the performance of any sparse signal recovery algorithm depends on these factors, which makes the selection of a suitable sparse recovery algorithm difficult. To address such situations, we propose a fusion framework that employs multiple sparse signal recovery algorithms and fuses their estimates to obtain a better estimate. Theoretical results justifying the performance improvement are shown. The efficacy of the proposed scheme is demonstrated by Monte Carlo simulations using synthetic sparse signals and ECG signals selected from the MIT-BIH database.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Although many sparse recovery algorithms have been proposed recently in compressed sensing (CS), it is well known that the performance of any sparse recovery algorithm depends on many parameters, such as the dimension of the sparse signal, the level of sparsity, and the measurement noise power. It has been observed that satisfactory performance of a sparse recovery algorithm requires a minimum number of measurements, and that this minimum differs between algorithms. In many applications, the number of measurements is unlikely to meet this requirement, and any scheme to improve performance with fewer measurements is of significant interest in CS. Empirically, it has also been observed that the performance of sparse recovery algorithms depends on the underlying statistical distribution of the nonzero elements of the signal, which may not be known a priori in practice. Interestingly, the performance degradation of the sparse recovery algorithms in these cases does not always imply a complete failure. In this paper, we study this scenario and show that by fusing the estimates of multiple sparse recovery algorithms, which work on different principles, we can improve the sparse signal recovery. We present a theoretical analysis to derive sufficient conditions for performance improvement of the proposed schemes. We demonstrate the advantage of the proposed methods through numerical simulations for both synthetic and real signals.
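The abstract does not spell out the fusion rule, so the sketch below shows only one plausible way to fuse two support estimates, and it should not be read as the authors' scheme. The rule assumed here is: take the union of the supports, fit least squares on the union, keep the k largest coefficients, and refit. All names (`solve`, `least_squares_on_support`, `fuse`) and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares_on_support(A, y, support):
    """Least-squares fit of y using only the columns of A in `support`,
    via the normal equations. Returns {column index: coefficient}."""
    cols = sorted(support)
    G = [[sum(row[p] * row[q] for row in A) for q in cols] for p in cols]
    h = [sum(row[p] * yi for row, yi in zip(A, y)) for p in cols]
    return dict(zip(cols, solve(G, h)))


def fuse(A, y, support_a, support_b, k):
    """Assumed fusion rule: union of supports, fit, keep k largest, refit."""
    union = set(support_a) | set(support_b)
    coeffs = least_squares_on_support(A, y, union)
    top = sorted(coeffs, key=lambda j: abs(coeffs[j]), reverse=True)[:k]
    return least_squares_on_support(A, y, top)


# Toy problem: the true signal has support {1, 3}; each hypothetical
# algorithm finds only one of the two true indices.
A = [[1, 2, 0, 0, 1],
     [0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0],
     [0, 0, 0, 1, 3]]
y = [4, 2, -1, -1]  # equals A times [0, 2, 0, -1, 0]
fused = fuse(A, y, support_a={1, 2}, support_b={3, 4}, k=2)
```

In this noiseless toy case each participating algorithm is partly wrong, yet the union contains the full true support, so the refit recovers both nonzero entries. This mirrors the observation in the abstract that degraded performance does not always mean complete failure.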

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Tracking applications provide real-time on-site information that can be used to detect travel path conflicts, calculate crew productivity and eliminate unnecessary processes at the site. This paper presents the validation of a novel vision-based tracking methodology at the Egnatia Odos Motorway in Thessaloniki, Greece. Egnatia Odos is a motorway that connects Turkey with Italy through Greece. Its multiple open construction sites serve as an ideal multi-site test bed for validating construction site tracking methods. The vision-based tracking methodology uses video cameras and computer algorithms to calculate the 3D position of project-related entities (e.g. personnel, materials and equipment) in construction sites. The approach provides an unobtrusive, inexpensive way of effectively identifying and tracking the 3D location of entities. The process followed in this study starts by acquiring video data from multiple synchronous cameras at several large-scale project sites of Egnatia Odos, such as tunnels, interchanges and bridges under construction. Subsequent steps include the evaluation of the collected data and, finally, performing the 3D tracking operations on selected entities (heavy equipment and personnel). The accuracy and precision of the method's results are evaluated by comparing them with the actual 3D position of the object, thus assessing the 3D tracking method's effectiveness.
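The abstract does not say how the 3D position is computed from the synchronized views. As a generic illustration only, the sketch below uses midpoint triangulation of two viewing rays, assuming each calibrated camera already yields a ray (origin and direction) toward the tracked entity. The function name `triangulate_midpoint` is hypothetical.

```python
def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two 3D rays
    o1 + s*d1 and o2 + t*d2 (generic two-view triangulation)."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = [a - b for a, b in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth is not observable")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * k for o, k in zip(o1, d1)]  # closest point on ray 1
    p2 = [o + t * k for o, k in zip(o2, d2)]  # closest point on ray 2
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]


# Two cameras looking at the same point (2, 2, 2); illustrative values.
point = triangulate_midpoint((0.0, 0.0, 0.0), (1.0, 1.0, 1.0),
                             (4.0, 0.0, 0.0), (-2.0, 2.0, 2.0))
```

With noisy detections the two rays no longer intersect, and the midpoint gives a simple compromise. Comparing such estimates against surveyed ground-truth positions is the kind of accuracy check the abstract describes.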

Relevance:

80.00% 80.00%

Publisher:

Abstract:

We present an image-based approach to infer 3D structure parameters using a probabilistic "shape+structure" model. The 3D shape of a class of objects may be represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes can then be estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We augment the shape model to incorporate structural features of interest; novel examples with missing structure parameters may then be reconstructed to obtain estimates of these parameters. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a dataset of thousands of pedestrian images generated from a synthetic model, we can perform accurate inference of the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Carried out under joint supervision (cotutelle) with the M2S laboratory of Rennes 2

Relevance:

80.00% 80.00%

Publisher:

Abstract:

We present a statistical image-based shape + structure model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In applications such as tracking and surveillance in large spatial environments, there is a need for representing dynamic and noisy data and at the same time dealing with them at different levels of detail. In the spatial domain, there has been work dealing with these two issues separately; however, there is no existing common framework for dealing with both of them. In this paper, we propose a new representation framework called the Layered Dynamic Probabilistic Network (LDPN), a special type of Dynamic Probabilistic Network (DPN), capable of handling uncertainty and representing spatial data at various levels of detail. The framework is thus particularly suited to applications in wide-area environments, which are characterised by large region size, complex spatial layout and multiple sensors/cameras. For example, a building has three levels: entry/exit to the building, entry/exit between rooms and moving within rooms. To avoid the problem of a relatively large state space associated with a large spatial environment, the LDPN explicitly encodes the hierarchy of connected spatial locations, making it scalable to the size of the environment being modelled. There are three main advantages of the LDPN. First, the reduction in state space makes it suitable for dealing with wide-area surveillance involving multiple sensors. Second, it offers a hierarchy of intervals for indexing temporal data. Lastly, the explicit representation of intermediate sub-goals allows for the extension of the framework to easily represent group interactions by allowing coupling between sub-goal layers of different individuals or objects. We describe an adaptation of the likelihood sampling inference scheme for the LDPN, and illustrate its use in a hypothetical surveillance scenario.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper, we consider the problem of tracking an object and predicting the object's future trajectory in a wide-area environment with a complex spatial layout and multiple sensors/cameras. To solve this problem, there is a need for representing the dynamic and noisy data in the tracking tasks, and dealing with them at different levels of detail. We employ the Abstract Hidden Markov Model (AHMM), an extension of the well-known Hidden Markov Model (HMM) and a special type of Dynamic Probabilistic Network (DPN), as our underlying representation framework. The AHMM allows us to explicitly encode the hierarchy of connected spatial locations, making it scalable to the size of the environment being modeled. We describe an application for tracking human movement in an office-like spatial layout where the AHMM is used to track and predict the evolution of object trajectories at different levels of detail.
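To make the underlying machinery concrete, the sketch below runs forward filtering on a plain, flat HMM over three rooms. It is deliberately not the AHMM: the hierarchy of spatial levels and policies described in the abstract is omitted, and the transition and sensor probabilities are illustrative assumptions rather than values from the paper.

```python
def forward_filter(prior, transition, emission, observations):
    """Belief over hidden states after each noisy observation (plain HMM).

    transition[i][j] = P(next state j | current state i)
    emission[j][o]   = P(observing o | state j)
    """
    n = len(prior)
    belief = list(prior)
    for obs in observations:
        # Predict: push the belief through the motion model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the sensor reading.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        if total == 0.0:
            raise ValueError("observation impossible under the model")
        belief = [u / total for u in unnorm]
    return belief


# Rooms 0-1-2 in a row; a person mostly stays put or moves to a neighbour.
transition = [[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]]
# A camera reports the correct room 80% of the time (assumed noise model).
emission = [[0.8, 0.1, 0.1],
            [0.1, 0.8, 0.1],
            [0.1, 0.1, 0.8]]
belief = forward_filter([1 / 3, 1 / 3, 1 / 3], transition, emission, [0, 1, 1])
most_likely_room = max(range(3), key=lambda j: belief[j])
```

A flat model like this needs one state per location, which is the scaling problem the abstract says the hierarchical encoding is meant to avoid.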

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This paper presents different application scenarios for which the registration of sub-sequence reconstructions or multi-camera reconstructions is essential for successful camera motion estimation and 3D reconstruction from video. The registration is achieved by merging unconnected feature point tracks between the reconstructions. One application is drift removal for sequential camera motion estimation of long sequences. The state-of-the-art in drift removal is to apply a RANSAC approach to find unconnected feature point tracks. In this paper an alternative spectral algorithm for pairwise matching of unconnected feature point tracks is used. It is then shown that the algorithms can be combined and applied to novel scenarios where independent camera motion estimations must be registered into a common global coordinate system. In the first scenario multiple moving cameras, which capture the same scene simultaneously, are registered. A second new scenario occurs in situations where the tracking of feature points during sequential camera motion estimation fails completely, e.g., due to large occluding objects in the foreground, and the unconnected tracks of the independent reconstructions must be merged. In the third scenario image sequences of the same scene, which are captured under different illuminations, are registered. Several experiments with challenging real video sequences demonstrate that the presented techniques work in practice.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Camera calibration information is required in order for multiple camera networks to deliver more than the sum of many single camera systems. Methods exist for manually calibrating cameras with high accuracy. Manually calibrating networks with many cameras is, however, time consuming, expensive and impractical for networks that undergo frequent change. For this reason, automatic calibration techniques have been vigorously researched in recent years. Fully automatic calibration methods depend on the ability to automatically find point correspondences between overlapping views. In typical camera networks, cameras are placed far apart to maximise coverage. This is referred to as a wide base-line scenario. Finding sufficient correspondences for camera calibration in wide base-line scenarios presents a significant challenge. This thesis focuses on developing more effective and efficient techniques for finding correspondences in uncalibrated, wide baseline, multiple-camera scenarios. The project consists of two major areas of work. The first is the development of more effective and efficient view covariant local feature extractors. The second area involves finding methods to extract scene information using the information contained in a limited set of matched affine features. Several novel affine adaptation techniques for salient features have been developed. A method is presented for efficiently computing the discrete scale space primal sketch of local image features. A scale selection method was implemented that makes use of the primal sketch. The primal sketch-based scale selection method has several advantages over the existing methods. It allows greater freedom in how the scale space is sampled, enables more accurate scale selection, is more effective at combining different functions for spatial position and scale selection, and leads to greater computational efficiency. 
Existing affine adaptation methods make use of the second moment matrix to estimate the local affine shape of local image features. In this thesis, it is shown that the Hessian matrix can be used in a similar way to estimate local feature shape. The Hessian matrix is effective for estimating the shape of blob-like structures, but is less effective for corner structures. It is simpler to compute than the second moment matrix, leading to a significant reduction in computational cost. A wide baseline dense correspondence extraction system, called WiDense, is presented in this thesis. It allows the extraction of large numbers of additional accurate correspondences, given only a few initial putative correspondences. It consists of the following algorithms: an affine region alignment algorithm that ensures accurate alignment between matched features; a method for extracting more matches in the vicinity of a matched pair of affine features, using the alignment information contained in the match; and an algorithm for extracting large numbers of highly accurate point correspondences from an aligned pair of feature regions. Experiments show that the correspondences generated by the WiDense system improve the success rate of computing the epipolar geometry of very widely separated views. This new method is successful in many cases where the features produced by the best wide baseline matching algorithms are insufficient for computing the scene geometry.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Audio-visual speech recognition, or the combination of visual lip-reading with traditional acoustic speech recognition, has been previously shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that present in an automotive cabin. The research presented in this paper extends the established audio-visual speech recognition literature to show that further improvements in speech recognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visual speech recognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) are conducted on the four-camera AVICAR automotive audio-visual speech database. We study the relative contribution of the side and central orientated cameras in improving visual speech recognition accuracy. Finally, combination of the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy when compared to the acoustic-only approach in the noisiest conditions of the AVICAR database.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This paper considers the problem of identifying the footprints of communication of multiple transmitters in a given geographical area. To do this, a number of sensors are deployed at arbitrary but known locations in the area, and their individual decisions regarding the presence or absence of the transmitters' signal are combined at a fusion center to reconstruct the spatial spectral usage map. One straightforward scheme to construct this map is to query each of the sensors and cluster the sensors that detect the primary's signal. However, using the fact that a typical transmitter footprint map is a sparse image, two novel compressive sensing based schemes are proposed, which require significantly fewer transmissions than the querying scheme. A key feature of the proposed schemes is that the measurement matrix is constructed from a pseudo-random binary phase shift applied to the decision of each sensor prior to transmission. The measurement matrix is thus a binary ensemble which satisfies the restricted isometry property. The number of measurements needed for accurate footprint reconstruction is determined using compressive sampling theory. The three schemes are compared through simulations in terms of a performance measure that quantifies the accuracy of the reconstructed spatial spectral usage map. It is found that the proposed sparse reconstruction technique-based schemes significantly outperform the round-robin scheme.
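The measurement step described above can be illustrated with a short sketch. It assumes that each sensor multiplies its 0/1 decision by a pseudo-random sign (+1 or -1) and that the fusion center receives the sum of all sensors' contributions in each transmission slot. That aggregation model, the function names, and the seed are assumptions made for illustration; the sparse reconstruction itself is not shown.

```python
import random


def measurement_matrix(num_measurements, num_sensors, seed=0):
    """Pseudo-random +1/-1 matrix: entry (m, i) is the binary phase
    shift sensor i applies in transmission slot m."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(num_sensors)]
            for _ in range(num_measurements)]


def compress(decisions, matrix):
    """One aggregate value per slot: the signed sum of sensor decisions."""
    return [sum(phase * d for phase, d in zip(row, decisions))
            for row in matrix]


# 12 sensors, only two of which detect a transmitter (a sparse map),
# summarised in 6 aggregate measurements instead of 12 individual queries.
decisions = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
phi = measurement_matrix(6, len(decisions), seed=42)
y = compress(decisions, phi)
```

Because the fusion center can regenerate the same sign pattern from the shared seed, it knows the measurement matrix and can hand `y` and `phi` to a sparse recovery algorithm to estimate which sensors reported a detection.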