996 resultados para multiple cameras


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Audio-visualspeechrecognition, or the combination of visual lip-reading with traditional acoustic speechrecognition, has been previously shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that present in an automotive cabin. The research presented in this paper will extend upon the established audio-visualspeechrecognition literature to show that further improvements in speechrecognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visualspeechrecognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) are conducted on the four-camera AVICAR automotiveaudio-visualspeech database. We study the relative contribution between the side and central orientated cameras in improving visualspeechrecognition accuracy. Finally combination of the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy when compared to the acoustic-only approach in the noisiest conditions of the AVICAR database.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present measurements of grid turbulence using 2D particle image velocimetry taken immediately downstream from the grid at a Reynolds number of Re M = 16500 where M is the rod spacing. A long field of view of 14M x 4M in the down- and cross-stream directions was achieved by stitching multiple cameras together. Two uniform biplanar grids were selected to have the same M and pressure drop but different rod diameter D and crosssection. A large data set (10 4 vector fields) was obtained to ensure good convergence of second-order statistics. Estimations of the dissipation rate ε of turbulent kinetic energy (TKE) were found to be sensitive to the number of meansquared velocity gradient terms included and not whether the turbulence was assumed to adhere to isotropy or axisymmetry. The resolution dependency of different turbulence statistics was assessed with a procedure that does not rely on the dissipation scale η. The streamwise evolution of the TKE components and ε was found to collapse across grids when the rod diameter was included in the normalisation. We argue that this should be the case between all regular grids when the other relevant dimensionless quantities are matched and the flow has become homogeneous across the stream. Two-point space correlation functions at x/M = 1 show evidence of complex wake interactions which exhibit a strong Reynolds number dependence. However, these changes in initial conditions disappear indicating rapid cross-stream homogenisation. On the other hand, isotropy was, as expected, not found to be established by x/M = 12 for any case studied. © Springer-Verlag 2012.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a distributed, surveillance system that works in large and complex indoor environments. To track and recognize behaviors of people, we propose the use of the Abstract Hidden Markov Model (AHMM), which can be considered as an extension of the Hidden Markov Model (HMM), where the single Markov chain in the HMM is replaced by a hierarchy of Markov policies. In this policy hierarchy, each behavior can be represented as a policy at the corresponding level of abstraction. The noisy observations are handled in the same way as an HMM and an efficient Rao-Blackwellised particle filter method is used to compute the probabilities of the current policy at different levels of the hierarchy The novelty of the paper lies in the implementation of a scalable framework in the context of both the scale of behaviors and the size of the environment, making it ideal for distributed surveillance. The results of the system demonstrate the ability to answer queries about people's behaviors at different levels of details using multiple cameras in a large and complex indoor environment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In surveillance systems for monitoring people behaviours, it is important to build systems that can adapt to the signatures of people's tasks and movements in the environment. At the same time, it is important to cope with noisy observations produced by a set of cameras with possibly different characteristics. In previous work, we have implemented a distributed surveillance system designed for complex indoor environments [1]. The system uses the Abstract Hidden Markov mEmory Model (AHMEM) for modelling and specifying complex human behaviours that can take place in the environment. Given a sequence of observations from a set of cameras, the system employs approximate probabilistic inference to compute the likelihood of different possible behaviours in real-time. This paper describes the techniques that can be used to learn the different camera noise models and the human movement models to be used in this system. The system is able to monitor and classify people behaviours as data is being gathered, and we provide classification results showing the system is able to identify behaviours of people from their movement signatures.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we present a distributed surveillance system that uses multiple cheap static cameras to track multiple people in indoor environments. The system has a set of Camera Processing Modules and a Central Module to coordinate the tracking tasks among the cameras. Since each object in the scene can be tracked by a number of cameras, the problem is how to choose the most appropriate camera for each object. We propose a novel algorithm to allocate objects to cameras using the object-to-camera distance while taking into account occlusion. The algorithm attempts to assign objects in the overlapping fields of view to the nearest camera which can see the object without occlusion. Experimental results show that the system can coordinate cameras to track people properly and can deal well with occlusion.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Accurate and efficient thermal-infrared (IR) camera calibration is important for advancing computer vision research within the thermal modality. This paper presents an approach for geometrically calibrating individual and multiple cameras in both the thermal and visible modalities. The proposed technique can be used to correct for lens distortion and to simultaneously reference both visible and thermal-IR cameras to a single coordinate frame. The most popular existing approach for the geometric calibration of thermal cameras uses a printed chessboard heated by a flood lamp and is comparatively inaccurate and difficult to execute. Additionally, software toolkits provided for calibration either are unsuitable for this task or require substantial manual intervention. A new geometric mask with high thermal contrast and not requiring a flood lamp is presented as an alternative calibration pattern. Calibration points on the pattern are then accurately located using a clustering-based algorithm which utilizes the maximally stable extremal region detector. This algorithm is integrated into an automatic end-to-end system for calibrating single or multiple cameras. The evaluation shows that using the proposed mask achieves a mean reprojection error up to 78% lower than that using a heated chessboard. The effectiveness of the approach is further demonstrated by using it to calibrate two multiple-camera multiple-modality setups. Source code and binaries for the developed software are provided on the project Web site.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Passive monitoring of large sites typically requires coordination between multiple cameras, which in turn requires methods for automatically relating events between distributed cameras. This paper tackles the problem of self-calibration of multiple cameras which are very far apart, using feature correspondences to determine the camera geometry. The key problem is finding such correspondences. Since the camera geometry and photometric characteristics vary considerably between images, one cannot use brightness and/or proximity constraints. Instead we apply planar geometric constraints to moving objects in the scene in order to align the scene"s ground plane across multiple views. We do not assume synchronized cameras, and we show that enforcing geometric constraints enables us to align the tracking data in time. Once we have recovered the homography which aligns the planar structure in the scene, we can compute from the homography matrix the 3D position of the plane and the relative camera positions. This in turn enables us to recover a homography matrix which maps the images to an overhead view. We demonstrate this technique in two settings: a controlled lab setting where we test the effects of errors in internal camera calibration, and an uncontrolled, outdoor setting in which the full procedure is applied to external camera calibration and ground plane recovery. In spite of noise in the internal camera parameters and image data, the system successfully recovers both planar structure and relative camera positions in both settings.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper proposes a method to locate and track people by combining evidence from multiple cameras using the homography constraint. The proposed method use foreground pixels from simple background subtraction to compute evidence of the location of people on a reference ground plane. The algorithm computes the amount of support that basically corresponds to the ""foreground mass"" above each pixel. Therefore, pixels that correspond to ground points have more support. The support is normalized to compensate for perspective effects and accumulated on the reference plane for all camera views. The detection of people on the reference plane becomes a search for regions of local maxima in the accumulator. Many false positives are filtered by checking the visibility consistency of the detected candidates against all camera views. The remaining candidates are tracked using Kalman filters and appearance models. Experimental results using challenging data from PETS`06 show good performance of the method in the presence of severe occlusion. Ground truth data also confirms the robustness of the method. (C) 2010 Elsevier B.V. All rights reserved.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, how to combine them to improve performance by individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images while edges are similarities between image pairs. Our first approach employs a mixture Markov model based on a random walk model on multiple graphs to fuse graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained by different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information. We apply a novel multi-task learning algorithm using both low level features and attributes. A low rank attribute embedding is joint learned within the multi-task learning formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered. To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN) in view of the emergence of deep networks. We improve a Fast RCNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). The SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. The CRC effectively utilizes convolutional features and greatly eliminates negative proposals in a cascaded manner, while maintaining a high recall for true objects. The two strategies together improve the detection accuracy and reduce the computational cost.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The relationship between multiple cameras viewing the same scene may be discovered automatically by finding corresponding points in the two views and then solving for the camera geometry. In camera networks with sparsely placed cameras, low resolution cameras or in scenes with few distinguishable features it may be difficult to find a sufficient number of reliable correspondences from which to compute geometry. This paper presents a method for extracting a larger number of correspondences from an initial set of putative correspondences without any knowledge of the scene or camera geometry. The method may be used to increase the number of correspondences and make geometry computations possible in cases where existing methods have produced insufficient correspondences.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we describe the recent development of a low-bandwidth wireless camera sensor network. We propose a simple, yet effective, network architecture which allows multiple cameras to be connected to the network and synchronize their communication schedules. Image compression of greater than 90% is performed at each node running on a local DSP coprocessor, resulting in nodes using 1/8th the energy compared to streaming uncompressed images. We briefly introduce the Fleck wireless node and the DSP/camera sensor, and then outline the network architecture and compression algorithm. The system is able to stream color QVGA images over the network to a base station at up to 2 frames per second. © 2007 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Automated crowd counting has become an active field of computer vision research in recent years. Existing approaches are scene-specific, as they are designed to operate in the single camera viewpoint that was used to train the system. Real world camera networks often span multiple viewpoints within a facility, including many regions of overlap. This paper proposes a novel scene invariant crowd counting algorithm that is designed to operate across multiple cameras. The approach uses camera calibration to normalise features between viewpoints and to compensate for regions of overlap. This compensation is performed by constructing an 'overlap map' which provides a measure of how much an object at one location is visible within other viewpoints. An investigation into the suitability of various feature types and regression models for scene invariant crowd counting is also conducted. The features investigated include object size, shape, edges and keypoints. The regression models evaluated include neural networks, K-nearest neighbours, linear and Gaussian process regresion. Our experiments demonstrate that accurate crowd counting was achieved across seven benchmark datasets, with optimal performance observed when all features were used and when Gaussian process regression was used. The combination of scene invariance and multi camera crowd counting is evaluated by training the system on footage obtained from the QUT camera network and testing it on three cameras from the PETS 2009 database. Highly accurate crowd counting was observed with a mean relative error of less than 10%. Our approach enables a pre-trained system to be deployed on a new environment without any additional training, bringing the field one step closer toward a 'plug and play' system.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Novel computer vision techniques have been developed for automatic monitoring of crowed environments such as airports, railway stations and shopping malls. Using video feeds from multiple cameras, the techniques enable crowd counting, crowd flow monitoring, queue monitoring and abnormal event detection. The outcome of the research is useful for surveillance applications and for obtaining operational metrics to improve business efficiency.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This practice-led research is positioned within my ongoing enquiry into the dancer’s experience and role within the creative process. Gins and Arakawa (1997) and Keane (2007) speak to the unsatisfactory reliance on discipline boundaries, to describe the dynamic lived-experience of interaction. This theorising is of application to this project, which examines creative agency through the lens of Arakawa and Gins’ language prompt, boundary-swaying. In this project the boundaries of movement creator, performer and director overlap and blur through the use of improvisation and multiple cameras. All contributors are invested creatively and compositionally in the ensuing dynamic collaboration, wearing many hats, ‘conceiver, creative thinker, teacher and learner’ (McKechnie 2005, 93; Stevens & McKechnie 2005, 250). This project asked the question, how can the work of Arakawa and Gins to agitate, disrupt, and transform the modus operandi of creative practice between choreographer and practice, dancer and practice and choreographer and dancer? The use of Arakawa and Gins’ philosophy and language prompts within this project stimulated and positively influenced the established creative relationship of researcher and choreographer/artist in the following ways: • Foregrounded the dancers tacit knowledge, first-hand experience, know-how and embodied savviness; • Promoted artistic collaboration, illuminating new creative possibilities, choices and innovation; • Facilitated the distribution of creative authority and agency. This creative work was presented as part of the AG3 ONLINE: the Third International Arakawa and Gins - Architecture and Philosophy Conference. The work was vetted for inclusion by an international panel of examiners.