954 results for Video-camera


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Topographic structural complexity of a reef is highly correlated with coral growth rates, coral cover and overall levels of biodiversity, and is therefore integral in determining ecological processes. Modeling these processes commonly includes measures of rugosity obtained from a wide range of different survey techniques that often fail to capture rugosity at different spatial scales. Here we show that accurate estimates of rugosity can be obtained from video footage captured using underwater video cameras (i.e., monocular video). To demonstrate the accuracy of our method, we compared the results to in situ measurements of a 2 m x 20 m area of forereef from Glovers Reef atoll in Belize. Sequential pairs of images were used to compute fine-scale bathymetric reconstructions of the reef substrate, from which precise measurements of rugosity and reef topographic structural complexity can be derived across multiple spatial scales. To achieve accurate bathymetric reconstructions from uncalibrated monocular video, the position of the camera for each image in the video sequence and the intrinsic parameters (e.g., focal length) must be computed simultaneously. We show that these parameters can often be determined when the data exhibit parallax-type motion, and that rugosity and reef complexity can be accurately computed from existing video sequences taken from any type of underwater camera from any reef habitat or location. This technique opens a wide range of possibilities for future coral reef research by providing a cost-effective and automated method of determining structural complexity and rugosity in both new and historical video surveys of coral reefs.
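To make the idea of measuring rugosity at several spatial scales concrete, here is a minimal sketch. It assumes the bathymetric reconstruction has already been reduced to a regularly spaced depth profile, and it uses the common contour-length to planar-length ratio as the rugosity index; the abstract does not state which index the authors use, and the function name `profile_rugosity` and its parameters are hypothetical.

```python
import math


def profile_rugosity(depths, spacing, step=1):
    """Rugosity index of a depth profile: contour length / planar length.

    depths  -- depth samples along a transect (assumed regularly spaced)
    spacing -- horizontal distance between consecutive samples
    step    -- keep every `step`-th sample to measure at a coarser scale
    """
    samples = depths[::step]
    if len(samples) < 2:
        raise ValueError("need at least two samples at this scale")
    dx = spacing * step
    # Sum the straight-line length of each segment along the surface.
    contour = sum(math.hypot(dx, b - a) for a, b in zip(samples, samples[1:]))
    planar = dx * (len(samples) - 1)
    return contour / planar


# A zigzag profile is rough at fine scale and flat at a coarser one.
zigzag = [0.0, 1.0, 0.0, 1.0, 0.0]
fine = profile_rugosity(zigzag, spacing=1.0)
coarse = profile_rugosity(zigzag, spacing=1.0, step=2)
```

A perfectly flat profile gives an index of 1, and larger values indicate a more complex surface. Resampling with `step` is only one simple way to vary the measurement scale.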

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A Distributed Wireless Smart Camera (DWSC) network is a special type of Wireless Sensor Network (WSN) that processes captured images in a distributed manner. While image processing on DWSCs has great potential for growth, with a vast practical application domain that includes security surveillance and health care, it suffers from tremendous constraints. In addition to the limitations of conventional WSNs, image processing on DWSCs requires more computational power, bandwidth and energy, which presents significant challenges for large-scale deployments. This dissertation has developed a number of algorithms that are highly scalable, portable, energy efficient and performance efficient, with consideration of the practical constraints imposed by the hardware and the nature of WSNs. More specifically, these algorithms tackle the problems of multi-object tracking and localisation in distributed wireless smart camera networks and optimal camera configuration determination. Addressing the first problem of multi-object tracking and localisation requires solving a large array of sub-problems. The sub-problems that are discussed in this dissertation are calibration of internal parameters, multi-camera calibration for localisation and object handover for tracking. These topics have been covered extensively in the computer vision literature; however, new algorithms must be invented to accommodate the various constraints introduced and required by the DWSC platform. A technique has been developed for the automatic calibration of low-cost cameras which are assumed to be restricted in their freedom of movement to either pan or tilt movements.
Camera internal parameters, including focal length, principal point, lens distortion parameter and the angle and axis of rotation, can be recovered from a minimum set of two images of the camera, provided that the axis of rotation between the two images goes through the camera's optical centre and is parallel to either the vertical (panning) or horizontal (tilting) axis of the image. For object localisation, a novel approach has been developed for the calibration of a network of non-overlapping DWSCs in terms of their ground plane homographies, which can then be used for localising objects. In the proposed approach, a robot travels through the camera network while updating its position in a global coordinate frame, which it broadcasts to the cameras. The cameras use this, along with the image plane location of the robot, to compute a mapping from their image planes to the global coordinate frame. This is combined with an occupancy map generated by the robot during the mapping process to localise objects moving within the network. In addition, to deal with the problem of object handover between DWSCs with non-overlapping fields of view, a highly scalable, distributed protocol has been designed. Cameras that follow the proposed protocol transmit object descriptions to a selected set of neighbours that are determined using a predictive forwarding strategy. The received descriptions are then matched at the subsequent camera on the object's path using a probability maximisation process with locally generated descriptions. The second problem of camera placement emerges naturally when these pervasive devices are put into real use. The locations, orientations, lens types etc. of the cameras must be chosen in a way that the utility of the network is maximised (e.g. maximum coverage) while user requirements are met.
To deal with this, a statistical formulation of the problem of determining optimal camera configurations has been introduced and a Trans-Dimensional Simulated Annealing (TDSA) algorithm has been proposed to effectively solve the problem.
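The ground-plane calibration step described in this abstract pairs each broadcast robot position with the robot's location in the image. As an illustration only, the sketch below fits a homography from exactly four such pairs by solving the standard 8x8 linear system with the last entry fixed to 1. The dissertation's actual estimator is not given in the abstract, and all function names here are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(image_pts, ground_pts):
    """Fit a 3x3 homography (h33 = 1) from four image -> ground pairs."""
    a, b = [], []
    for (u, v), (x, y) in zip(image_pts, ground_pts):
        # x * (h31 u + h32 v + 1) = h11 u + h12 v + h13, likewise for y.
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        b.append(y)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def image_to_ground(h, u, v):
    """Map an image point to ground-plane coordinates using homography h."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    x = (h[0][0] * u + h[0][1] * v + h[0][2]) / w
    y = (h[1][0] * u + h[1][1] * v + h[1][2]) / w
    return x, y


# Hypothetical data: robot seen at four pixels, with broadcast positions in metres.
pixels = [(0, 0), (100, 0), (100, 100), (0, 100)]
positions = [(2.0, 3.0), (3.0, 3.0), (3.0, 5.0), (2.0, 5.0)]
H = fit_homography(pixels, positions)
centre = image_to_ground(H, 50, 50)
```

In practice a camera would collect many more than four robot sightings and fit the mapping in a least-squares or robust sense; the four-point case is used here to keep the sketch short.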

Relevance:

30.00% 30.00%

Publisher:

Abstract:

At the highest level of competitive sport, nearly all performances of athletes (both training and competitive) are chronicled using video. Video is then often viewed by expert coaches/analysts, who manually label important performance indicators to gauge performance. Stroke rate and pacing are important performance measures in swimming, and these have previously been digitised manually by a human. This is problematic, as annotating large volumes of video can be costly and time-consuming. Further, since it is difficult to accurately estimate the position of the swimmer at each frame, measures such as stroke rate are generally aggregated over an entire swimming lap. Vision-based techniques which can automatically, objectively and reliably track the swimmer and their location can potentially solve these issues and allow for large-scale analysis of a swimmer across many videos. However, the aquatic environment is challenging due to fluctuations in the scene from splashes and reflections, and because swimmers are frequently submerged at different points in a race. In this paper, we temporally segment races into distinct and sequential states, and propose a multimodal approach which employs individual detectors tuned to each race state. Our approach allows the swimmer to be located and tracked smoothly in each frame despite a diverse range of constraints. We test our approach on a video dataset compiled at the 2012 Australian Short Course Swimming Championships.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper discusses the idea and demonstrates an early prototype of a novel method of interacting with security surveillance footage using natural user interfaces in place of traditional mouse and keyboard interaction. Current surveillance monitoring stations and systems provide the user with a vast array of video feeds from multiple locations on a video wall, relying on the user’s ability to distinguish locations of the live feeds from experience or from list-based key-value pairs of locations and camera IDs. During an incident, this current method of interaction may cause the user to spend increased amounts of time obtaining situational and location awareness, which is counter-productive. The system proposed in this paper demonstrates how a multi-touch screen and natural interaction can enable surveillance monitoring station users to quickly identify the location of a security camera and efficiently respond to an incident.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The location of previously unseen and unregistered individuals in complex camera networks from semantic descriptions is a time-consuming and often inaccurate process carried out by human operators, or security staff on the ground. To promote the development and evaluation of automated semantic description based localisation systems, we present a new, publicly available, unconstrained 110-sequence database, collected from 6 stationary cameras. Each sequence contains detailed semantic information for a single search subject who appears in the clip (gender, age, height, build, hair and skin colour, clothing type, texture and colour), and between 21 and 290 frames for each clip are annotated with the target subject location (over 11,000 frames are annotated in total). A novel approach for localising a person given a semantic query is also proposed and demonstrated on this database. The proposed approach incorporates clothing colour and type (for clothing worn below the waist), as well as height and build, to detect people. A method to assess the quality of candidate regions, as well as a symmetry-driven approach to aid in modelling clothing on the lower half of the body, is proposed within this approach. An evaluation on the proposed dataset shows that a relative improvement in localisation accuracy of up to 21% is achieved over the baseline technique.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This sensory ethnography explores the affordances and constraints of multimodal design to represent emotions and appraisal associated with experiencing local places. Digital video production, walking with the camera, and the use of a think-aloud protocol to reflect on the videos, provided an opportunity for the primary school children to represent their emotions and appraisal of places multimodally. Applying a typology from Martin and White's (2005) framework for the Language of Evaluation, children's multimodal emotional responses to places in this study tended toward happiness, security, and satisfaction. The findings demonstrate an explicit connection between children's emotions in response to local places through video, while highlighting the potential for teachers to use digital filmmaking to allow children to reflect actively on their placed experiences and represent their emotional reactions to places through multiple modes.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

It is commonplace to use digital video cameras in robotic applications. These cameras have built-in exposure control, but they have no knowledge of the environment, the lens being used, or the important areas of the image, and so do not always produce optimal image exposure. Therefore, it is desirable and often necessary to control the exposure from outside the camera. In this paper we present a scheme for exposure control which enables the user application to determine the area of interest. The proposed scheme introduces an intermediate transparent layer between the camera and the user application which combines the information from both for optimal exposure production. We present results from indoor and outdoor scenarios using directional and fish-eye lenses, showing the performance and advantages of this framework.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Viewer interests, evoked by video content, can potentially identify the highlights of the video. This paper explores the use of facial expressions (FE) and heart rate (HR) of viewers, captured using a camera and a non-strapped sensor, for identifying interesting video segments. The data from ten subjects with three videos showed that these signals are viewer dependent and not synchronized with the video contents. To address this issue, new algorithms are proposed to effectively combine FE and HR signals for identifying the time when viewer interest is potentially high. The results show that, compared with subjective annotation and match report highlights, ‘non-neutral’ FE and ‘relatively higher and faster’ HR are able to capture 60%-80% of goal, foul, and shot-on-goal soccer video events. FE is found to be more indicative than HR of viewers’ interests, but the fusion of these two modalities outperforms each of them.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

H.264/Advanced Video Coding (AVC) surveillance video encoders use the Skip mode specified by the standard to reduce bandwidth. They also use multiple frames as reference for motion-compensated prediction. In this paper, we propose two techniques to reduce the bandwidth and computational cost of static camera surveillance video encoders without affecting detection and recognition performance. A spatial sampler is proposed to sample pixels that are segmented using a Gaussian mixture model. Modified weight updates are derived for the parameters of the mixture model to reduce floating point computations. The storage pattern of the parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. The second contribution is a low computational cost algorithm to choose the reference frames. The proposed reference frame selection algorithm reduces the cost of coding uncovered background regions. We also study the number of reference frames required to achieve good coding efficiency. Distortion over foreground pixels is measured to quantify the performance of the proposed techniques. Experimental results show bit rate savings of up to 94.5% over methods proposed in the literature on video surveillance data sets. The proposed techniques also provide up to 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

There is a clear need to develop fisheries-independent methods to quantify individual sizes, density, and three-dimensional characteristics of reef fish spawning aggregations for use in population assessments and to provide critical baseline data on reproductive life history of exploited populations. We designed, constructed, calibrated, and applied an underwater stereo-video system to estimate individual sizes and three-dimensional (3D) positions of Nassau grouper (Epinephelus striatus) at a spawning aggregation site located on a reef promontory on the western edge of Little Cayman Island, Cayman Islands, BWI, on 23 January 2003. The system consists of two free-running camcorders mounted on a meter-long bar and supported by a SCUBA diver. Paired video “stills” were captured, and the nose and tail of individual fish observed in the field of view of both cameras were digitized using image analysis software. Conversion of these two-dimensional screen coordinates to 3D coordinates was achieved through a matrix inversion algorithm and calibration data. Our estimate of mean total length (58.5 cm, n = 29) was in close agreement with estimated lengths from a hydroacoustic survey and from direct measures of fish size using visual census techniques. We discovered a possible bias in length measures using the video method, most likely arising from some fish orientations that were not perpendicular with respect to the optical axis of the camera system. We observed 40 individuals occupying a volume of 33.3 m³, resulting in a concentration of 1.2 individuals m⁻³ with a mean (SD) nearest neighbor distance of 70.0 (29.7) cm. We promote the use of roving diver stereo-videography as a method to assess the size distribution, density, and 3D spatial structure of fish spawning aggregations.
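Once nose and tail positions are available as 3D coordinates, the quantities reported in this abstract reduce to simple geometry. The sketch below is illustrative only: the function names are hypothetical, the fish coordinates are made up, and only the count of 40 individuals and the volume of 33.3 m³ come from the abstract.

```python
import math


def total_length(nose, tail):
    """Straight-line distance between 3D nose and tail points."""
    return math.dist(nose, tail)


def density(count, volume_m3):
    """Individuals per cubic metre."""
    return count / volume_m3


def mean_nearest_neighbor(points):
    """Mean distance from each 3D point to its closest other point."""
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    return sum(nearest) / len(nearest)


# Figures from the abstract: 40 fish in 33.3 cubic metres.
concentration = density(40, 33.3)

# Made-up coordinates in metres, for illustration only.
length_m = total_length((0.0, 0.0, 2.0), (0.3, 0.4, 2.0))
spacing_m = mean_nearest_neighbor([(0, 0, 0), (1, 0, 0), (3, 0, 0)])
```

Rounding `concentration` to one decimal place reproduces the 1.2 individuals m⁻³ reported above.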

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A stereo-video baited camera system (BotCam) has been developed as a fishery-independent tool to monitor and study deepwater fish species and their habitat. During testing, BotCam was deployed primarily in water depths between 100 and 300 m for an assessment of its use in monitoring and studying Hawaiian bottomfish species. Details of the video analyses and data from the pilot study with BotCam in Hawai`i are presented. Multibeam bathymetry and backscatter data were used to delineate bottomfish habitat strata, and a stratified random sampling design was used for BotCam deployment locations. Video data were analyzed to assess relative fish abundance and to measure fish size composition. Results corroborate published depth ranges and zones of the target species, as well as their habitat preferences. The results indicate that BotCam is a promising tool for monitoring and studying demersal fish populations associated with deepwater habitats to a depth of 300 m, at mesohabitat scales. BotCam is a flexible, nonextractive, and economical means to better understand deepwater ecosystems and improve science-based ecosystem approaches to management.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We describe the application of two types of stereo camera systems in fisheries research, including the design, calibration, analysis techniques, and precision of the data obtained with these systems. The first is a stereo video system deployed by using a quick-responding winch with a live feed to provide species- and size-composition data adequate to produce acoustically based biomass estimates of rockfish. This system was tested on the eastern Bering Sea slope, where rockfish were measured. Rockfish sizes were similar to those sampled with a bottom trawl, and the relative error in multiple measurements of the same rockfish in multiple still-frame images was small. Measurement errors of up to 5.5% were found on a calibration target of known size. The second system consisted of a pair of still-image digital cameras mounted inside a midwater trawl. Processing of the stereo images allowed fish length, fish orientation in relation to the camera platform, and relative distance of the fish to the trawl netting to be determined. The video system was useful for surveying fish in Alaska, but it could also be used broadly in other situations where it is difficult to obtain species-composition or size-composition information. Likewise, the still-image system could be used for fisheries research to obtain data on size, position, and orientation of fish.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Commercial far-range (> 10 m) methods for collecting infrastructure spatial data are not completely automated. They need a significant amount of manual post-processing work and, in some cases, the equipment costs are significant. This paper presents a method that is the first step of a stereo videogrammetric framework and holds the promise to address these issues. Under this method, video streams are initially collected from a calibrated set of two video cameras. For each pair of simultaneous video frames, visual feature points are detected and their spatial coordinates are then computed. The result, in the form of a sparse 3D point cloud, is the basis for the next steps in the framework (i.e., camera motion estimation and dense 3D reconstruction). A set of data, collected from an ongoing infrastructure project, is used to show the merits of the method. Comparison with existing tools is also shown, to indicate the performance differences of the proposed method in the level of automation and the accuracy of results.
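Computing the spatial coordinates of a matched feature point from a calibrated stereo pair can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes the two frames have already been rectified so that a feature lies on the same image row in both views, and the function name and parameters are hypothetical.

```python
def triangulate_rectified(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """Recover (X, Y, Z) in metres for one feature in a rectified stereo pair.

    u_left, v_left -- pixel coordinates of the feature in the left image
    u_right        -- column of the same feature in the right image
    focal_px       -- focal length in pixels (assumed equal for both cameras)
    baseline_m     -- distance between the two camera centres in metres
    cx, cy         -- principal point of the left camera in pixels
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("feature must have positive disparity")
    z = focal_px * baseline_m / disparity  # depth from similar triangles
    x = (u_left - cx) * z / focal_px
    y = (v_left - cy) * z / focal_px
    return x, y, z


# Hypothetical camera values, for illustration only.
point = triangulate_rectified(
    u_left=700, v_left=360, u_right=650,
    focal_px=1000.0, baseline_m=0.5, cx=640.0, cy=360.0,
)
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant points than for near ones, which is one reason far-range measurement is demanding.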

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Vision-based object detection has been introduced in construction for recognizing and locating construction entities in on-site camera views. It can provide spatial locations of a large number of entities, which is beneficial in large-scale, congested construction sites. However, even a few false detections prevent its practical applications. In resolving this issue, this paper presents a novel hybrid method for locating construction equipment that fuses the function of detection and tracking algorithms. This method detects construction equipment in the video view by taking advantage of entities' motion, shape, and color distribution. Background subtraction, Haar-like features, and eigen-images are used for motion, shape, and color information, respectively. A tracking algorithm steps in the process to make up for the false detections. False detections are identified by catching drastic changes in object size and appearance. The identified false detections are replaced with tracking results. Preliminary experiments show that the combination with tracking has the potential to enhance the detection performance.
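The fusion idea in this abstract, where a tracker's result replaces a detection that changes too abruptly, can be sketched as below. This is a simplified illustration under stated assumptions: it checks only the change in bounding-box area (the paper also uses appearance), boxes are `(x, y, width, height)` tuples with positive area, and the threshold `max_area_ratio` and all names are hypothetical.

```python
def box_area(box):
    """Area of an (x, y, width, height) bounding box."""
    _, _, width, height = box
    return width * height


def fuse_detection_and_tracking(detections, tracks, max_area_ratio=1.5):
    """Per frame, keep the detection unless it is missing or changes size
    drastically relative to the previously accepted box; in that case fall
    back to the tracker's box for that frame."""
    fused = []
    previous = None
    for detection, track in zip(detections, tracks):
        chosen = detection
        if detection is None:
            chosen = track
        elif previous is not None:
            ratio = box_area(detection) / box_area(previous)
            if ratio > max_area_ratio or ratio < 1.0 / max_area_ratio:
                chosen = track  # treat the detection as false
        fused.append(chosen)
        if chosen is not None:
            previous = chosen
    return fused


# Hypothetical sequence: the third detection suddenly quadruples in width and height.
detections = [(0, 0, 10, 10), (1, 0, 10, 10), (2, 0, 40, 40), (3, 0, 10, 10)]
tracks = [(0, 0, 10, 10), (1, 0, 10, 10), (2, 0, 10, 10), (3, 0, 10, 10)]
fused = fuse_detection_and_tracking(detections, tracks)
```

Comparing against the last accepted box rather than the last raw detection keeps a single false detection from corrupting the reference for later frames.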

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Temporal synchronization of multiple video recordings of the same dynamic event is a critical task in many computer vision applications, e.g., novel view synthesis and 3D reconstruction. Typically this information is implied, since recordings are made using the same timebase, or time-stamp information is embedded in the video streams. Recordings using consumer-grade equipment do not contain this information; hence, there is a need to temporally synchronize signals using the visual information itself. Previous work in this area has either assumed good quality data with relatively simple dynamic content or the availability of precise camera geometry. In this paper, we propose a technique which exploits feature trajectories across views in a novel way, and specifically targets the kind of complex content found in consumer-generated sports recordings, without assuming precise knowledge of fundamental matrices or homographies. Our method automatically selects the moving feature points in the two unsynchronized videos whose 2D trajectories can be best related, thereby helping to infer the synchronization index. We evaluate performance using a number of real recordings and show that synchronization can be achieved to within 1 s, which is better than previous approaches. Copyright 2013 ACM.
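To illustrate what inferring a synchronization index means, the sketch below uses a much simpler stand-in than the trajectory-matching method of the paper: it assumes each video has already been reduced to a one-dimensional per-frame motion signal and searches for the frame shift that maximizes their normalized correlation. The function name and the signals are hypothetical.

```python
import math


def best_offset(signal_a, signal_b, max_shift):
    """Return the shift s (in frames) that best aligns a[i] with b[i + s]."""

    def score(shift):
        pairs = [
            (signal_a[i], signal_b[i + shift])
            for i in range(len(signal_a))
            if 0 <= i + shift < len(signal_b)
        ]
        if len(pairs) < 2:
            return float("-inf")
        mean_a = sum(a for a, _ in pairs) / len(pairs)
        mean_b = sum(b for _, b in pairs) / len(pairs)
        cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
        var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
        var_b = sum((b - mean_b) ** 2 for _, b in pairs)
        denom = math.sqrt(var_a * var_b)
        # A constant overlap carries no timing information.
        return cov / denom if denom > 0 else float("-inf")

    return max(range(-max_shift, max_shift + 1), key=score)


# Hypothetical motion signals: the second recording starts three frames earlier,
# so the same burst of motion appears three frames later in it.
video_a = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
video_b = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0]
offset = best_offset(video_a, video_b, max_shift=4)
```

Dividing the recovered frame offset by the frame rate gives the time offset in seconds. A global signal like this is fragile for the complex sports content the paper targets, which is why that work relies on selected feature trajectories instead.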