916 results for Computer vision
Abstract:
It is possible for the visual attention characteristics of a person to be exploited as a biometric for authentication or identification of individual viewers. The visual attention characteristics of a person can be easily monitored by tracking the gaze of a viewer during the presentation of a known or unknown visual scene. The positions and sequences of gaze locations during viewing may be determined by overt (conscious) or covert (sub-conscious) viewing behaviour. This paper presents a method to authenticate individuals using their covert viewing behaviour, thus yielding a unique behavioural biometric. A method to quantify the spatial and temporal patterns established by the viewer for their covert behaviour is proposed, utilising a principal component analysis technique called 'eigenGaze'. Experimental results suggest that it is possible to capture the unique visual attention characteristics of a person to provide a simple behavioural biometric.
Abstract:
In a clinical setting, pain is reported either through patient self-report or via an observer. Such measures are problematic because they 1) are subjective and 2) give no specific timing information. Coding pain as a series of facial action units (AUs) can avoid these issues as it can be used to gain an objective measure of pain on a frame-by-frame basis. In this paper, using video data from patients with shoulder injuries, we describe an active appearance model (AAM)-based system that can automatically detect the frames in video in which a patient is in pain. This pain data set highlights the many challenges associated with spontaneous emotion detection, particularly that of expression and head movement due to the patient's reaction to pain. We show that the AAM can deal with these movements and can achieve significant improvements in both the AU and pain detection performance compared to the current state-of-the-art approaches, which utilize similarity-normalized appearance features only.
Abstract:
Detection of Regions of Interest (ROIs) in a video leads to more efficient utilization of bandwidth. This is because any ROIs in a given frame can be encoded in higher quality than the rest of that frame, with little or no degradation of quality from the perception of the viewers. Consequently, it is not necessary to uniformly encode the whole video in high quality. One approach to determine ROIs is to use saliency detectors to locate salient regions. This paper proposes a methodology for obtaining ground truth saliency maps to measure the effectiveness of ROI detection by considering the role of user experience during the labelling process of such maps. User perceptions can be captured and incorporated into the definition of salience in a particular video, taking advantage of human visual recall within a given context. Experiments with two state-of-the-art saliency detectors demonstrate the effectiveness of this approach to validating visual saliency in video. This paper will provide the relevant datasets associated with the experiments.
Robust mean super-resolution for less cooperative NIR iris recognition at a distance and on the move
Abstract:
Less cooperative iris identification systems at a distance and on the move often suffer from poor resolution. The lack of pixel resolution significantly degrades the iris recognition performance. Super-resolution has been considered to enhance resolution of iris images. This paper proposes a pixelwise super-resolution technique to reconstruct a high resolution iris image from a video sequence of an eye. A novel fusion approach is proposed to incorporate information details from multiple frames using a robust mean. Experiments on the MBGC NIR portal database show the validity of the proposed approach in comparison with other resolution enhancement techniques.
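The pixelwise fusion step can be illustrated with a minimal sketch. The abstract does not define its robust mean, so a trimmed mean is assumed here as one common robust estimator; the frames are also assumed to be already registered and equally sized, and the names `trimmed_mean`, `fuse_frames` and `trim_fraction` are hypothetical.

```python
from statistics import mean


def trimmed_mean(values, trim_fraction=0.2):
    """Mean after dropping the lowest and highest trim_fraction of the values.

    Assumption: a trimmed mean stands in for the paper's robust mean.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    # Fall back to the plain mean if trimming would remove every value.
    kept = ordered[k:len(ordered) - k] if len(ordered) > 2 * k else ordered
    return mean(kept)


def fuse_frames(frames, trim_fraction=0.2):
    """Fuse registered frames (lists of rows of intensities) pixel by pixel."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [trimmed_mean([f[r][c] for f in frames], trim_fraction)
         for c in range(width)]
        for r in range(height)
    ]
```

With five frames in which one pixel is corrupted by a specular highlight (values 10, 10, 10, 10, 255), the fused value stays at 10, whereas a plain mean would be pulled up to 59.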
Abstract:
Eigen-based techniques and other monolithic approaches to face recognition have long been a cornerstone in the face recognition community due to the high dimensionality of face images. Eigen-face techniques provide minimal reconstruction error and limit high-frequency content while linear discriminant-based techniques (fisher-faces) allow the construction of subspaces which preserve discriminatory information. This paper presents a frequency decomposition approach for improved face recognition performance utilising three well-known techniques: Wavelets; Gabor / Log-Gabor; and the Discrete Cosine Transform. Experimentation illustrates that frequency domain partitioning prior to dimensionality reduction increases the information available for classification and greatly increases face recognition performance for both eigen-face and fisher-face approaches.
Abstract:
This paper proposes a semi-supervised intelligent visual surveillance system to exploit the information from multi-camera networks for the monitoring of people and vehicles. Modules are proposed to perform critical surveillance tasks including: the management and calibration of cameras within a multi-camera network; tracking of objects across multiple views; recognition of people utilising biometrics and in particular soft-biometrics; the monitoring of crowds; and activity recognition. Recent advances in these computer vision modules and capability gaps in surveillance technology are also highlighted.
Abstract:
We describe a novel two-stage approach to object localization and tracking using a network of wireless cameras and a mobile robot. In the first stage, a robot travels through the camera network while updating its position in a global coordinate frame which it broadcasts to the cameras. The cameras use this information, along with the image plane location of the robot, to compute a mapping from their image planes to the global coordinate frame. This is combined with an occupancy map generated by the robot during the mapping process to track the objects. We present results with a nine-node indoor camera network to demonstrate that this approach is feasible and offers an acceptable level of accuracy in terms of object locations.
Abstract:
Object identification and tracking have become critical for automated on-site construction safety assessment. The primary objective of this paper is to present the development of a testbed to analyze the impact of object identification and tracking errors caused by data collection devices and algorithms used for safety assessment. The testbed models workspaces for earthmoving operations and simulates safety-related violations, including speed limit violations, access violations to dangerous areas, and close proximity violations between heavy machinery. Three different cases were analyzed based on actual earthmoving operations conducted at a limestone quarry. Using the testbed, the impacts of device and algorithm errors were investigated for safety planning purposes.
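The three violation types named above can be sketched as simple geometric checks on tracked positions. This is an illustrative sketch only, not the testbed described in the abstract: the `Track` record, the rectangular danger zone and all thresholds are assumptions introduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical tracked object in site coordinates."""
    name: str
    x: float      # metres
    y: float      # metres
    speed: float  # metres per second


def find_violations(tracks, speed_limit, danger_zone, min_separation):
    """Return speed, access and proximity violations for one time step.

    danger_zone is an axis-aligned rectangle (xmin, ymin, xmax, ymax).
    """
    xmin, ymin, xmax, ymax = danger_zone
    violations = []
    for t in tracks:
        if t.speed > speed_limit:
            violations.append(("speed", t.name))
        if xmin <= t.x <= xmax and ymin <= t.y <= ymax:
            violations.append(("access", t.name))
    # Check every unordered pair of objects once.
    for i, a in enumerate(tracks):
        for b in tracks[i + 1:]:
            if math.hypot(a.x - b.x, a.y - b.y) < min_separation:
                violations.append(("proximity", a.name, b.name))
    return violations
```

Running the same checks on positions perturbed by a device or algorithm error model, then comparing the result against the unperturbed output, is one way to see how such errors produce missed or false violations.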
Abstract:
Video surveillance technology, based on Closed Circuit Television (CCTV) cameras, is one of the fastest growing markets in the field of security technologies. However, the existing video surveillance systems are still not at a stage where they can be used for crime prevention. The systems rely heavily on human observers and are therefore limited by factors such as fatigue and monitoring capabilities over long periods of time. To overcome this limitation, it is necessary to have “intelligent” processes which are able to highlight the salient data and filter out normal conditions that do not pose a threat to security. In order to create such intelligent systems, an understanding of human behaviour, specifically suspicious behaviour, is required. One of the challenges in achieving this is that human behaviour can only be understood correctly in the context in which it appears. Although context has been exploited in the general computer vision domain, it has not been widely used in the automatic suspicious behaviour detection domain. It is therefore essential that context be formulated, stored and used by the system in order to understand human behaviour. Finally, since surveillance systems could be modeled as large-scale data stream systems, it is difficult to have a complete knowledge base. In this case, the systems need to not only continuously update their knowledge but also be able to retrieve the extracted information which is related to the given context. To address these issues, a context-based approach for detecting suspicious behaviour is proposed. In this approach, contextual information is exploited in order to improve detection. The proposed approach utilises a data stream clustering algorithm in order to discover the behaviour classes and their frequencies of occurrence from the incoming behaviour instances. Contextual information is then used in addition to the above information to detect suspicious behaviour.
The proposed approach is able to detect observed, unobserved and contextual suspicious behaviour. Two case studies using video feeds taken from the CAVIAR dataset and the Z-block building, Queensland University of Technology, are presented in order to test the proposed approach. From these experiments, it is shown that by using information about context, the proposed system is able to make more accurate detections, especially of those behaviours which are only suspicious in some contexts while being normal in others. Moreover, this information gives critical feedback to the system designers to refine the system. Finally, the proposed modified Clustream algorithm enables the system to both continuously update the system’s knowledge and to effectively retrieve the information learned in a given context. The outcomes from this research are: (a) A context-based framework for automatically detecting suspicious behaviour which can be used by an intelligent video surveillance system in making decisions; (b) A modified Clustream data stream clustering algorithm which continuously updates the system knowledge and is able to retrieve contextually related information effectively; and (c) An update-describe approach which extends the capability of the existing human local motion features called interest points based features to the data stream environment.
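The idea of discovering behaviour classes and their frequencies from a stream can be sketched with a simplified micro-cluster update. This is not the modified Clustream algorithm of the thesis, whose details the abstract does not give: the fixed `radius` threshold and the names `MicroCluster` and `update` are assumptions, and the temporal statistics and cluster deletion or merging found in stream clustering algorithms are omitted.

```python
import math


class MicroCluster:
    """Additive summary of the points absorbed so far (count and sums)."""

    def __init__(self, point):
        self.n = 1
        self.linear_sum = list(point)
        self.squared_sum = [v * v for v in point]

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, point):
        # The summary is updated incrementally; raw points are not stored.
        self.n += 1
        for i, v in enumerate(point):
            self.linear_sum[i] += v
            self.squared_sum[i] += v * v


def update(clusters, point, radius):
    """Absorb point into the nearest cluster within radius, else start a new one.

    Returns the index of the cluster that received the point.
    """
    best, best_dist = None, float("inf")
    for idx, cluster in enumerate(clusters):
        dist = math.dist(cluster.centroid(), point)
        if dist < best_dist:
            best, best_dist = idx, dist
    if best is not None and best_dist <= radius:
        clusters[best].absorb(point)
        return best
    clusters.append(MicroCluster(point))
    return len(clusters) - 1
```

In this sketch the count `n` of each cluster plays the role of a behaviour class frequency: a feature vector that opens a new cluster, or lands in one with a very low count, is a candidate for rare behaviour that context would then confirm or dismiss.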
Abstract:
Automated visual surveillance of crowds is a rapidly growing area of research. In this paper we focus on motion representation for the purpose of abnormality detection in crowded scenes. We propose a novel visual representation called textures of optical flow. The proposed representation measures the uniformity of a flow field in order to detect anomalous objects such as bicycles, vehicles and skateboarders, and can be combined with spatial information to detect other forms of abnormality. We demonstrate that the proposed approach outperforms state-of-the-art anomaly detection algorithms on a large, publicly available dataset.
Abstract:
While using unmanned systems in combat is not new, what will be new in the foreseeable future is how such systems are used and integrated in the civilian space. The potential use of Unmanned Aerial Vehicles (UAVs) in civil and commercial applications is becoming a fact, and is receiving considerable attention from industry and the research community. The majority of UAVs performing civilian tasks are restricted to flying only in segregated space, and not within the National Airspace. The areas that UAVs are restricted to flying in are typically not above populated areas, which in turn are the areas most useful for civilian applications. The current restrictions exist mainly because current UAV technologies are not able to demonstrate an Equivalent Level of Safety to manned aircraft, particularly in the case of an engine failure which would require an emergency or forced landing. This chapter will present and guide the reader through a number of developments that would facilitate the integration of UAVs into the National Airspace. Algorithms for UAV Sense-and-Avoid and Forced Landings are recognized as two major enabling technologies that will allow the integration of UAVs in the civilian airspace. The following sections will describe some of the techniques that are currently being tested at the Australian Research Centre for Aerospace Automation (ARCAA), which places emphasis on the detection of candidate landing sites using computer vision, the planning of the descent path trajectory for the UAV, and the decision making process behind the selection of the final landing site.
Abstract:
Coral reefs are biologically complex ecosystems that support a wide variety of marine organisms. These are fragile communities under enormous threat from natural and human-based influences. Properly assessing and measuring the growth and health of reefs is essential to understanding impacts of ocean acidification, coastal urbanisation and global warming. In this paper, we present an innovative 3-D reconstruction technique based on visual imagery as a non-intrusive, repeatable, in situ method for estimating physical parameters, such as surface area and volume for efficient assessment of long-term variability. The reconstruction algorithms are presented, and benchmarked using an existing data set. We validate the technique underwater, utilising a commercial-off-the-shelf camera and a piece of staghorn coral, Acropora cervicornis. The resulting reconstruction is compared with a laser scan of the coral piece for assessment and validation. The comparison shows that 77% of the pixels in the reconstruction are within 0.3 mm of the ground truth laser scan. Reconstruction results from an unknown video camera are also presented as a segue to future applications of this research.
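An accuracy figure of this kind (the share of the reconstruction lying within a tolerance of the laser scan) can be computed as in the sketch below. The abstract does not say how the paper aligned the two models or measured distance, so this assumes both are already aligned 3-D point sets in millimetres and uses a brute-force nearest-neighbour search; `fraction_within` is a hypothetical name.

```python
import math


def fraction_within(reconstructed, reference, tolerance_mm):
    """Fraction of reconstructed points within tolerance of the reference scan.

    Both inputs are sequences of (x, y, z) points in millimetres and are
    assumed to be registered to the same coordinate frame.
    """
    if not reconstructed or not reference:
        raise ValueError("both point sets must be non-empty")
    hits = sum(
        1 for p in reconstructed
        if min(math.dist(p, q) for q in reference) <= tolerance_mm
    )
    return hits / len(reconstructed)
```

The brute-force search costs one distance evaluation per pair of points, which is fine for a small illustration; a spatial index such as a k-d tree would be the usual choice for full-size scans.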