37 resultados para COMPUTER VISION
Resumo:
The technique of constructing a transformation, or regrading, of a discrete data set such that the histogram of the transformed data matches a given reference histogram is commonly known as histogram modification. The technique is widely used for image enhancement and normalization. A method which has been previously derived for producing such a regrading is shown to be “best” in the sense that it minimizes the error between the cumulative histogram of the transformed data and that of the given reference function, over all single-valued, monotone, discrete transformations of the data. Techniques for smoothed regrading, which provide a means of balancing the error in matching a given reference histogram against the information lost with respect to a linear transformation are also examined. The smoothed regradings are shown to optimize certain cost functionals. Numerical algorithms for generating the smoothed regradings, which are simple and efficient to implement, are described, and practical applications to the processing of LANDSAT image data are discussed.
Resumo:
Analysis of human behaviour through visual information has been a highly active research topic in the computer vision community. This was previously achieved via images from a conventional camera, but recently depth sensors have made a new type of data available. This survey starts by explaining the advantages of depth imagery, then describes the new sensors that are available to obtain it. In particular, the Microsoft Kinect has made high-resolution real-time depth cheaply available. The main published research on the use of depth imagery for analysing human activity is reviewed. Much of the existing work focuses on body part detection and pose estimation. A growing research area addresses the recognition of human actions. The publicly available datasets that include depth imagery are listed, as are the software libraries that can acquire it from a sensor. This survey concludes by summarising the current state of work on this topic, and pointing out promising future research directions.
Resumo:
The current state of the art and direction of research in computer vision aimed at automating the analysis of CCTV images is presented. This includes low level identification of objects within the field of view of cameras, following those objects over time and between cameras, and the interpretation of those objects’ appearance and movements with respect to models of behaviour (and therefore intentions inferred). The potential ethical problems (and some potential opportunities) such developments may pose if and when deployed in the real world are presented, and suggestions made as to the necessary new regulations which will be needed if such systems are not to further enhance the power of the surveillers against the surveilled.
Resumo:
This paper presents a neuroscience inspired information theoretic approach to motion segmentation. Robust motion segmentation represents a fundamental first stage in many surveillance tasks. As an alternative to widely adopted individual segmentation approaches, which are challenged in different ways by imagery exhibiting a wide range of environmental variation and irrelevant motion, this paper presents a new biologically-inspired approach which computes the multivariate mutual information between multiple complementary motion segmentation outputs. Performance evaluation across a range of datasets and against competing segmentation methods demonstrates robust performance.
Resumo:
The 3D shape of an object and its 3D location have traditionally thought of as very separate entities, although both can be described within a single 3D coordinate frame. Here, 3D shape and location are considered as two aspects of a view-based approach to representing depth, avoiding the use of 3D coordinate frames.
Resumo:
Sparse coding aims to find a more compact representation based on a set of dictionary atoms. A well-known technique looking at 2D sparsity is the low rank representation (LRR). However, in many computer vision applications, data often originate from a manifold, which is equipped with some Riemannian geometry. In this case, the existing LRR becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to applications. In this paper, we generalize the LRR over the Euclidean space to the LRR model over a specific Rimannian manifold—the manifold of symmetric positive matrices (SPD). Experiments on several computer vision datasets showcase its noise robustness and superior performance on classification and segmentation compared with state-of-the-art approaches.
Resumo:
For many tasks, such as retrieving a previously viewed object, an observer must form a representation of the world at one location and use it at another. A world-based 3D reconstruction of the scene built up from visual information would fulfil this requirement, something computer vision now achieves with great speed and accuracy. However, I argue that it is neither easy nor necessary for the brain to do this. I discuss biologically plausible alternatives, including the possibility of avoiding 3D coordinate frames such as ego-centric and world-based representations. For example, the distance, slant and local shape of surfaces dictate the propensity of visual features to move in the image with respect to one another as the observer’s perspective changes (through movement or binocular viewing). Such propensities can be stored without the need for 3D reference frames. The problem of representing a stable scene in the face of continual head and eye movements is an appropriate starting place for understanding the goal of 3D vision, more so, I argue, than the case of a static binocular observer.