851 resultados para computer vision face recognition detection voice recognition sistemi biometrici iOS
Resumo:
"C00-1018-1213"--Cover.
Resumo:
Errata sheet inserted.
Resumo:
We introduce a new second-order method of texture analysis called Adaptive Multi-Scale Grey Level Co-occurrence Matrix (AMSGLCM), based on the well-known Grey Level Co-occurrence Matrix (GLCM) method. The method deviates significantly from GLCM in that features are extracted, not via a fixed 2D weighting function of co-occurrence matrix elements, but by a variable summation of matrix elements in 3D localized neighborhoods. We subsequently present a new methodology for extracting optimized, highly discriminant features from these localized areas using adaptive Gaussian weighting functions. Genetic Algorithm (GA) optimization is used to produce a set of features whose classification worth is evaluated by discriminatory power and feature correlation considerations. We critically appraised the performance of our method and GLCM in pairwise classification of images from visually similar texture classes, captured from Markov Random Field (MRF) synthesized, natural, and biological origins. In these cross-validated classification trials, our method demonstrated significant benefits over GLCM, including increased feature discriminatory power, automatic feature adaptability, and significantly improved classification performance.
Resumo:
Beyond the inherent technical challenges, current research into the three dimensional surface correspondence problem is hampered by a lack of uniform terminology, an abundance of application specific algorithms, and the absence of a consistent model for comparing existing approaches and developing new ones. This paper addresses these challenges by presenting a framework for analysing, comparing, developing, and implementing surface correspondence algorithms. The framework uses five distinct stages to establish correspondence between surfaces. It is general, encompassing a wide variety of existing techniques, and flexible, facilitating the synthesis of new correspondence algorithms. This paper presents a review of existing surface correspondence algorithms, and shows how they fit into the correspondence framework. It also shows how the framework can be used to analyse and compare existing algorithms and develop new algorithms using the framework's modular structure. Six algorithms, four existing and two new, are implemented using the framework. Each implemented algorithm is used to match a number of surface pairs. Results demonstrate that the correspondence framework implementations are faithful implementations of existing algorithms, and that powerful new surface correspondence algorithms can be created. (C) 2004 Elsevier Inc. All rights reserved.
Resumo:
This paper defines the 3D reconstruction problem as the process of reconstructing a 3D scene from numerous 2D visual images of that scene. It is well known that this problem is ill-posed, and numerous constraints and assumptions are used in 3D reconstruction algorithms in order to reduce the solution space. Unfortunately, most constraints only work in a certain range of situations and often constraints are built into the most fundamental methods (e.g. Area Based Matching assumes that all the pixels in the window belong to the same object). This paper presents a novel formulation of the 3D reconstruction problem, using a voxel framework and first order logic equations, which does not contain any additional constraints or assumptions. Solving this formulation for a set of input images gives all the possible solutions for that set, rather than picking a solution that is deemed most likely. Using this formulation, this paper studies the problem of uniqueness in 3D reconstruction and how the solution space changes for different configurations of input images. It is found that it is not possible to guarantee a unique solution, no matter how many images are taken of the scene, their orientation or even how much color variation is in the scene itself. Results of using the formulation to reconstruct a few small voxel spaces are also presented. They show that the number of solutions is extremely large for even very small voxel spaces (5 x 5 voxel space gives 10 to 10(7) solutions). This shows the need for constraints to reduce the solution space to a reasonable size. Finally, it is noted that because of the discrete nature of the formulation, the solution space size can be easily calculated, making the formulation a useful tool to numerically evaluate the usefulness of any constraints that are added.
Resumo:
These are the full proceedings of the conference.
Resumo:
Deformable models are an attractive approach to recognizing objects which have considerable within-class variability such as handwritten characters. However, there are severe search problems associated with fitting the models to data which could be reduced if a better starting point for the search were available. We show that by training a neural network to predict how a deformable model should be instantiated from an input image, such improved starting points can be obtained. This method has been implemented for a system that recognizes handwritten digits using deformable models, and the results show that the search time can be significantly reduced without compromising recognition performance. © 1997 Academic Press.
Resumo:
There have been two main approaches to feature detection in human and computer vision - luminance-based and energy-based. Bars and edges might arise from peaks of luminance and luminance gradient respectively, or bars and edges might be found at peaks of local energy, where local phases are aligned across spatial frequency. This basic issue of definition is important because it guides more detailed models and interpretations of early vision. Which approach better describes the perceived positions of elements in a 3-element contour-alignment task? We used the class of 1-D images defined by Morrone and Burr in which the amplitude spectrum is that of a (partially blurred) square wave and Fourier components in a given image have a common phase. Observers judged whether the centre element (eg ±458 phase) was to the left or right of the flanking pair (eg 0º phase). Lateral offset of the centre element was varied to find the point of subjective alignment from the fitted psychometric function. This point shifted systematically to the left or right according to the sign of the centre phase, increasing with the degree of blur. These shifts were well predicted by the location of luminance peaks and other derivative-based features, but not by energy peaks which (by design) predicted no shift at all. These results on contour alignment agree well with earlier ones from a more explicit feature-marking task, and strongly suggest that human vision does not use local energy peaks to locate basic first-order features. [Supported by the Wellcome Trust (ref: 056093)]
Resumo:
A recently proposed colour based tracking algorithm has been established to track objects in real circumstances [Zivkovic, Z., Krose, B. 2004. An EM-like algorithm for color-histogram-based object tracking. In: Proc, IEEE Conf. on Computer Vision and Pattern Recognition, pp. 798-803]. To improve the performance of this technique in complex scenes, in this paper we propose a new algorithm for optimally adapting the ellipse outlining the objects of interest. This paper presents a Lagrangian based method to integrate a regularising component into the covariance matrix to be computed. Technically, we intend to reduce the residuals between the estimated probability distribution and the expected one. We argue that, by doing this, the shape of the ellipse can be properly adapted in the tracking stage. Experimental results show that the proposed method has favourable performance in shape adaption and object localisation.
Resumo:
This paper addresses the problem of obtaining 3d detailed reconstructions of human faces in real-time and with inexpensive hardware. We present an algorithm based on a monocular multi-spectral photometric-stereo setup. This system is known to capture high-detailed deforming 3d surfaces at high frame rates and without having to use any expensive hardware or synchronized light stage. However, the main challenge of such a setup is the calibration stage, which depends on the lights setup and how they interact with the specific material being captured, in this case, human faces. For this purpose we develop a self-calibration technique where the person being captured is asked to perform a rigid motion in front of the camera, maintaining a neutral expression. Rigidity constrains are then used to compute the head's motion with a structure-from-motion algorithm. Once the motion is obtained, a multi-view stereo algorithm reconstructs a coarse 3d model of the face. This coarse model is then used to estimate the lighting parameters with a stratified approach: In the first step we use a RANSAC search to identify purely diffuse points on the face and to simultaneously estimate this diffuse reflectance model. In the second step we apply non-linear optimization to fit a non-Lambertian reflectance model to the outliers of the previous step. The calibration procedure is validated with synthetic and real data.
Resumo:
A recent trend in smart camera networks is that they are able to modify the functionality during runtime to better reflect changes in the observed scenes and in the specified monitoring tasks. In this paper we focus on different configuration methods for such networks. A configuration is given by three components: (i) a description of the camera nodes, (ii) a specification of the area of interest by means of observation points and the associated monitoring activities, and (iii) a description of the analysis tasks. We introduce centralized, distributed and proprioceptive configuration methods and compare their properties and performance. © 2012 IEEE.