22 resultados para Side view gait recognition

em Massachusetts Institute of Technology


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes a representation of the dynamics of human walking action for the purpose of person identification and classification by gait appearance. Our gait representation is based on simple features such as moments extracted from video silhouettes of human walking motion. We claim that our gait dynamics representation is rich enough for the task of recognition and classification. The use of our feature representation is demonstrated in the task of person recognition from video sequences of orthogonal views of people walking. We demonstrate the accuracy of recognition on gait video sequences collected over different days and times, and under varying lighting environments. In addition, preliminary results are shown on gender classification using our gait dynamics features.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis describes a representation of gait appearance for the purpose of person identification and classification. This gait representation is based on simple localized image features such as moments extracted from orthogonal view video silhouettes of human walking motion. A suite of time-integration methods, spanning a range of coarseness of time aggregation and modeling of feature distributions, are applied to these image features to create a suite of gait sequence representations. Despite their simplicity, the resulting feature vectors contain enough information to perform well on human identification and gender classification tasks. We demonstrate the accuracy of recognition on gait video sequences collected over different days and times and under varying lighting environments. Each of the integration methods are investigated for their advantages and disadvantages. An improved gait representation is built based on our experiences with the initial set of gait representations. In addition, we show gender classification results using our gait appearance features, the effect of our heuristic feature selection method, and the significance of individual features.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Poggio and Vetter (1992) showed that learning one view of a bilaterally symmetric object could be sufficient for its recognition, if this view allows the computation of a symmetric, "virtual," view. Faces are roughly bilaterally symmetric objects. Learning a side-view--which always has a symmetric view--should allow for better generalization performances than learning the frontal view. Two psychophysical experiments tested these predictions. Stimuli were views of shaded 3D models of laser-scanned faces. The first experiment tested whether a particular view of a face was canonical. The second experiment tested which single views of a face give rise to best generalization performances. The results were compatible with the symmetry hypothesis: Learning a side view allowed better generalization performances than learning the frontal view.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper describes the main features of a view-based model of object recognition. The model tries to capture general properties to be expected in a biological architecture for object recognition. The basic module is a regularization network in which each of the hidden units is broadly tuned to a specific view of the object to be recognized.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

If we are provided a face database with only one example view per person, is it possible to recognize new views of them under a variety of different poses, especially views rotated in depth from the original example view? We investigate using prior knowledge about faces plus each single example view to generate virtual views of each person, or views of the face as seen from different poses. Prior knowledge of faces is represented in an example-based way, using 2D views of a prototype face seen rotating in depth. The synthesized virtual views are evaluated as example views in a view-based approach to pose-invariant face recognition. They are shown to improve the recognition rate over the scenario where only the single real view is used.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A persistent issue of debate in the area of 3D object recognition concerns the nature of the experientially acquired object models in the primate visual system. One prominent proposal in this regard has expounded the use of object centered models, such as representations of the objects' 3D structures in a coordinate frame independent of the viewing parameters [Marr and Nishihara, 1978]. In contrast to this is another proposal which suggests that the viewing parameters encountered during the learning phase might be inextricably linked to subsequent performance on a recognition task [Tarr and Pinker, 1989; Poggio and Edelman, 1990]. The 'object model', according to this idea, is simply a collection of the sample views encountered during training. Given that object centered recognition strategies have the attractive feature of leading to viewpoint independence, they have garnered much of the research effort in the field of computational vision. Furthermore, since human recognition performance seems remarkably robust in the face of imaging variations [Ellis et al., 1989], it has often been implicitly assumed that the visual system employs an object centered strategy. In the present study we examine this assumption more closely. Our experimental results with a class of novel 3D structures strongly suggest the use of a view-based strategy by the human visual system even when it has the opportunity of constructing and using object-centered models. In fact, for our chosen class of objects, the results seem to support a stronger claim: 3D object recognition is 2D view-based.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We describe a psychophysical investigation of the effects of object complexity and familiarity on the variation of recognition time and recognition accuracy over different views of novel 3D objects. Our findings indicate that with practice the response times for different views become more uniform and the initially orderly dependency of the response time on the distance to a "good" view disappears. One possible interpretation of our results is in terms of a tradeoff between memory needed for storing specific-view representations of objects and time spent in recognizing the objects.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We report a series of psychophysical experiments that explore different aspects of the problem of object representation and recognition in human vision. Contrary to the paradigmatic view which holds that the representations are three-dimensional and object-centered, the results consistently support the notion of view-specific representations that include at most partial depth information. In simulated experiments that involved the same stimuli shown to the human subjects, computational models built around two-dimensional multiple-view representations replicated our main psychophysical results, including patterns of generalization errors and the time course of perceptual learning.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Similarity measurements between 3D objects and 2D images are useful for the tasks of object recognition and classification. We distinguish between two types of similarity metrics: metrics computed in image-space (image metrics) and metrics computed in transformation-space (transformation metrics). Existing methods typically use image and the nearest view of the object. Example for such a measure is the Euclidean distance between feature points in the image and corresponding points in the nearest view. (Computing this measure is equivalent to solving the exterior orientation calibration problem.) In this paper we introduce a different type of metrics: transformation metrics. These metrics penalize for the deformatoins applied to the object to produce the observed image. We present a transformation metric that optimally penalizes for "affine deformations" under weak-perspective. A closed-form solution, together with the nearest view according to this metric, are derived. The metric is shown to be equivalent to the Euclidean image metric, in the sense that they bound each other from both above and below. For Euclidean image metric we offier a sub-optimal closed-form solution and an iterative scheme to compute the exact solution.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A scheme for recognizing 3D objects from single 2D images is introduced. The scheme proceeds in two stages. In the first stage, the categorization stage, the image is compared to prototype objects. For each prototype, the view that most resembles the image is recovered, and, if the view is found to be similar to the image, the class identity of the object is determined. In the second stage, the identification stage, the observed object is compared to the individual models of its class, where classes are expected to contain objects with relatively similar shapes. For each model, a view that matches the image is sought. If such a view is found, the object's specific identity is determined. The advantage of categorizing the object before it is identified is twofold. First, the image is compared to a smaller number of models, since only models that belong to the object's class need to be considered. Second, the cost of comparing the image to each model in a classis very low, because correspondence is computed once for the whoel class. More specifically, the correspondence and object pose computed in the categorization stage to align the prototype with the image are reused in the identification stage to align the individual models with the image. As a result, identification is reduced to a series fo simple template comparisons. The paper concludes with an algorithm for constructing optimal prototypes for classes of objects.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the general case, a trilinear relationship between three perspective views is shown to exist. The trilinearity result is shown to be of much practical use in visual recognition by alignment --- yielding a direct method that cuts through the computations of camera transformation, scene structure and epipolar geometry. The proof of the central result may be of further interest as it demonstrates certain regularities across homographies of the plane and introduces new view invariants. Experiments on simulated and real image data were conducted, including a comparative analysis with epipolar intersection and the linear combination methods, with results indicating a greater degree of robustness in practice and a higher level of performance in re-projection tasks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

How does the brain recognize three-dimensional objects? We trained monkeys to recognize computer rendered objects presented from an arbitrarily chosen training view, and subsequently tested their ability to generalize recognition for other views. Our results provide additional evidence in favor of with a recognition model that accomplishes view-invariant performance by storing a limited number of object views or templates together with the capacity to interpolate between the templates (Poggio and Edelman, 1990).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The inferior temporal cortex (IT) of monkeys is thought to play an essential role in visual object recognition. Inferotemporal neurons are known to respond to complex visual stimuli, including patterns like faces, hands, or other body parts. What is the role of such neurons in object recognition? The present study examines this question in combined psychophysical and electrophysiological experiments, in which monkeys learned to classify and recognize novel visual 3D objects. A population of neurons in IT were found to respond selectively to such objects that the monkeys had recently learned to recognize. A large majority of these cells discharged maximally for one view of the object, while their response fell off gradually as the object was rotated away from the neuron"s preferred view. Most neurons exhibited orientation-dependent responses also during view-plane rotations. Some neurons were found tuned around two views of the same object, while a very small number of cells responded in a view- invariant manner. For five different objects that were extensively used during the training of the animals, and for which behavioral performance became view-independent, multiple cells were found that were tuned around different views of the same object. No selective responses were ever encountered for views that the animal systematically failed to recognize. The results of our experiments suggest that neurons in this area can develop a complex receptive field organization as a consequence of extensive training in the discrimination and recognition of objects. Simple geometric features did not appear to account for the neurons" selective responses. These findings support the idea that a population of neurons -- each tuned to a different object aspect, and each showing a certain degree of invariance to image transformations -- may, as an assembly, encode complex 3D objects. In such a system, several neurons may be active for any given vantage point, with a single unit acting like a blurred template for a limited neighborhood of a single view.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The problem of automatic face recognition is to visually identify a person in an input image. This task is performed by matching the input face against the faces of known people in a database of faces. Most existing work in face recognition has limited the scope of the problem, however, by dealing primarily with frontal views, neutral expressions, and fixed lighting conditions. To help generalize existing face recognition systems, we look at the problem of recognizing faces under a range of viewpoints. In particular, we consider two cases of this problem: (i) many example views are available of each person, and (ii) only one view is available per person, perhaps a driver's license or passport photograph. Ideally, we would like to address these two cases using a simple view-based approach, where a person is represented in the database by using a number of views on the viewing sphere. While the view-based approach is consistent with case (i), for case (ii) we need to augment the single real view of each person with synthetic views from other viewpoints, views we call 'virtual views'. Virtual views are generated using prior knowledge of face rotation, knowledge that is 'learned' from images of prototype faces. This prior knowledge is used to effectively rotate in depth the single real view available of each person. In this thesis, I present the view-based face recognizer, techniques for synthesizing virtual views, and experimental results using real and virtual views in the recognizer.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An investigation is made into the problem of constructing a model of the appearance to an optical input device of scenes consisting of plane-faced geometric solids. The goal is to study algorithms which find the real straight edges in the scenes, taking into account smooth variations in intensity over faces of the solids, blurring of edges and noise. A general mathematical analysis is made of optimal methods for identifying the edge lines in figures, given a raster of intensities covering the entire field of view. There is given in addition a suboptimal statistical decision procedure, based on the model, for the identification of a line within a narrow band on the field of view given an array of intensities from within the band. A computer program has been written and extensively tested which implements this procedure and extracts lines from real scenes. Other programs were written which judge the completeness of extracted sets of lines, and propose and test for additional lines which had escaped initial detection. The performance of these programs is discussed in relation to the theory derived from the model, and with regard to their use of global information in detecting and proposing lines.