36 results for Invariant Object Recognition

at Deakin Research Online - Australia


Relevance:

100.00%

Publisher:

Abstract:

Traditional methods of object recognition are reliant on shape and so are very difficult to apply in cluttered, wide-angle and low-detail views such as surveillance scenes. To address this, a method of indirect object recognition is proposed, where human activity is used to infer both the location and identity of objects. No shape analysis is necessary. The concept is dubbed 'interaction signatures', since the premise is that a human will interact with objects in ways characteristic of the function of that object - for example, a person sits in a chair and drinks from a cup. The human-centred approach means that recognition is possible in low-detail views and is largely invariant to the shape of objects within the same functional class. This paper implements a Bayesian network for classifying region patches with object labels, building upon our previous work in automatically segmenting and recognising a human's interactions with the objects. Experiments show that interaction signatures can successfully find and label objects in low-detail views and are equally effective at recognising test objects that differ markedly in appearance from the training objects.
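
The evidence-accumulation idea behind interaction signatures can be sketched as a simple Bayesian update over candidate object labels for a region patch. The interaction vocabulary and likelihood table below are illustrative assumptions for the sketch, not the paper's actual network structure or learnt parameters:

```python
# Hedged sketch: accumulating interaction evidence for region labels.
# The likelihood values and interaction names are made up for illustration.
import math

# P(interaction | object class) -- assumed values, not from the paper.
LIKELIHOOD = {
    "chair": {"sit": 0.7, "drink": 0.05, "walk_past": 0.25},
    "cup":   {"sit": 0.05, "drink": 0.8, "walk_past": 0.15},
    "floor": {"sit": 0.1, "drink": 0.1, "walk_past": 0.8},
}

def classify_region(observed_interactions, prior=None):
    """Posterior over object labels for one region patch, given a
    sequence of observed human interactions at that region."""
    classes = list(LIKELIHOOD)
    log_post = {c: (math.log(prior[c]) if prior else 0.0) for c in classes}
    for act in observed_interactions:
        for c in classes:
            log_post[c] += math.log(LIKELIHOOD[c][act])
    # normalise back to probabilities
    m = max(log_post.values())
    unnorm = {c: math.exp(v - m) for c, v in log_post.items()}
    z = sum(unnorm.values())
    return {c: p / z for c, p in unnorm.items()}

# People repeatedly sit at this region, so evidence for 'chair' builds up.
posterior = classify_region(["sit", "sit", "walk_past"])
best = max(posterior, key=posterior.get)
```

Repeated interactions sharpen the posterior, which matches the paper's premise that evidence for an object's location and label accumulates over time.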

Relevance:

100.00%

Publisher:

Abstract:

Over the course of the last decade, infrared (IR) and particularly thermal IR imaging based face recognition has emerged as a promising complement to conventional, visible-spectrum-based approaches which continue to struggle when applied in practice. While inherently insensitive to visible spectrum illumination changes, IR data introduces specific challenges of its own, most notably sensitivity to factors which affect facial heat emission patterns, e.g. emotional state, ambient temperature, and alcohol intake. In addition, facial expression and pose changes are more difficult to correct in IR images because they are less rich in high-frequency detail, which is an important cue for fitting any deformable model. In this paper we describe a novel method which addresses these major challenges. Specifically, when comparing two thermal IR images of faces, we mutually normalize their poses and facial expressions by using an active appearance model (AAM) to generate synthetic images of the two faces with a neutral facial expression and in the same view (the average of the two input views). This is achieved by piecewise affine warping which follows AAM fitting. A major contribution of our work is the use of an AAM ensemble in which each AAM is specialized to a particular range of poses and a particular region of the thermal IR face space. Combined with the contributions from our previous work which addressed the problem of reliable AAM fitting in the thermal IR spectrum, and the development of a person-specific representation robust to transient changes in the pattern of facial temperature emissions, the proposed ensemble framework accurately matches faces across the full range of yaw from frontal to profile, even in the presence of scale variation (e.g. due to the varying distance of a subject from the camera).
The effectiveness of the proposed approach is demonstrated on the largest public database of thermal IR images of faces and a newly acquired data set of thermal IR motion videos. Our approach achieved perfect recognition performance on both data sets, significantly outperforming the current state of the art methods even when they are trained with multiple images spanning a range of head views.
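
The piecewise affine warping that follows AAM fitting reduces, per mesh triangle, to finding the exact affine map that carries one triangle's vertices onto its counterpart's. A minimal numpy sketch of that elementary step, with made-up coordinates (the real method applies one such map per triangle of the fitted mesh):

```python
import numpy as np

def triangle_affine(src, dst):
    """Exact affine map taking 3 source points to 3 destination points;
    one such map per mesh triangle yields a piecewise affine warp.
    src, dst: 3x2 arrays of (x, y) coordinates."""
    # Solve [x y 1] @ M = [x' y'] for the 3x2 matrix M.
    A = np.hstack([np.asarray(src, float), np.ones((3, 1))])
    return np.linalg.solve(A, np.asarray(dst, float))

def warp_points(points, M):
    """Apply the affine map M to an Nx2 array of points."""
    P = np.asarray(points, float)
    return np.hstack([P, np.ones((len(P), 1))]) @ M

# Illustrative triangles: destination is the source scaled by 2,
# then translated by (2, 1).
src = [[0, 0], [1, 0], [0, 1]]
dst = [[2, 1], [4, 1], [2, 3]]
M = triangle_affine(src, dst)
warped = warp_points([[0.5, 0.5]], M)
```

In a full warp, every pixel inside a triangle is mapped with that triangle's own M, which is what makes the overall transform piecewise rather than global.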

Relevance:

100.00%

Publisher:

Abstract:

Illumination and pose invariance are the most challenging aspects of face recognition. In this paper we describe a fully automatic face recognition system that uses video information to achieve illumination and pose robustness. In the proposed method, highly nonlinear manifolds of face motion are approximated using three Gaussian pose clusters. Pose robustness is achieved by comparing the corresponding pose clusters and probabilistically combining the results to derive a measure of similarity between two manifolds. Illumination is normalized on a per-pose basis. Region-based gamma intensity correction is used to correct for coarse illumination changes, while further refinement is achieved by combining a learnt linear manifold of illumination variation with constraints on face pattern distribution, derived from video. Comparative experimental evaluation is presented and the proposed method is shown to greatly outperform state-of-the-art algorithms. Consistent recognition rates of 94-100% are achieved across dramatic changes in illumination.
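
The coarse illumination-normalization step can be illustrated by choosing, per region, a gamma that maps the region's mean intensity to a canonical value. The grid partition and target mean below are assumptions for the sketch, not the paper's exact regions; the mapping is exact for a uniform region and approximate otherwise:

```python
import numpy as np

def gamma_correct_region(region, target_mean=0.5):
    """Choose gamma so the region's mean intensity m maps to target_mean,
    i.e. m ** gamma == target_mean; intensities assumed in (0, 1)."""
    m = float(np.mean(region))
    gamma = np.log(target_mean) / np.log(m)
    return region ** gamma

def region_gamma_correct(image, grid=(2, 2), target_mean=0.5):
    """Gamma-correct each cell of a coarse grid independently -- a simple
    illustrative stand-in for region-based gamma intensity correction."""
    out = image.astype(float).copy()
    h, w = image.shape
    gh, gw = grid
    for i in range(gh):
        for j in range(gw):
            r = slice(i * h // gh, (i + 1) * h // gh)
            c = slice(j * w // gw, (j + 1) * w // gw)
            out[r, c] = gamma_correct_region(out[r, c], target_mean)
    return out

dark = np.full((4, 4), 0.2)          # an under-lit face region
corrected = region_gamma_correct(dark)
```

Correcting per region rather than globally is what lets the method handle illumination that varies across the face, e.g. strong side lighting.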

Relevance:

90.00%

Publisher:

Abstract:

The problem of object recognition is of immense practical importance and potential, and the last decade has witnessed a number of breakthroughs in the state of the art. Most of the past object recognition work focuses on textured objects and local appearance descriptors extracted around salient points in an image. These methods fail in the matching of smooth, untextured objects for which salient point detection does not produce robust results. The recently proposed bag of boundaries (BoB) method is the first to directly address this problem. Since the texture of smooth objects is largely uninformative, BoB focuses on describing and matching objects based on their post-segmentation boundaries. Herein we address three major weaknesses of this work. The first of these is the uniform treatment of all boundary segments. Instead, we describe a method for detecting the locations and scales of salient boundary segments. Secondly, while the BoB method uses an image based elementary descriptor (HoGs + occupancy matrix), we propose a more compact descriptor based on the local profile of boundary normals’ directions. Lastly, we conduct a far more systematic evaluation, both of the bag of boundaries method and the method proposed here. Using a large public database, we demonstrate that our method exhibits greater robustness while at the same time achieving a major computational saving – object representation is extracted from an image in only 6% of the time needed to extract a bag of boundaries, and the storage requirement is similarly reduced to less than 8%.
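
The flavour of a descriptor built from boundary normals' directions can be sketched as follows: estimate the normal at each vertex of an ordered closed boundary from central-difference tangents, then histogram the directions. This is an illustrative simplification of the paper's local-profile descriptor, not its actual construction:

```python
import numpy as np

def boundary_normal_angles(points):
    """Normal direction at each vertex of a closed boundary, estimated
    from the central-difference tangent; points: Nx2 array in order."""
    P = np.asarray(points, float)
    tangents = np.roll(P, -1, axis=0) - np.roll(P, 1, axis=0)
    # Rotate each tangent by 90 degrees to obtain a normal.
    normals = np.stack([-tangents[:, 1], tangents[:, 0]], axis=1)
    return np.arctan2(normals[:, 1], normals[:, 0])

def normal_profile_descriptor(points, n_bins=8):
    """Normalised histogram of boundary-normal directions: a compact,
    illustrative stand-in for the descriptor sketched in the abstract."""
    ang = boundary_normal_angles(points)
    hist, _ = np.histogram(ang, bins=n_bins, range=(-np.pi, np.pi))
    return hist / hist.sum()

# A square boundary has exactly four distinct normal directions.
square = [[0, 0], [1, 0], [1, 1], [0, 1]]
desc = normal_profile_descriptor(square)
```

Because the descriptor depends only on boundary geometry, it stays meaningful for the smooth, untextured objects on which salient-point methods fail.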

Relevance:

90.00%

Publisher:

Abstract:

Linear subspace representations of appearance variation are pervasive in computer vision. This paper addresses the problem of robustly matching such subspaces (computing the similarity between them) when they are used to describe the scope of variations within sets of images of different (possibly greatly so) scales. A naïve solution of projecting the low-scale subspace into the high-scale image space is described first and subsequently shown to be inadequate, especially at large scale discrepancies. A successful approach is proposed instead. It consists of (i) an interpolated projection of the low-scale subspace into the high-scale space, which is followed by (ii) a rotation of this initial estimate within the bounds of the imposed "downsampling constraint". The optimal rotation is found in the closed-form which best aligns the high-scale reconstruction of the low-scale subspace with the reference it is compared to. The method is evaluated on the problem of matching sets of (i) face appearances under varying illumination and (ii) object appearances under varying viewpoint, using two large data sets. In comparison to the naïve matching, the proposed algorithm is shown to greatly increase the separation of between-class and within-class similarities, as well as produce far more meaningful modes of common appearance on which the match score is based. © 2014 Elsevier Ltd.
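
The underlying task of computing similarity between two linear subspaces is conventionally done through the cosines of their principal angles. The sketch below shows only that generic step; the paper's contribution (interpolated projection of the low-scale subspace plus a constrained rotation) sits on top of it:

```python
import numpy as np

def subspace_similarity(U1, U2):
    """Similarity between two subspaces with orthonormal basis columns
    U1, U2: mean cosine of the principal angles, obtained from the
    singular values of U1.T @ U2."""
    s = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return float(np.clip(s, 0.0, 1.0).mean())

def orthonormal_basis(vectors):
    """Orthonormal basis (QR) for the span of the given column vectors."""
    Q, _ = np.linalg.qr(np.asarray(vectors, float))
    return Q

# Two planes in R^3 sharing the x-axis: one principal angle is 0,
# the other is 90 degrees.
A = orthonormal_basis(np.array([[1.0, 0], [0, 1], [0, 0]]))  # xy-plane
B = orthonormal_basis(np.array([[1.0, 0], [0, 0], [0, 1]]))  # xz-plane
sim_AB = subspace_similarity(A, B)
sim_AA = subspace_similarity(A, A)
```

Identical subspaces score 1.0; the shared-axis planes score 0.5, the mean of cos(0) and cos(90°).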

Relevance:

90.00%

Publisher:

Abstract:

Monitoring marine objects is important for understanding the marine ecosystem and evaluating the impact of environmental changes. One prerequisite of monitoring is to identify targets of interest. Traditionally, the target objects are recognized by trained scientists through towed nets and human observation, which incurs considerable cost and poses risks to both operators and marine creatures. In comparison, a noninvasive approach of setting up a camera and seeking objects in images is more promising. In this paper, a novel technique of object detection in images is presented, which is applicable to generic objects. A robust background modelling algorithm is proposed to extract foregrounds, and blob features are then introduced to classify the foregrounds. Particular marine objects, the box jellyfish and sea snake, are successfully detected in our work. Experiments conducted on image datasets collected by the Australian Institute of Marine Science (AIMS) demonstrate the effectiveness of the proposed technique.
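
The background-modelling-then-blob-features pipeline can be sketched with the simplest possible background model, a running average; the paper's actual algorithm is more robust, so treat this purely as an illustration of the pipeline's shape:

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Running-average background model -- an illustrative stand-in for
    the robust background modelling algorithm in the abstract."""
    return (1 - alpha) * bg + alpha * frame

def extract_foreground(bg, frame, thresh=0.2):
    """Foreground mask: pixels deviating from the background model."""
    return np.abs(frame - bg) > thresh

# Learn a static, empty seabed scene, then a bright blob (the target
# object) appears in a new frame.
bg = np.zeros((6, 6))
for _ in range(20):
    bg = update_background(bg, np.zeros((6, 6)))
frame = np.zeros((6, 6))
frame[2:4, 2:4] = 1.0                  # the object
mask = extract_foreground(bg, frame)
blob_area = int(mask.sum())            # crude blob feature: area
```

Features computed on the extracted blobs (area, shape, motion) then feed the foreground classifier that separates, say, box jellyfish from drifting debris.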

Relevance:

80.00%

Publisher:

Abstract:

While the primary purpose of edge detection schemes is to be able to produce an edge map of a given image, the ability to distinguish between different feature types is also of importance. In this paper we examine feature classification based on local energy detection and show that local energy measures are intrinsically capable of making this classification because of the use of odd and even filters. The advantage of feature classification is that it allows for the elimination of certain feature types from the edge map, thus simplifying the task of object recognition.
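
A 1-D local energy sketch: sum the squared responses of an even/odd filter pair, and compare the two responses to classify the feature. The pair used here (a zero-DC Gaussian and its first derivative) is only an approximate quadrature pair, chosen for brevity:

```python
import numpy as np

def local_energy(signal, sigma=2.0, width=9):
    """1-D local energy E = even_response^2 + odd_response^2 from an
    even/odd filter pair.  The even/odd response ratio distinguishes
    line-like features (even dominates) from edges (odd dominates)."""
    x = np.arange(width) - width // 2
    even = np.exp(-x**2 / (2 * sigma**2))   # even (symmetric) filter
    odd = -x * even                         # odd (antisymmetric) filter
    even -= even.mean()                     # remove DC from the even filter
    e = np.convolve(signal, even, mode="same")
    o = np.convolve(signal, odd, mode="same")
    return e**2 + o**2, e, o

# A step edge: the odd response should dominate at the edge location.
step = np.concatenate([np.zeros(20), np.ones(20)])
energy, e_resp, o_resp = local_energy(step)
peak = 5 + int(np.argmax(energy[5:-5]))   # ignore convolution borders
```

Because the classification falls out of the responses already computed for detection, feature types can be filtered from the edge map at essentially no extra cost, as the abstract notes.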

Relevance:

80.00%

Publisher:

Abstract:

The use of interaction signatures to recognize objects without considering the object's physical structure is discussed. Without object recognition, smart homes cannot make full use of video cameras because vision systems cannot provide object-related context to the human activities monitored. One important advantage of interaction signatures is that people frequently and repeatedly interact with household objects, so the system can build evidence for object locations and labels.

Relevance:

80.00%

Publisher:

Abstract:

This paper describes an investigation into the use of parametric 2D models describing the movement of edges for the determination of possible 3D shape, and hence function, of an object. An assumption of this research is that the camera can foveate and track particular features. It is argued that simple 2D analytic descriptions of the movement of edges can infer 3D shape while the camera is moved. This exploits an advantage of foveation, i.e. the problem becomes object-centred. The problem of correspondence for numerous edge points is overcome by the use of a tree-based representation for the competing hypotheses. Numerous hypotheses are maintained simultaneously, so the method does not rely on a single kinematic model that assumes constant velocity or acceleration. The numerous advantages of this strategy are described.
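
The benefit of keeping several kinematic hypotheses alive, rather than committing to one model, can be sketched in one dimension: score each model by accumulated prediction error and let the data decide. The two models and the flat error dictionary below are a deliberate simplification of the paper's tree of hypotheses:

```python
def const_velocity(h):
    """Predict the next position assuming constant velocity."""
    return 2 * h[-1] - h[-2]

def const_acceleration(h):
    """Predict the next position assuming constant acceleration."""
    v1, v0 = h[-1] - h[-2], h[-2] - h[-3]
    return h[-1] + v1 + (v1 - v0)

def track(observations, models):
    """Keep every kinematic hypothesis alive, accumulating each model's
    prediction error instead of committing to a single model up front."""
    errors = {name: 0.0 for name in models}
    for t in range(3, len(observations)):
        history = observations[:t]
        for name, model in models.items():
            errors[name] += abs(model(history) - observations[t])
    return errors

# An edge point accelerating across the image: x(t) = t**2.
obs = [t ** 2 for t in range(8)]
errors = track(obs, {"const_vel": const_velocity,
                     "const_acc": const_acceleration})
best = min(errors, key=errors.get)
```

A single constant-velocity tracker would systematically mispredict here; maintaining both hypotheses lets the accelerating model win on evidence alone.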

Relevance:

80.00%

Publisher:

Abstract:

This paper describes a general-purpose flexible technique which uses physical modelling techniques for determining the features of a 3D object that are visible from any predefined view. Physical modelling techniques are used to determine which of many different types of features are visible from a complete set of viewpoints. The power of this technique lies in its ability to detect and parameterise object features, regardless of object complexity. Raytracing is used to simulate the physical process by which object features become visible, so that surface properties (e.g. specularity, transparency) as well as object boundaries can be used in the recognition process. Using this technique, occluding and non-occluding edge-based features are extracted using image processing techniques and then parameterised. Features caused by specularity are also extracted, and qualitative descriptions for these are defined.
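
The paper simulates visibility by raytracing; the much simpler back-face test below conveys only the basic idea of determining which features of a known 3D model are visible from a predefined viewpoint (it ignores occlusion, specularity and transparency, which are exactly what raytracing adds):

```python
import numpy as np

# Faces of the unit cube: (name, outward normal, face centre).
FACES = [
    ("+x", np.array([1.0, 0, 0]), np.array([1.0, 0.5, 0.5])),
    ("-x", np.array([-1.0, 0, 0]), np.array([0.0, 0.5, 0.5])),
    ("+y", np.array([0, 1.0, 0]), np.array([0.5, 1.0, 0.5])),
    ("-y", np.array([0, -1.0, 0]), np.array([0.5, 0.0, 0.5])),
    ("+z", np.array([0, 0, 1.0]), np.array([0.5, 0.5, 1.0])),
    ("-z", np.array([0, 0, -1.0]), np.array([0.5, 0.5, 0.0])),
]

def visible_faces(viewpoint):
    """A face of a convex object is visible when its outward normal has
    a positive component toward the viewpoint -- back-face culling,
    standing in here for the full raytracing simulation."""
    v = np.asarray(viewpoint, float)
    return [name for name, n, c in FACES if np.dot(n, v - c) > 0]

vis = visible_faces([5.0, 0.5, 0.5])   # camera far out along +x
```

Running such a test (or, in the paper, the raytracer) over a complete set of viewpoints yields the per-view feature catalogue used for recognition.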

Relevance:

80.00%

Publisher:

Abstract:

The identification of useful structures in home video is difficult because this class of video is distinguished from other video sources by its unrestricted, unedited content and the absence of a regulated storyline. In addition, home videos contain a lot of motion and erratic camera movements, with shots of the same character being captured from various angles and viewpoints. In this paper, we present a solution to the challenging problem of clustering shots and faces in home videos, based on the use of SIFT features. SIFT features are known to be robust for object recognition; however, in dealing with the complexities of the home video setting, the matching process needs to be augmented and adapted. This paper describes various techniques that can improve the number of matches returned as well as the correctness of matches. For example, existing methods for verification of matches are inadequate when a small number of matches are returned, a common situation in home videos. We address this by constructing a robust classifier that works on matching sets instead of individual matches, allowing the exploitation of the geometric constraints between matches. Finally, we propose techniques for robustly extracting target clusters from individual feature matches.
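
The raw matching step that the paper augments is conventionally Lowe's ratio test on descriptor sets. The sketch below uses synthetic descriptors in place of real SIFT output; the paper's contribution is what happens next, classifying the whole matching set jointly rather than accepting each match in isolation:

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Lowe-style ratio test between two descriptor arrays (one row per
    feature): accept a match only when the nearest neighbour is clearly
    closer than the second nearest."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = int(order[0]), int(order[1])
        if dists[best] < ratio * dists[second]:
            matches.append((i, best))
    return matches

# Synthetic 8-D descriptors: rows 0..3 of desc_b reappear, slightly
# perturbed, as the query set -- standing in for two shots of one face.
desc_b = 5.0 * np.eye(10, 8)
desc_a = desc_b[:4] + 0.01
matches = ratio_test_matches(desc_a, desc_b)
```

When only a handful of matches survive this filter, as is common in home video, verifying them individually is unreliable, which motivates the paper's set-level classifier with geometric constraints between matches.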