916 resultados para Computer vision


Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper, we aim to reconstruct free-from 3D models from a single view by learning the prior knowledge of a specific class of objects. Instead of heuristically proposing specific regularities and defining parametric models as previous research, our shape prior is learned directly from existing 3D models under a framework based on the Gaussian Process Latent Variable Model (GPLVM). The major contributions of the paper include: 1) a probabilistic framework for prior-based reconstruction we propose, which requires no heuristic of the object, and can be easily generalized to handle various categories of 3D objects, and 2) an attempt at automatic reconstruction of more complex 3D shapes, like human bodies, from 2D silhouettes only. Qualitative and quantitative experimental results on both synthetic and real data demonstrate the efficacy of our new approach. ©2009 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents a new online multi-classifier boosting algorithm for learning object appearance models. In many cases the appearance model is multi-modal, which we capture by training and updating multiple strong classifiers. The proposed algorithm jointly learns the classifiers and a soft partitioning of the input space, defining an area of expertise for each classifier. We show how this formulation improves the specificity of the strong classifiers, allowing simultaneous location and pose estimation in a tracking task. The proposed online scheme iteratively adapts the classifiers during tracking. Experiments show that the algorithm successfully learns multi-modal appearance models during a short initial training phase, subsequently updating them for tracking an object under rapid appearance changes. © 2010 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents an incremental learning solution for Linear Discriminant Analysis (LDA) and its applications to object recognition problems. We apply the sufficient spanning set approximation in three steps i.e. update for the total scatter matrix, between-class scatter matrix and the projected data matrix, which leads an online solution which closely agrees with the batch solution in accuracy while significantly reducing the computational complexity. The algorithm yields an efficient solution to incremental LDA even when the number of classes as well as the set size is large. The incremental LDA method has been also shown useful for semi-supervised online learning. Label propagation is done by integrating the incremental LDA into an EM framework. The method has been demonstrated in the task of merging large datasets which were collected during MPEG standardization for face image retrieval, face authentication using the BANCA dataset, and object categorisation using the Caltech101 dataset. © 2010 Springer Science+Business Media, LLC.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a novel, implementation friendly and occlusion aware semi-supervised video segmentation algorithm using tree structured graphical models, which delivers pixel labels alongwith their uncertainty estimates. Our motivation to employ supervision is to tackle a task-specific segmentation problem where the semantic objects are pre-defined by the user. The video model we propose for this problem is based on a tree structured approximation of a patch based undirected mixture model, which includes a novel time-series and a soft label Random Forest classifier participating in a feedback mechanism. We demonstrate the efficacy of our model in cutting out foreground objects and multi-class segmentation problems in lengthy and complex road scene sequences. Our results have wide applicability, including harvesting labelled video data for training discriminative models, shape/pose/articulation learning and large scale statistical analysis to develop priors for video segmentation. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a video-based system which interactively captures the geometry of a 3D object in the form of a point cloud, then recognizes and registers known objects in this point cloud in a matter of seconds (fig. 1). In order to achieve interactive speed, we exploit both efficient inference algorithms and parallel computation, often on a GPU. The system can be broken down into two distinct phases: geometry capture, and object inference. We now discuss these in further detail. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents a method for vote-based 3D shape recognition and registration, in particular using mean shift on 3D pose votes in the space of direct similarity transforms for the first time. We introduce a new distance between poses in this spacethe SRT distance. It is left-invariant, unlike Euclidean distance, and has a unique, closed-form mean, in contrast to Riemannian distance, so is fast to compute. We demonstrate improved performance over the state of the art in both recognition and registration on a real and challenging dataset, by comparing our distance with others in a mean shift framework, as well as with the commonly used Hough voting approach. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a novel framework for identifying and tracking dominant agents in groups. Our proposed approach relies on a causality detection scheme that is capable of ranking agents with respect to their contribution in shaping the system's collective behaviour based exclusively on the agents' observed trajectories. Further, the reasoning paradigm is made robust to multiple emissions and clutter by employing a class of recently introduced Markov chain Monte Carlo-based group tracking methods. Examples are provided that demonstrate the strong potential of the proposed scheme in identifying actual leaders in swarms of interacting agents and moving crowds. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We propose a novel model for the spatio-temporal clustering of trajectories based on motion, which applies to challenging street-view video sequences of pedestrians captured by a mobile camera. A key contribution of our work is the introduction of novel probabilistic region trajectories, motivated by the non-repeatability of segmentation of frames in a video sequence. Hierarchical image segments are obtained by using a state-of-the-art hierarchical segmentation algorithm, and connected from adjacent frames in a directed acyclic graph. The region trajectories and measures of confidence are extracted from this graph using a dynamic programming-based optimisation. Our second main contribution is a Bayesian framework with a twofold goal: to learn the optimal, in a maximum likelihood sense, Random Forests classifier of motion patterns based on video features, and construct a unique graph from region trajectories of different frames, lengths and hierarchical levels. Finally, we demonstrate the use of Isomap for effective spatio-temporal clustering of the region trajectories of pedestrians. We support our claims with experimental results on new and existing challenging video sequences. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper tackles the novel challenging problem of 3D object phenotype recognition from a single 2D silhouette. To bridge the large pose (articulation or deformation) and camera viewpoint changes between the gallery images and query image, we propose a novel probabilistic inference algorithm based on 3D shape priors. Our approach combines both generative and discriminative learning. We use latent probabilistic generative models to capture 3D shape and pose variations from a set of 3D mesh models. Based on these 3D shape priors, we generate a large number of projections for different phenotype classes, poses, and camera viewpoints, and implement Random Forests to efficiently solve the shape and pose inference problems. By model selection in terms of the silhouette coherency between the query and the projections of 3D shapes synthesized using the galleries, we achieve the phenotype recognition result as well as a fast approximate 3D reconstruction of the query. To verify the efficacy of the proposed approach, we present new datasets which contain over 500 images of various human and shark phenotypes and motions. The experimental results clearly show the benefits of using the 3D priors in the proposed method over previous 2D-based methods. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a multispectral photometric stereo method for capturing geometry of deforming surfaces. A novel photometric calibration technique allows calibration of scenes containing multiple piecewise constant chromaticities. This method estimates per-pixel photometric properties, then uses a RANSAC-based approach to estimate the dominant chromaticities in the scene. A likelihood term is developed linking surface normal, image intensity and photometric properties, which allows estimating the number of chromaticities present in a scene to be framed as a model estimation problem. The Bayesian Information Criterion is applied to automatically estimate the number of chromaticities present during calibration. A two-camera stereo system provides low resolution geometry, allowing the likelihood term to be used in segmenting new images into regions of constant chromaticity. This segmentation is carried out in a Markov Random Field framework and allows the correct photometric properties to be used at each pixel to estimate a dense normal map. Results are shown on several challenging real-world sequences, demonstrating state-of-the-art results using only two cameras and three light sources. Quantitative evaluation is provided against synthetic ground truth data. © 2011 IEEE.