994 results for Input image


Relevance: 60.00%

Abstract:

Estimation of 3D hand pose is useful in many gesture recognition applications, ranging from human-computer interaction to automated recognition of sign languages. In this paper, 3D hand pose estimation is treated as a database indexing problem. Given an input image of a hand, the most similar images in a large database of hand images are retrieved. The hand pose parameters of the retrieved images are used as estimates for the hand pose in the input image. Lipschitz embeddings of edge images into a Euclidean space are used to improve the efficiency of database retrieval. In order to achieve interactive retrieval times, similarity queries are initially performed in this Euclidean space. The paper describes ongoing work that focuses on how to best choose reference images, in order to improve retrieval accuracy.
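
As a minimal sketch of the retrieval scheme described above, assuming an image-to-image distance function (for example a chamfer distance between edge images) is available; the function names and the plain Euclidean nearest-neighbour search are illustrative, not the paper's exact implementation:

```python
import numpy as np

def embed(image, references, dist):
    """Lipschitz embedding: represent an image by its vector of
    distances to a fixed set of reference images."""
    return np.array([dist(image, r) for r in references])

def build_index(database, references, dist):
    return np.stack([embed(img, references, dist) for img in database])

def retrieve(query, references, index, dist, k=10):
    """Answer the similarity query in the cheap Euclidean embedding space."""
    q = embed(query, references, dist)
    order = np.argsort(np.linalg.norm(index - q, axis=1))
    return order[:k]  # indices of the k most similar database images
```

How the reference images are chosen is exactly the open question the abstract mentions; a random subset of the database is the simplest baseline.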

Relevance: 60.00%

Abstract:

Ongoing work towards appearance-based 3D hand pose estimation from a single image is presented. A large database of synthetic hand views is generated using a 3D hand model and computer graphics. The views display different hand shapes as seen from arbitrary viewpoints. Each synthetic view is automatically labeled with parameters describing its hand shape and viewing parameters. Given an input image, the system retrieves the most similar database views, and uses the shape and viewing parameters of those views as candidate estimates for the parameters of the input image. Preliminary results are presented, in which appearance-based similarity is defined in terms of the chamfer distance between edge images.
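
For illustration, a common formulation of the (symmetric) chamfer distance between binary edge images, written with SciPy's Euclidean distance transform; the exact variant used in the paper may differ:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def directed_chamfer(edges_a, edges_b):
    """Mean distance from each edge pixel of A to the nearest edge pixel of B.
    Inputs are boolean arrays where True marks an edge pixel."""
    dist_to_b = distance_transform_edt(~edges_b)   # distance to nearest B edge
    return dist_to_b[edges_a].mean()

def chamfer_distance(edges_a, edges_b):
    return 0.5 * (directed_chamfer(edges_a, edges_b) +
                  directed_chamfer(edges_b, edges_a))
```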

Relevance: 60.00%

Abstract:

An appearance-based framework for 3D hand shape classification and simultaneous camera viewpoint estimation is presented. Given an input image of a segmented hand, the most similar matches from a large database of synthetic hand images are retrieved. The ground truth labels of those matches, containing hand shape and camera viewpoint information, are returned by the system as estimates for the input image. Database retrieval is done hierarchically, by first quickly rejecting the vast majority of all database views, and then ranking the remaining candidates in order of similarity to the input. Four different similarity measures are employed, based on edge location, edge orientation, finger location and geometric moments.
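
The reject-then-rank structure can be sketched as follows; the cheap measure, the number of surviving candidates and the rank-sum combination of the finer measures are illustrative assumptions rather than the system's actual choices:

```python
import numpy as np

def hierarchical_retrieve(query, database, cheap_dist, fine_dists, keep=500, k=10):
    """Quickly reject most database views with a cheap measure, then rank
    the survivors by combining several finer similarity measures."""
    coarse = np.array([cheap_dist(query, v) for v in database])
    survivors = np.argsort(coarse)[:keep]
    rank_sum = np.zeros(len(survivors))
    for dist in fine_dists:   # e.g. edge location, edge orientation, moments
        scores = np.array([dist(query, database[i]) for i in survivors])
        rank_sum += np.argsort(np.argsort(scores))   # rank of each survivor
    return survivors[np.argsort(rank_sum)[:k]]
```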

Relevance: 60.00%

Abstract:

A learning-based framework is proposed for estimating human body pose from a single image. Given a differentiable function that maps from pose space to image feature space, the goal is to invert the process: estimate the pose given only image features. The inversion is an ill-posed problem because the inverse mapping is one-to-many. Hence multiple solutions exist, and it is desirable to restrict the solution space to a smaller subset of feasible solutions. For example, not all human body poses are feasible due to anthropometric constraints. Since the space of feasible solutions may not admit a closed-form description, the proposed framework exploits machine learning techniques to learn an approximation that is smoothly parameterized over such a space. One such technique is Gaussian Process Latent Variable Modelling. Scaled conjugate gradient is then used to find the best matching pose in the space of feasible solutions for a given input image. The formulation allows easy incorporation of various constraints, e.g. temporal consistency and anthropometric constraints. The performance of the proposed approach is evaluated on the task of upper-body pose estimation from silhouettes and compared with the Specialized Mapping Architecture. In experiments with synthetic data, the estimation accuracy of the Specialized Mapping Architecture is at least one standard deviation worse than that of the proposed approach. In experiments with real video of humans performing gestures, the proposed approach produces qualitatively better estimation results.
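
A loose sketch of the inversion step: search the learned latent space for the point whose decoded image features best match the observation, then decode that point into a pose. The `decode_features` and `decode_pose` callables stand in for what a trained GPLVM-style model would provide, and SciPy's conjugate-gradient optimizer substitutes for scaled conjugate gradient:

```python
import numpy as np
from scipy.optimize import minimize

def estimate_pose(obs_features, decode_features, decode_pose, x0):
    """Find the latent point whose predicted features match the observed
    silhouette features, then map it to a body pose."""
    def objective(x):
        diff = decode_features(x) - obs_features
        return float(diff @ diff)          # squared feature-space error
    result = minimize(objective, x0, method="CG")
    return decode_pose(result.x)
```

Temporal consistency can be incorporated by adding a penalty on the distance between the current latent point and the previous frame's estimate inside the objective.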

Relevance: 60.00%

Abstract:

Visual salience is an intriguing phenomenon observed in biological neural systems. Numerous attempts have been made to model visual salience mathematically using various feature contrasts, either local or global. However, these algorithmic models tend to ignore the problem's biological solutions, in which visual salience appears to arise during the propagation of visual stimuli along the visual cortex. In this paper, inspired by the conjecture that salience arises from deep propagation along the visual cortex, we present a Deep Salience model: a multi-layer model based on successive Markov random fields (sMRF) that analyzes the input image successively through deep belief propagation. As a result, the foreground object can be automatically separated from the background in a fully unsupervised way. Experimental evaluation on the benchmark dataset validated that our Deep Salience model consistently outperforms eleven state-of-the-art salience models, yielding higher rates in the precision-recall tests and attaining the best F-measure and mean-square error in the experiments.

Relevance: 60.00%

Abstract:

A new class of shape features for region classification and high-level recognition is introduced. The novel Randomised Region Ray (RRR) features can be used to train binary decision trees for object category classification using an abstract representation of the scene. In particular, we address the problem of human detection using an over-segmented input image. We therefore do not rely on pixel values for training; instead, we design and train specialised classifiers on the sparse set of semantic regions that compose the image. Thanks to the abstract nature of the input, the trained classifier has the potential to be fast and applicable to extreme imagery conditions. We demonstrate and evaluate its performance on people detection using a pedestrian dataset.
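
A toy illustration of ray-style features computed over a region-label map; the actual Randomised Region Ray definition (what is randomised and what each ray records) may differ from this sketch:

```python
import numpy as np

def region_ray_features(labels, seed_rc, n_rays=16, max_steps=200, rng=None):
    """From a seed pixel, step along random directions and record how far
    one travels before leaving the seed pixel's region."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = labels.shape
    r0, c0 = seed_rc
    home = labels[r0, c0]
    feats = []
    for theta in rng.uniform(0.0, 2.0 * np.pi, n_rays):
        dr, dc = np.sin(theta), np.cos(theta)
        step = 0
        while step < max_steps:
            r, c = int(round(r0 + step * dr)), int(round(c0 + step * dc))
            if not (0 <= r < h and 0 <= c < w) or labels[r, c] != home:
                break
            step += 1
        feats.append(step)
    return np.array(feats, dtype=float)
```

Fixed-length vectors of this kind can then be fed to a binary decision tree classifier (e.g. sklearn.tree.DecisionTreeClassifier) for the region-level human/non-human decision.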

Relevance: 60.00%

Abstract:

In this paper, we present a 3D face photography system based on a facial expression training dataset composed of both facial range images (3D geometry) and facial texture (2D photography). The proposed system allows one to obtain a 3D geometry representation of a given face provided as a 2D photograph, which undergoes a series of transformations through the estimated texture and geometry spaces. In the training phase of the system, the facial landmarks are obtained by an active shape model (ASM) extracted from the 2D gray-level photographs. Principal component analysis (PCA) is then used to represent the face dataset, thus defining an orthonormal basis of texture and another of geometry. In the reconstruction phase, an input face image is given, to which the ASM is matched. The extracted facial landmarks and the face image are fed to the PCA basis transform, and a 3D version of the 2D input image is built. Experimental tests using a new dataset of 70 facial expressions belonging to ten subjects as the training set show that 3D faces are reconstructed rapidly and maintain spatial coherence consistent with human perception, corroborating the efficiency and applicability of the proposed system.
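
The PCA basis-transform step can be pictured roughly as below. Projecting onto an orthonormal basis and re-synthesising from coefficients is standard; the direct reuse of texture coefficients in the geometry basis is only a placeholder for the texture-to-geometry mapping the system actually learns from its paired training data:

```python
import numpy as np

def project(x, mean, basis):
    """Coefficients of an observation in an orthonormal PCA basis
    (columns of `basis` are principal components)."""
    return basis.T @ (x - mean)

def reconstruct_geometry(texture_vec, tex_mean, tex_basis, geo_mean, geo_basis):
    # Placeholder coupling between the two spaces; the real system derives
    # this mapping from the paired texture/geometry training examples.
    coeffs = project(texture_vec, tex_mean, tex_basis)
    return geo_mean + geo_basis @ coeffs
```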

Relevance: 60.00%

Abstract:

This work describes a novel methodology for automatic contour extraction from 2D images of 3D neurons (e.g. camera lucida images and other types of 2D microscopy). Most contour-based shape analysis methods cannot be used to characterize such cells because of overlaps between neuronal processes. The proposed framework is specifically aimed at the problem of contour following even in the presence of multiple overlaps. First, the input image is preprocessed in order to obtain an 8-connected skeleton with one-pixel-wide branches, as well as a set of critical regions (i.e., bifurcations and crossings). Next, for each subtree, the tracking stage iteratively labels all valid branch pixels, from tip to critical region, where it determines the suitable direction to proceed. Finally, the labeled skeleton segments are followed in order to yield the parametric contour of the neuronal shape under analysis. The reported system was successfully tested on several images, and the results from a set of three neuron images are presented here, each pertaining to a different class, i.e. alpha, delta and epsilon ganglion cells, containing a total of 34 crossings. The algorithms successfully resolved all of these overlaps. The method has also been found to exhibit robustness even for images with close parallel segments. The proposed method is robust and may be implemented in an efficient manner. The introduction of this approach should pave the way for more systematic application of contour-based shape analysis methods in neuronal morphology. (C) 2008 Elsevier B.V. All rights reserved.
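
The preprocessing stage can be approximated with off-the-shelf tools, for example scikit-image's skeletonization plus a neighbour count to flag candidate critical regions; this sketches only the first step, not the tracking or contour-following stages:

```python
import numpy as np
from skimage.morphology import skeletonize

def skeleton_and_critical_points(binary_neuron):
    """One-pixel-wide skeleton plus candidate bifurcations/crossings,
    flagged as skeleton pixels with three or more skeleton neighbours."""
    skel = skeletonize(binary_neuron)
    padded = np.pad(skel, 1).astype(int)
    # Count 8-connected skeleton neighbours of every pixel.
    neighbours = sum(np.roll(np.roll(padded, dr, axis=0), dc, axis=1)
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)) - padded
    critical = skel & (neighbours[1:-1, 1:-1] >= 3)
    return skel, critical
```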

Relevance: 60.00%

Abstract:

Dimensional defects in aluminum die-castings are widespread throughout the foundry industry, and their detection is of paramount importance in maintaining product quality. Due to the unpredictable factory environment and the highly reflective metallic nature of aluminum die-castings, it is extremely hard to estimate the true dimensions of a die-casting autonomously. In this work, we propose a novel, robust 3D reconstruction algorithm capable of reconstructing dimensionally accurate 3D depth models of aluminum die-castings. The developed system is very simple and cost effective, as it consists of only a stereo camera pair and a simple fluorescent light. The developed system is capable of estimating surface depths within a tolerance of 1.5 mm. Moreover, the system is invariant to illumination variations and to the orientation of the objects in the input image space, which makes it highly robust. Due to its hardware simplicity and robustness, it can be deployed in different factory environments without significant changes to the setup.
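
The paper's reconstruction algorithm is its own; the generic stereo disparity-to-depth step it builds on can, however, be sketched with OpenCV. The matcher settings and calibration values below are illustrative assumptions:

```python
import cv2
import numpy as np

def depth_from_stereo(left_gray, right_gray, focal_px, baseline_mm):
    """Dense depth (in mm) from a rectified grayscale stereo pair."""
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64,
                                    blockSize=7)
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    disparity[disparity <= 0] = np.nan      # mask invalid matches
    return focal_px * baseline_mm / disparity
```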

Relevance: 60.00%

Abstract:

An automatic road sign recognition system first locates road signs within images captured by an imaging sensor on board a vehicle, and then identifies the detected road signs. This paper presents an automatic neural-network-based road sign recognition system. First, a study of the existing road sign recognition research is presented. In this study, the issues associated with automatic road sign recognition are described, the existing methods developed to tackle the road sign recognition problem are reviewed, and a comparison of the features of these methods is given. Second, the developed road sign recognition system is described. The system is capable of analysing live colour road scene images, detecting multiple road signs within each image, and classifying the type of road signs detected. The system consists of two modules: detection and classification. The detection module segments the input image in the hue-saturation-intensity colour space and then detects road signs using a Multi-layer Perceptron neural network. The classification module determines the type of the detected road signs using a series of one-to-one Multi-layer Perceptron neural networks. Two sets of classifiers are trained using the Resilient-Backpropagation and Scaled-Conjugate-Gradient algorithms. The two modules of the system are evaluated individually first, and then the system is tested as a whole. The experimental results demonstrate that the system achieves an average recognition hit rate of 95.96% using the scaled-conjugate-gradient-trained classifiers.
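
For a rough sense of the two modules, the sketch below pairs a colour segmentation (HSV is used here as a stand-in for hue-saturation-intensity) with a one-vs-one bank of small MLPs; scikit-learn's MLP does not provide the Resilient-Backpropagation or scaled-conjugate-gradient trainers used in the paper, so its default solver stands in:

```python
import cv2
import numpy as np
from sklearn.neural_network import MLPClassifier

def segment_sign_colours(bgr_image):
    """Keep strongly saturated red/blue pixels typical of road signs."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    h, s, _ = cv2.split(hsv)
    red = ((h < 10) | (h > 170)) & (s > 100)
    blue = ((h > 100) & (h < 130)) & (s > 100)
    return (red | blue).astype(np.uint8) * 255

def train_pairwise_mlps(features, labels):
    """One small MLP per pair of sign classes (one-vs-one classification)."""
    classes = np.unique(labels)
    models = {}
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            mask = np.isin(labels, [a, b])
            clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000)
            models[(a, b)] = clf.fit(features[mask], labels[mask])
    return models
```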

Relevance: 60.00%

Abstract:

The problem of dimensional defects in aluminum die-castings is widespread throughout the foundry industry, and their detection is of paramount importance in maintaining product quality. Due to the unpredictable factory environment and the highly reflective metallic nature of these parts, it is extremely hard to estimate their true dimensions autonomously. Some existing vision systems are capable of estimating depth with high accuracy; however, they are heavily hardware dependent, relying on light and laser pattern projectors integrated into the vision system, or on laser scanners. Moreover, due to the reflective nature of these metallic parts and the variable factory environment, such vision systems tend to perform poorly, and their hardware dependency makes them cumbersome and costly. In this work, we propose a novel, robust 3D reconstruction algorithm capable of reconstructing dimensionally accurate 3D depth models of aluminum die-castings. The developed system is very simple and cost effective, as it consists of only a pair of stereo cameras and a diffused fluorescent light. The proposed vision system is capable of estimating surface depths within an accuracy of 0.5 mm. In addition, the system is invariant to illumination variations as well as to the orientation and location of the objects in the input image space, making it highly robust. Due to its hardware simplicity and robustness, it can be deployed in different factory environments without significant changes to the setup. The proposed system is a major part of a quality inspection system for the automotive manufacturing industry.

Relevance: 60.00%

Abstract:

A 3D binary image is considered well-composed if, and only if, the union of the faces shared by the foreground and background voxels of the image is a surface in R³. Well-composed images have some desirable topological properties, which allow us to simplify and optimize algorithms that are widely used in computer graphics, computer vision and image processing. These advantages have fostered the development of algorithms to repair two-dimensional (2D) and three-dimensional (3D) images that are not well-composed. These algorithms are known as repairing algorithms. In this dissertation, we propose two repairing algorithms, one randomized and one deterministic. Both algorithms are capable of making topological repairs in 3D binary images, producing well-composed images similar to the original ones. The key idea behind both algorithms is to iteratively change the assigned color of some points in the input image from 0 (background) to 1 (foreground) until the image becomes well-composed. The points whose colors are changed by the algorithms are chosen according to their values in the fuzzy connectivity map resulting from the image segmentation process. The use of the fuzzy connectivity map ensures that the subset of points chosen by the algorithm at any given iteration is the one with the least affinity with the background among all possible choices.
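
The key idea can be pictured as a greedy loop like the one below; the `is_well_composed` test and the unrestricted candidate set are assumptions standing in for the dissertation's actual critical-configuration tests and selection rules:

```python
import numpy as np

def repair(image, fuzzy_map, is_well_composed):
    """Flip background points to foreground, highest fuzzy connectivity
    first, until the binary image becomes well-composed."""
    img = image.copy()
    while not is_well_composed(img):
        candidates = (img == 0)
        if not candidates.any():
            break
        scores = np.where(candidates, fuzzy_map, -np.inf)
        img[np.unravel_index(np.argmax(scores), img.shape)] = 1
    return img
```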

Relevance: 60.00%

Abstract:

Graduate Program in Computer Science - IBILCE

Relevance: 60.00%

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)