112 resultados para Optical pattern recognition.


Relevância:

80.00% 80.00%

Publicador:

Resumo:

We present a model for early vision tasks such as denoising, super-resolution, deblurring, and demosaicing. The model provides a resolution-independent representation of discrete images which admits a truly rotationally invariant prior. The model generalizes several existing approaches: variational methods, finite element methods, and discrete random fields. The primary contribution is a novel energy functional which has not previously been written down, which combines the discrete measurements from pixels with a continuous-domain world viewed through continous-domain point-spread functions. The value of the functional is that simple priors (such as total variation and generalizations) on the continous-domain world become realistic priors on the sampled images. We show that despite its apparent complexity, optimization of this model depends on just a few computational primitives, which although tedious to derive, can now be reused in many domains. We define a set of optimization algorithms which greatly overcome the apparent complexity of this model, and make possible its practical application. New experimental results include infinite-resolution upsampling, and a method for obtaining subpixel superpixels. © 2012 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In the modern and dynamic construction environment it is important to access information in a fast and efficient manner in order to improve the decision making processes for construction managers. This capability is, in most cases, straightforward with today’s technologies for data types with an inherent structure that resides primarily on established database structures like estimating and scheduling software. However, previous research has demonstrated that a significant percentage of construction data is stored in semi-structured or unstructured data formats (text, images, etc.) and that manually locating and identifying such data is a very hard and time-consuming task. This paper focuses on construction site image data and presents a novel image retrieval model that interfaces with established construction data management structures. This model is designed to retrieve images from related objects in project models or construction databases using location, date, and material information (extracted from the image content with pattern recognition techniques).

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Statistical approaches for building non-rigid deformable models, such as the Active Appearance Model (AAM), have enjoyed great popularity in recent years, but typically require tedious manual annotation of training images. In this paper, a learning based approach for the automatic annotation of visually deformable objects from a single annotated frontal image is presented and demonstrated on the example of automatically annotating face images that can be used for building AAMs for fitting and tracking. This approach employs the idea of initially learning the correspondences between landmarks in a frontal image and a set of training images with a face in arbitrary poses. Using this learner, virtual images of unseen faces at any arbitrary pose for which the learner was trained can be reconstructed by predicting the new landmark locations and warping the texture from the frontal image. View-based AAMs are then built from the virtual images and used for automatically annotating unseen images, including images of different facial expressions, at any random pose within the maximum range spanned by the virtually reconstructed images. The approach is experimentally validated by automatically annotating face images from three different databases. © 2009 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper is about detecting bipedal motion in video sequences by using point trajectories in a framework of classification. Given a number of point trajectories, we find a subset of points which are arising from feet in bipedal motion by analysing their spatio-temporal correlation in a pairwise fashion. To this end, we introduce probabilistic trajectories as our new features which associate each point over a sufficiently long time period in the presence of noise. They are extracted from directed acyclic graphs whose edges represent temporal point correspondences and are weighted with their matching probability in terms of appearance and location. The benefit of the new representation is that it practically tolerates inherent ambiguity for example due to occlusions. We then learn the correlation between the motion of two feet using the probabilistic trajectories in a decision forest classifier. The effectiveness of the algorithm is demonstrated in experiments on image sequences captured with a static camera, and extensions to deal with a moving camera are discussed. © 2013 Elsevier B.V. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This work addresses the challenging problem of unconstrained 3D human pose estimation (HPE) from a novel perspective. Existing approaches struggle to operate in realistic applications, mainly due to their scene-dependent priors, such as background segmentation and multi-camera network, which restrict their use in unconstrained environments. We therfore present a framework which applies action detection and 2D pose estimation techniques to infer 3D poses in an unconstrained video. Action detection offers spatiotemporal priors to 3D human pose estimation by both recognising and localising actions in space-time. Instead of holistic features, e.g. silhouettes, we leverage the flexibility of deformable part model to detect 2D body parts as a feature to estimate 3D poses. A new unconstrained pose dataset has been collected to justify the feasibility of our method, which demonstrated promising results, significantly outperforming the relevant state-of-the-arts. © 2013 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper presents a complete system for expressive visual text-to-speech (VTTS), which is capable of producing expressive output, in the form of a 'talking head', given an input text and a set of continuous expression weights. The face is modeled using an active appearance model (AAM), and several extensions are proposed which make it more applicable to the task of VTTS. The model allows for normalization with respect to both pose and blink state which significantly reduces artifacts in the resulting synthesized sequences. We demonstrate quantitative improvements in terms of reconstruction error over a million frames, as well as in large-scale user studies, comparing the output of different systems. © 2013 IEEE.