920 results for Computer vision teaching


Relevance:

80.00%

Publisher:

Abstract:

Advances in the development of computer vision, miniature Micro-Electro-Mechanical Systems (MEMS) and Wireless Sensor Networks (WSN) offer intriguing possibilities that can radically alter the paradigms underlying existing methods of condition assessment and monitoring of ageing civil engineering infrastructure. This paper describes some of the outcomes of the European Science Foundation project "Micro-Measurement and Monitoring System for Ageing Underground Infrastructures (Underground M3)". The main aim of the project was to develop a system that uses a tiered approach to monitor the degree and rate of tunnel deterioration. The system comprises (1) Tier 1: Micro-detection using advances in computer vision and (2) Tier 2: Micro-monitoring and communication using advances in MEMS and WSN. These potentially low-cost technologies will be able to reduce costs associated with end-of-life structures, which is essential to the viability of rehabilitation, repair and reuse. The paper describes the actual deployment and testing of these innovative monitoring tools in tunnels of London Underground, Prague Metro and Barcelona Metro. © 2012 Taylor & Francis Group.

Relevance:

80.00%

Publisher:

Abstract:

Localization of chess-board vertices is a common task in computer vision, underpinning many applications, but relatively little work focusses on designing a specific feature detector that is fast, accurate and robust. In this paper the 'Chess-board Extraction by Subtraction and Summation' (ChESS) feature detector, designed to exclusively respond to chess-board vertices, is presented. The method proposed is robust against noise, poor lighting and poor contrast, requires no prior knowledge of the extent of the chess-board pattern, is computationally very efficient, and provides a strength measure of detected features. Such a detector has significant application both in the key field of camera calibration and in Structured Light 3D reconstruction. Evidence is presented showing its robustness, accuracy, and efficiency in comparison to other commonly used detectors, both under simulation and in experimental 3D reconstruction of flat plate and cylindrical objects.

Relevance:

80.00%

Publisher:

Abstract:

Statistical approaches for building non-rigid deformable models, such as the Active Appearance Model (AAM), have enjoyed great popularity in recent years, but typically require tedious manual annotation of training images. In this paper, a learning-based approach for the automatic annotation of visually deformable objects from a single annotated frontal image is presented and demonstrated on the example of automatically annotating face images that can be used for building AAMs for fitting and tracking. This approach first learns the correspondences between landmarks in a frontal image and a set of training images with a face in arbitrary poses. Using this learner, virtual images of unseen faces at any pose for which the learner was trained can be reconstructed by predicting the new landmark locations and warping the texture from the frontal image. View-based AAMs are then built from the virtual images and used for automatically annotating unseen images, including images of different facial expressions, at any pose within the maximum range spanned by the virtually reconstructed images. The approach is experimentally validated by automatically annotating face images from three different databases. © 2009 IEEE.

Relevance:

80.00%

Publisher:

Abstract:

This paper presents the first performance evaluation of interest points on scalar volumetric data. Such data encodes 3D shape, a fundamental property of objects. The use of another such property, texture (i.e. 2D surface colouration), or appearance, for object detection, recognition and registration has been well studied; 3D shape less so. However, the increasing prevalence of 3D shape acquisition techniques and the diminishing returns to be had from appearance alone have seen a surge in 3D shape-based methods. In this work, we investigate the performance of several state-of-the-art interest point detectors on volumetric data, in terms of repeatability and the number and nature of interest points. Such methods form the first step in many shape-based applications. Our detailed comparison, with both quantitative and qualitative measures on synthetic and real 3D data, both point-based and volumetric, aids readers in selecting a method suitable for their application. © 2012 Springer Science+Business Media, LLC.
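To make the repeatability criterion mentioned in the abstract above concrete, here is a minimal sketch of one simplified way to score it. This is an illustration only, not the measure used in the paper: the function name `repeatability`, the distance tolerance `tol`, and the choice to normalise by the number of reference points are all assumptions.

```python
import numpy as np

def repeatability(points_ref, points_test, transform, tol=1.5):
    """Fraction of reference interest points re-detected in a second volume.

    points_ref, points_test: (N, 3) and (M, 3) arrays of 3D point locations.
    transform: callable mapping reference coordinates into the test volume
               (assumed known, e.g. from a synthetic ground-truth warp).
    tol: maximum distance (in voxels) for two points to count as the same.
    """
    ref = np.asarray(points_ref, dtype=float)
    test = np.asarray(points_test, dtype=float)
    if len(ref) == 0 or len(test) == 0:
        return 0.0
    mapped = transform(ref)
    # Pairwise Euclidean distances between mapped reference and test points.
    dists = np.linalg.norm(mapped[:, None, :] - test[None, :, :], axis=2)
    # A reference point is "repeated" if any test point lies within tol.
    repeated = np.count_nonzero(dists.min(axis=1) <= tol)
    return repeated / len(ref)
```

A score of 1.0 means every reference point has a counterpart within `tol` after the transform. Other normalisations (for example, dividing by the smaller of the two point counts) are also in use, so scores are only comparable under the same convention.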

Relevance:

80.00%

Publisher:

Abstract:

This work addresses the challenging problem of unconstrained 3D human pose estimation (HPE) from a novel perspective. Existing approaches struggle to operate in realistic applications, mainly due to their scene-dependent priors, such as background segmentation and multi-camera networks, which restrict their use in unconstrained environments. We therefore present a framework which applies action detection and 2D pose estimation techniques to infer 3D poses in an unconstrained video. Action detection offers spatiotemporal priors to 3D human pose estimation by both recognising and localising actions in space-time. Instead of holistic features, e.g. silhouettes, we leverage the flexibility of the deformable part model to detect 2D body parts as features for estimating 3D poses. A new unconstrained pose dataset has been collected to demonstrate the feasibility of our method, which shows promising results, significantly outperforming the relevant state-of-the-art methods. © 2013 IEEE.

Relevance:

80.00%

Publisher:

Abstract:

This paper presents a complete system for expressive visual text-to-speech (VTTS), which is capable of producing expressive output, in the form of a 'talking head', given an input text and a set of continuous expression weights. The face is modeled using an active appearance model (AAM), and several extensions are proposed which make it more applicable to the task of VTTS. The model allows for normalization with respect to both pose and blink state, which significantly reduces artifacts in the resulting synthesized sequences. We demonstrate quantitative improvements in terms of reconstruction error over a million frames, as well as in large-scale user studies comparing the output of different systems. © 2013 IEEE.

Relevance:

80.00%

Publisher:

Abstract:

Temporal synchronization of multiple video recordings of the same dynamic event is a critical task in many computer vision applications, e.g. novel view synthesis and 3D reconstruction. Typically this information is implied, since recordings are made using the same timebase, or time-stamp information is embedded in the video streams. Recordings using consumer-grade equipment do not contain this information; hence, there is a need to temporally synchronize signals using the visual information itself. Previous work in this area has either assumed good quality data with relatively simple dynamic content or the availability of precise camera geometry. In this paper, we propose a technique which exploits feature trajectories across views in a novel way, and specifically targets the kind of complex content found in consumer-generated sports recordings, without assuming precise knowledge of fundamental matrices or homographies. Our method automatically selects the moving feature points in the two unsynchronized videos whose 2D trajectories can be best related, thereby helping to infer the synchronization index. We evaluate performance using a number of real recordings and show that synchronization can be achieved to within 1 sec, which is better than previous approaches. Copyright 2013 ACM.

Relevance:

80.00%

Publisher:

Abstract:

Localization of chess-board vertices is a common task in computer vision, underpinning many applications, but relatively little work focusses on designing a specific feature detector that is fast, accurate and robust. In this paper the 'Chess-board Extraction by Subtraction and Summation' (ChESS) feature detector, designed to exclusively respond to chess-board vertices, is presented. The method proposed is robust against noise, poor lighting and poor contrast, requires no prior knowledge of the extent of the chess-board pattern, is computationally very efficient, and provides a strength measure of detected features. Such a detector has significant application both in the key field of camera calibration and in structured light 3D reconstruction. Evidence is presented showing its superior robustness, accuracy, and efficiency in comparison to other commonly used detectors, including Harris & Stephens and SUSAN, both under simulation and in experimental 3D reconstruction of flat plate and cylindrical objects. © 2013 Elsevier Inc. All rights reserved.

Relevance:

80.00%

Publisher:

Abstract:

Statistical analysis of diffusion tensor imaging (DTI) data requires a computational framework that is both numerically tractable (to account for the high-dimensional nature of the data) and geometric (to account for the nonlinear nature of diffusion tensors). Building upon earlier studies exploiting a Riemannian framework to address these challenges, the present paper proposes a novel metric and an accompanying computational framework for DTI data processing. The proposed approach grounds the signal processing operations in interpolating curves. Well-chosen interpolating curves are shown to provide a computational framework that is at the same time tractable and information-relevant for DTI processing. In addition, and in contrast to earlier methods, it provides an interpolation method which preserves anisotropy, a central piece of information carried by diffusion tensor data. © 2013 Springer Science+Business Media New York.

Relevance:

80.00%

Publisher:

Abstract:

Temporal synchronization of multiple video recordings of the same dynamic event is a critical task in many computer vision applications, e.g. novel view synthesis and 3D reconstruction. Typically this information is implied through the time-stamp information embedded in the video streams. User-generated videos shot using consumer-grade equipment do not contain this information; hence, there is a need to temporally synchronize signals using the visual information itself. Previous work in this area has either assumed good quality data with relatively simple dynamic content or the availability of precise camera geometry. Our first contribution is a synchronization technique which tries to establish correspondence between feature trajectories across views in a novel way, and specifically targets the kind of complex content found in consumer-generated sports recordings, without assuming precise knowledge of fundamental matrices or homographies. We evaluate performance using a number of real video recordings and show that our method is able to synchronize to within 1 sec, which is significantly better than previous approaches. Our second contribution is a robust and unsupervised view-invariant activity recognition descriptor that exploits recurrence plot theory on spatial tiles. The descriptor on its own is shown to characterize activities from different views under occlusions better than state-of-the-art approaches. We combine this descriptor with our proposed synchronization method and show that it can further refine the synchronization index. © 2013 ACM.
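As a rough illustration of synchronizing two recordings from visual information alone, the sketch below estimates a frame offset by cross-correlating two per-frame motion signals. This is a deliberately simpler, swapped-in technique, not the trajectory-matching method described in the abstract. It assumes each video has already been reduced to a one-dimensional signal (for example, mean frame-difference magnitude) sampled at the same frame rate; the function name `estimate_frame_offset` is hypothetical.

```python
import numpy as np

def estimate_frame_offset(signal_a, signal_b):
    """Estimate the lag k (in frames) such that signal_a[n + k] ~ signal_b[n].

    Both inputs are 1D per-frame activity signals (an assumption of this
    sketch) taken from two recordings of the same event.
    """
    a = np.asarray(signal_a, dtype=float)
    b = np.asarray(signal_b, dtype=float)
    # Remove the mean so the peak reflects shared variation, not brightness.
    a = a - a.mean()
    b = b - b.mean()
    # Full cross-correlation: entry i corresponds to lag i - (len(b) - 1).
    corr = np.correlate(a, b, mode="full")
    lags = np.arange(-(len(b) - 1), len(a))
    return int(lags[np.argmax(corr)])
```

Dividing the returned lag by the frame rate gives the offset in seconds. A positive lag means an event appears later (at a higher frame index) in the first recording than in the second.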

Relevance:

80.00%

Publisher:

Abstract:

Relative (comparative) attributes are promising for thematic ranking of visual entities, which also aids in recognition tasks. However, attribute rank learning often requires a substantial amount of relational supervision, which is highly tedious and apparently impractical for real-world applications. In this paper, we introduce the Semantic Transform, which, under minimal supervision, adaptively finds a semantic feature space along with a class ordering that is related in the best possible way. Such a semantic space is found for every attribute category. To relate the classes under weak supervision, the class ordering needs to be refined according to a cost function in an iterative procedure. This problem is ideally NP-hard, and we thus propose a constrained search tree formulation for it. Driven by the adaptive semantic feature space representation, our model achieves the best results to date for all of the tasks of relative, absolute and zero-shot classification on two popular datasets. © 2013 IEEE.

Relevance:

80.00%

Publisher:

Abstract:

Multi-frame image super-resolution (SR) aims to utilize information from a set of low-resolution (LR) images to compose a high-resolution (HR) one. As it is desirable or essential in many real applications, recent years have witnessed growing interest in the problem of multi-frame SR reconstruction. These algorithms commonly use a linear observation model to relate the recorded LR images to the unknown reconstructed HR image estimate. Recently, regularization-based schemes have been demonstrated to be effective because SR reconstruction is an ill-posed problem. Working within this promising framework, this paper first proposes two new regularization items, termed locally adaptive bilateral total variation and consistency of gradients, to keep edges sharp and flat regions smooth, both of which are implicitly described in the LR images. The combination of the proposed regularization items is superior to existing regularization items because it considers both edges and flat regions, while existing ones consider only edges. Thorough experimental results show the effectiveness of the new algorithm for SR reconstruction. © 2009 Elsevier B.V. All rights reserved.
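For orientation, the linear observation model and regularized reconstruction referred to in the abstract above are commonly written in the following generic form. This is a sketch of the standard formulation from the SR literature, not the paper's own notation. The symbols are assumptions: $y_k$ is the $k$-th LR image, $x$ the HR image, $M_k$ a warp, $B_k$ a blur, $D$ a downsampling operator, $n_k$ noise, $\lambda$ a regularization weight, and $S_x^{l}$, $S_y^{m}$ shifts by $l$ and $m$ pixels. The regularizer shown is the standard (non-adaptive) bilateral total variation; the locally adaptive variant and the consistency-of-gradients term proposed in the paper are not reproduced here.

```latex
% Linear observation model for K low-resolution frames
y_k = D \, B_k \, M_k \, x + n_k , \qquad k = 1, \dots, K

% Regularized reconstruction of the high-resolution image
\hat{x} = \arg\min_{x} \; \sum_{k=1}^{K} \left\| D \, B_k \, M_k \, x - y_k \right\|_p^p
          + \lambda \, R(x)

% Standard bilateral total variation regularizer, with 0 < \alpha < 1
R_{\mathrm{BTV}}(x) = \sum_{l=-P}^{P} \sum_{m=-P}^{P}
          \alpha^{|l| + |m|} \left\| x - S_x^{l} S_y^{m} x \right\|_1
```

Because many HR images are consistent with the same LR observations, the data term alone is ill-posed. The regularizer $R(x)$ selects among the candidates, and the exact index ranges in the BTV sum vary slightly between presentations.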