119 resultados para computer vision, geometric variations, congealing, unsupervised image alignment
Resumo:
We present a multispectral photometric stereo method for capturing geometry of deforming surfaces. A novel photometric calibration technique allows calibration of scenes containing multiple piecewise constant chromaticities. This method estimates per-pixel photometric properties, then uses a RANSAC-based approach to estimate the dominant chromaticities in the scene. A likelihood term is developed linking surface normal, image intensity and photometric properties, which allows estimating the number of chromaticities present in a scene to be framed as a model estimation problem. The Bayesian Information Criterion is applied to automatically estimate the number of chromaticities present during calibration. A two-camera stereo system provides low resolution geometry, allowing the likelihood term to be used in segmenting new images into regions of constant chromaticity. This segmentation is carried out in a Markov Random Field framework and allows the correct photometric properties to be used at each pixel to estimate a dense normal map. Results are shown on several challenging real-world sequences, demonstrating state-of-the-art results using only two cameras and three light sources. Quantitative evaluation is provided against synthetic ground truth data. © 2011 IEEE.
Resumo:
This book explores the processes for retrieval, classification, and integration of construction images in AEC/FM model based systems. The author describes a combination of techniques from the areas of image and video processing, computer vision, information retrieval, statistics and content-based image and video retrieval that have been integrated into a novel method for the retrieval of related construction site image data from components of a project model. This method has been tested on available construction site images from a variety of sources like past and current building construction and transportation projects and is able to automatically classify, store, integrate and retrieve image data files in inter-organizational systems so as to allow their usage in project management related tasks. objects. Therefore, automated methods for the integration of construction images are important for construction information management. During this research, processes for retrieval, classification, and integration of construction images in AEC/FM model based systems have been explored. Specifically, a combination of techniques from the areas of image and video processing, computer vision, information retrieval, statistics and content-based image and video retrieval have been deployed in order to develop a methodology for the retrieval of related construction site image data from components of a project model. This method has been tested on available construction site images from a variety of sources like past and current building construction and transportation projects and is able to automatically classify, store, integrate and retrieve image data files in inter-organizational systems so as to allow their usage in project management related tasks.
Resumo:
The Architecture, Engineering, Construction and Facilities Management (AEC/FM) industry is rapidly becoming a multidisciplinary, multinational and multi-billion dollar economy, involving large numbers of actors working concurrently at different locations and using heterogeneous software and hardware technologies. Since the beginning of the last decade, a great deal of effort has been spent within the field of construction IT in order to integrate data and information from most computer tools used to carry out engineering projects. For this purpose, a number of integration models have been developed, like web-centric systems and construction project modeling, a useful approach in representing construction projects and integrating data from various civil engineering applications. In the modern, distributed and dynamic construction environment it is important to retrieve and exchange information from different sources and in different data formats in order to improve the processes supported by these systems. Previous research demonstrated that a major hurdle in AEC/FM data integration in such systems is caused by its variety of data types and that a significant part of the data is stored in semi-structured or unstructured formats. Therefore, new integrative approaches are needed to handle non-structured data types like images and text files. This research is focused on the integration of construction site images. These images are a significant part of the construction documentation with thousands stored in site photographs logs of large scale projects. However, locating and identifying such data needed for the important decision making processes is a very hard and time-consuming task, while so far, there are no automated methods for associating them with other related objects. Therefore, automated methods for the integration of construction images are important for construction information management. During this research, processes for retrieval, classification, and integration of construction images in AEC/FM model based systems have been explored. Specifically, a combination of techniques from the areas of image and video processing, computer vision, information retrieval, statistics and content-based image and video retrieval have been deployed in order to develop a methodology for the retrieval of related construction site image data from components of a project model. This method has been tested on available construction site images from a variety of sources like past and current building construction and transportation projects and is able to automatically classify, store, integrate and retrieve image data files in inter-organizational systems so as to allow their usage in project management related tasks.
Resumo:
The US National Academy of Engineering recently identified restoring and improving urban infrastructure as one of the grand challenges of engineering. Part of this challenge stems from the lack of viable methods to map/label existing infrastructure. For computer vision, this challenge becomes “How can we automate the process of extracting geometric, object oriented models of infrastructure from visual data?” Object recognition and reconstruction methods have been successfully devised and/or adapted to answer this question for small or linear objects (e.g. columns). However, many infrastructure objects are large and/or planar without significant and distinctive features, such as walls, floor slabs, and bridge decks. How can we recognize and reconstruct them in a 3D model? In this paper, strategies for infrastructure object recognition and reconstruction are presented, to set the stage for posing the question above and discuss future research in featureless, large/planar object recognition and modeling.
Resumo:
This paper presents a novel way to speed up the evaluation time of a boosting classifier. We make a shallow (flat) network deep (hierarchical) by growing a tree from decision regions of a given boosting classifier. The tree provides many short paths for speeding up while preserving the reasonably smooth decision regions of the boosting classifier for good generalisation. For converting a boosting classifier into a decision tree, we formulate a Boolean optimization problem, which has been previously studied for circuit design but limited to a small number of binary variables. In this work, a novel optimisation method is proposed for, firstly, several tens of variables i.e. weak-learners of a boosting classifier, and then any larger number of weak-learners by using a two-stage cascade. Experiments on the synthetic and face image data sets show that the obtained tree achieves a significant speed up both over a standard boosting classifier and the Fast-exit-a previously described method for speeding-up boosting classification, at the same accuracy. The proposed method as a general meta-algorithm is also useful for a boosting cascade, where it speeds up individual stage classifiers by different gains. The proposed method is further demonstrated for fast-moving object tracking and segmentation problems. © 2011 Springer Science+Business Media, LLC.
Resumo:
We present a model for early vision tasks such as denoising, super-resolution, deblurring, and demosaicing. The model provides a resolution-independent representation of discrete images which admits a truly rotationally invariant prior. The model generalizes several existing approaches: variational methods, finite element methods, and discrete random fields. The primary contribution is a novel energy functional which has not previously been written down, which combines the discrete measurements from pixels with a continuous-domain world viewed through continous-domain point-spread functions. The value of the functional is that simple priors (such as total variation and generalizations) on the continous-domain world become realistic priors on the sampled images. We show that despite its apparent complexity, optimization of this model depends on just a few computational primitives, which although tedious to derive, can now be reused in many domains. We define a set of optimization algorithms which greatly overcome the apparent complexity of this model, and make possible its practical application. New experimental results include infinite-resolution upsampling, and a method for obtaining subpixel superpixels. © 2012 IEEE.
Resumo:
The loss mechanisms which control 2D incidence range are discussed with an emphasis on determining which real in-service geometric variations will have the largest impact. For the majority of engine compressor blades (Minlet>0.55) both the negative and positive incidence limits are controlled by supersonic patches. It is shown that these patches are highly sensitive to the geometric variations close to, and around the leading edge. The variations used in this study were measured from newly manufactured as well as ex-service blades. Over most the high pressure compressor considered, it was shown that manufacture variations dominated. The first part of the paper shows that, despite large geometric variations (~10% of leading edge thickness), the incidence range responded in a linear way. The result of this is that the geometric variations have little effect on the mean incidence range of a row of blades. In the second part of the paper a region of the design space is identified where non-linear behavior can result in a 10% reduction in positive incidence range. The mechanism for this is reported and design guidelines for its avoidance offered. In the final part of the paper, the linear behavior at negative incidence and the transonic nature of the flow is exploited to design a robust asymmetric leading edge with a 5% increase in incidence range.
Resumo:
Statistical approaches for building non-rigid deformable models, such as the Active Appearance Model (AAM), have enjoyed great popularity in recent years, but typically require tedious manual annotation of training images. In this paper, a learning based approach for the automatic annotation of visually deformable objects from a single annotated frontal image is presented and demonstrated on the example of automatically annotating face images that can be used for building AAMs for fitting and tracking. This approach employs the idea of initially learning the correspondences between landmarks in a frontal image and a set of training images with a face in arbitrary poses. Using this learner, virtual images of unseen faces at any arbitrary pose for which the learner was trained can be reconstructed by predicting the new landmark locations and warping the texture from the frontal image. View-based AAMs are then built from the virtual images and used for automatically annotating unseen images, including images of different facial expressions, at any random pose within the maximum range spanned by the virtually reconstructed images. The approach is experimentally validated by automatically annotating face images from three different databases. © 2009 IEEE.
Resumo:
Localization of chess-board vertices is a common task in computer vision, underpinning many applications, but relatively little work focusses on designing a specific feature detector that is fast, accurate and robust. In this paper the 'Chess-board Extraction by Subtraction and Summation' (ChESS) feature detector, designed to exclusively respond to chess-board vertices, is presented. The method proposed is robust against noise, poor lighting and poor contrast, requires no prior knowledge of the extent of the chess-board pattern, is computationally very efficient, and provides a strength measure of detected features. Such a detector has significant application both in the key field of camera calibration, as well as in structured light 3D reconstruction. Evidence is presented showing its superior robustness, accuracy, and efficiency in comparison to other commonly used detectors, including Harris & Stephens and SUSAN, both under simulation and in experimental 3D reconstruction of flat plate and cylindrical objects. © 2013 Elsevier Inc. All rights reserved.
Resumo:
Statistical analysis of diffusion tensor imaging (DTI) data requires a computational framework that is both numerically tractable (to account for the high dimensional nature of the data) and geometric (to account for the nonlinear nature of diffusion tensors). Building upon earlier studies exploiting a Riemannian framework to address these challenges, the present paper proposes a novel metric and an accompanying computational framework for DTI data processing. The proposed approach grounds the signal processing operations in interpolating curves. Well-chosen interpolating curves are shown to provide a computational framework that is at the same time tractable and information relevant for DTI processing. In addition, and in contrast to earlier methods, it provides an interpolation method which preserves anisotropy, a central information carried by diffusion tensor data. © 2013 Springer Science+Business Media New York.
Resumo:
Temporal synchronization of multiple video recordings of the same dynamic event is a critical task in many computer vision applications e.g. novel view synthesis and 3D reconstruction. Typically this information is implied through the time-stamp information embedded in the video streams. User-generated videos shot using consumer grade equipment do not contain this information; hence, there is a need to temporally synchronize signals using the visual information itself. Previous work in this area has either assumed good quality data with relatively simple dynamic content or the availability of precise camera geometry. Our first contribution is a synchronization technique which tries to establish correspondence between feature trajectories across views in a novel way, and specifically targets the kind of complex content found in consumer generated sports recordings, without assuming precise knowledge of fundamental matrices or homographies. We evaluate performance using a number of real video recordings and show that our method is able to synchronize to within 1 sec, which is significantly better than previous approaches. Our second contribution is a robust and unsupervised view-invariant activity recognition descriptor that exploits recurrence plot theory on spatial tiles. The descriptor is individually shown to better characterize the activities from different views under occlusions than state-of-the-art approaches. We combine this descriptor with our proposed synchronization method and show that it can further refine the synchronization index. © 2013 ACM.