144 results for Unresolved vision problem
Innovative Stereo Vision-Based Approach to Generate Dense Depth Map of Transportation Infrastructure
Abstract:
Three-dimensional (3-D) spatial data of a transportation infrastructure contain useful information for civil engineering applications, including as-built documentation, on-site safety enhancements, and progress monitoring. Several techniques have been developed for acquiring 3-D point coordinates of infrastructure, such as laser scanning. Although the method yields accurate results, the high device costs and human effort required render the process infeasible for generic applications in the construction industry. A quick and reliable approach, which is based on the principles of stereo vision, is proposed for generating a depth map of an infrastructure. Initially, two images are captured by two similar stereo cameras at the scene of the infrastructure. A Harris feature detector is used to extract feature points from the first view, and an innovative adaptive window-matching technique is used to compute feature point correspondences in the second view. A robust algorithm computes the nonfeature point correspondences. Thus, the correspondences of all the points in the scene are obtained. After all correspondences have been obtained, the geometric principles of stereo vision are used to generate a dense depth map of the scene. The proposed algorithm has been tested on several data sets, and results illustrate its potential for stereo correspondence and depth map generation.
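The final step described above, turning point correspondences into depth, can be sketched for the simplest case of a rectified stereo pair, where depth follows Z = f * B / d (focal length times baseline over disparity). This is a minimal illustration and not the paper's implementation: it assumes rectified images, a focal length in pixels, and a baseline in metres, and the function names and numbers are hypothetical.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth in metres of one point matched at columns x_left and x_right
    of a rectified stereo pair (assumed setup, not taken from the paper)."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity


def depth_map(disparities, focal_px, baseline_m):
    """Convert a 2-D list of disparities (pixels) into depths (metres).
    Entries without a valid positive disparity become None."""
    return [
        [focal_px * baseline_m / d if d > 0 else None for d in row]
        for row in disparities
    ]


if __name__ == "__main__":
    # Hypothetical values: 800 px focal length, 0.5 m baseline.
    print(depth_from_disparity(420.0, 400.0, 800.0, 0.5))
    print(depth_map([[10, 0], [40, 20]], 800.0, 0.5))
```

Because depth is inversely proportional to disparity, small matching errors on distant points produce large depth errors, which is one reason the quality of the correspondence step matters so much.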
Abstract:
Camera motion estimation is one of the most significant steps in structure-from-motion (SFM) with a monocular camera. The normalized 8-point, the 7-point, and the 5-point algorithms are normally adopted to perform the estimation, each of which has distinct performance characteristics. Given the unique needs and challenges associated with civil infrastructure SFM scenarios, selection of the proper algorithm directly impacts the structure reconstruction results. In this paper, a comparison study of the aforementioned algorithms is conducted to identify the most suitable algorithm, in terms of accuracy and reliability, for reconstructing civil infrastructure. The free variables tested are baseline, depth, and motion. A concrete girder bridge was selected as the "test-bed" and reconstructed using an off-the-shelf camera, with imagery taken from all possible positions that maximally capture the bridge's features and geometry. The feature points in the images were extracted and matched via the SURF descriptor. Finally, camera motions were estimated from the corresponding image points by applying the aforementioned algorithms, and the results were evaluated.
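The "normalized" in the normalized 8-point algorithm refers to a preconditioning step usually attributed to Hartley: before the linear system is solved, the image points are translated so their centroid sits at the origin and scaled so their mean distance from the origin is sqrt(2). The sketch below shows only that step in plain Python. The function name is hypothetical and nothing here is taken from the paper's own code.

```python
import math


def hartley_normalize(points):
    """Translate 2-D points to zero centroid and scale them so the mean
    distance from the origin is sqrt(2).

    Returns the normalized points and the 3x3 similarity transform T
    (as nested lists) that maps homogeneous input points to them."""
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_dist = sum(math.hypot(x - cx, y - cy) for x, y in points) / n
    if mean_dist == 0:
        raise ValueError("all points coincide; scale is undefined")
    s = math.sqrt(2) / mean_dist
    transform = [[s, 0.0, -s * cx], [0.0, s, -s * cy], [0.0, 0.0, 1.0]]
    normalized = [(s * (x - cx), s * (y - cy)) for x, y in points]
    return normalized, transform


if __name__ == "__main__":
    # Hypothetical pixel coordinates of matched features in one image.
    pts = [(100.0, 200.0), (640.0, 180.0), (320.0, 400.0), (50.0, 50.0)]
    normalized, T = hartley_normalize(pts)
    print(normalized)
    print(T)
```

A fundamental matrix estimated from points normalized in both images has to be mapped back with the two transforms before it applies to the original pixel coordinates.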
Abstract:
Vision-based tracking can provide the spatial location of project-related entities such as equipment, workers, and materials in a large-scale, congested construction site. It tracks entities in a video stream by inferring their motion. To initiate the process, the pixel areas of the entities to be tracked in the subsequent video frames must first be determined. To fully automate the process, this paper presents an automated way of initializing trackers using the Semantic Texton Forests (STFs) method. The STFs method simultaneously segments the image and classifies the segments based on low-level semantic information and context information. In this paper, the STFs method is tested on wheel loader recognition. In the experiments, wheel loaders are further divided into several parts, such as wheels and body parts, to help learn the context information. The results show 79% accuracy in recognizing the pixel areas of the wheel loader. These results indicate that the STFs method has the potential to automate the initialization process of vision-based tracking.
Abstract:
Pavement condition assessment is essential when developing road network maintenance programs. In practice, pavement sensing is to a large extent automated for highway networks. Municipal roads, however, are predominantly surveyed manually because of the limited number of expensive inspection vehicles. As part of a research project that proposes an omnipresent passenger vehicle network for comprehensive and cheap condition surveying of municipal road networks, this paper deals with pothole recognition. Existing methods either rely on expensive and high-maintenance range sensors, or make use of acceleration data, which can only provide preliminary and rough condition surveys. In our previous work we created a pothole detection method for pavement images. In this paper we present an improved recognition method for pavement videos that incrementally updates the texture signature for intact pavement regions and uses vision tracking to track detected potholes. The method is tested and results demonstrate its reasonable efficiency.
Abstract:
Existing machine vision-based 3D reconstruction software programs provide a promising low-cost and, in some cases, automatic solution for infrastructure as-built documentation. However, in several steps of the reconstruction process they rely only on detecting and matching corner-like features in multiple views of a scene. Therefore, in infrastructure scenes that include uniform materials and poorly textured surfaces, these programs are likely to fail owing to a lack of feature points. Moreover, except for a few programs that generate dense 3D models through significantly time-consuming algorithms, most provide only a sparse reconstruction that does not necessarily include required points such as corners or edges. These points then have to be matched manually across different views, which can make the process considerably laborious. To address these limitations, this paper presents a video-based as-built documentation method that automatically builds detailed 3D maps of a scene by aligning edge points between video frames. Compared to corner-like features, edge points are far more plentiful even in untextured scenes and often carry important semantic associations. The method has been tested on poorly textured infrastructure scenes, and the results indicate that a combination of edge and corner-like features would allow dealing with a broader range of scenes.
Abstract:
Vision-based tracking can provide the spatial location of construction entities such as equipment, workers, and materials in large-scale, congested construction sites. It tracks entities in video streams by inferring their locations based on the entities' visual features and motion histories. To initiate the process, it is necessary to determine the pixel areas corresponding to the construction entities to be tracked in the subsequent video frames. To fully automate the process, an automated way of initialization is needed. This paper presents a method for construction worker detection that can automatically recognize and localize construction workers in video frames. The method first finds the foreground areas of moving objects using a background subtraction method. Within these foreground areas, construction workers are recognized based on the histogram of oriented gradients (HOG) and the histogram of HSV colors. HOGs have proved to work effectively for detecting people, and the histogram of HSV colors helps differentiate between pedestrians and construction workers wearing safety vests. Preliminary experiments show that the proposed method has the potential to automate the initialization process of vision-based tracking.
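The color cue mentioned above can be illustrated with a small hue histogram built from the pixels of a candidate foreground region. This is only a sketch of the general idea, using Python's standard `colorsys` module. The bin count, the saturation threshold, and the function name are assumptions made for illustration, and the abstract does not say how the authors computed or classified their histograms.

```python
import colorsys


def hue_histogram(pixels, bins=8, min_saturation=0.2):
    """Normalized hue histogram of RGB pixels with 0-255 channels.

    Pixels with saturation below min_saturation (greys, whites, blacks)
    are skipped because their hue carries little information.
    bins and min_saturation are illustrative defaults, not paper values."""
    counts = [0] * bins
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if s < min_saturation:
            continue
        # h lies in [0, 1); the min() guards against a value of exactly 1.0.
        counts[min(int(h * bins), bins - 1)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * bins
    return [c / total for c in counts]


if __name__ == "__main__":
    # Hypothetical region: mostly saturated red pixels, one green, one grey.
    region = [(255, 0, 0)] * 3 + [(0, 255, 0), (128, 128, 128)]
    print(hue_histogram(region))
```

A region dominated by the saturated hues typical of a safety vest yields a histogram concentrated in one or two bins, which a classifier could use alongside shape features such as HOG.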
Abstract:
Most existing automated machine vision-based techniques for as-built documentation of civil infrastructure use only point features to recover the 3D structure of a scene. However, in man-made structures (e.g. buildings and roofs) it is often the case that not enough point features can be reliably detected, which can cause these techniques to fail. To address the problem, this paper exploits the prominence of straight lines in infrastructure scenes. It presents a hybrid approach that benefits from both point and line features. A calibrated stereo set of video cameras is used to collect data. Point and line features are then detected and matched across video frames. Finally, the 3D structure of the scene is recovered by finding the 3D coordinates of the matched features. The proposed approach has been tested in realistic outdoor environments, and preliminary results indicate its capability to deal with a variety of scenes.
Abstract:
This book will be of particular interest to academics, researchers, and graduate students at universities and industrial practitioners seeking to apply mobile and pervasive computing systems to improve construction industry productivity.