914 resultados para stereo depth
Resumo:
We investigate the properties of feedforward neural networks trained with Hebbian learning algorithms. A new unsupervised algorithm is proposed which produces statistically uncorrelated outputs. The algorithm causes the weights of the network to converge to the eigenvectors of the input correlation with largest eigenvalues. The algorithm is closely related to the technique of Self-supervised Backpropagation, as well as other algorithms for unsupervised learning. Applications of the algorithm to texture processing, image coding, and stereo depth edge detection are given. We show that the algorithm can lead to the development of filters qualitatively similar to those found in primate visual cortex.
Resumo:
The staircase is presented as the architectural component that most potently embodies thresholds, boundaries and passages due to its diagonal orientation and essence as an intermediary zone. Connections then are made between the kinesthetic requirements of traversing a staircase and viewing a stereoscopic photograph. From this foundation, the haptic essence of stereoscopic photography is proposed as uniquely qualified medium through which to view a staircase and therefore thresholds, boundaries, and passages within architecture. Analyses of stereoviews of staircases in the Palais de Justice in Brussels, the Library of Congress in Washington, and the Palais Garnier (Opéra) in Paris close the essay.
Innovative Stereo Vision-Based Approach to Generate Dense Depth Map of Transportation Infrastructure
Resumo:
Three-dimensional (3-D) spatial data of a transportation infrastructure contain useful information for civil engineering applications, including as-built documentation, on-site safety enhancements, and progress monitoring. Several techniques have been developed for acquiring 3-D point coordinates of infrastructure, such as laser scanning. Although the method yields accurate results, the high device costs and human effort required render the process infeasible for generic applications in the construction industry. A quick and reliable approach, which is based on the principles of stereo vision, is proposed for generating a depth map of an infrastructure. Initially, two images are captured by two similar stereo cameras at the scene of the infrastructure. A Harris feature detector is used to extract feature points from the first view, and an innovative adaptive window-matching technique is used to compute feature point correspondences in the second view. A robust algorithm computes the nonfeature point correspondences. Thus, the correspondences of all the points in the scene are obtained. After all correspondences have been obtained, the geometric principles of stereo vision are used to generate a dense depth map of the scene. The proposed algorithm has been tested on several data sets, and results illustrate its potential for stereo correspondence and depth map generation.
Innovative Stereo Vision-Based Approach to Generate Dense Depth Map of Transportation Infrastructure
Resumo:
Three-dimensional (3-D) spatial data of a transportation infrastructure contain useful information for civil engineering applications, including as-built documentation, on-site safety enhancements, and progress monitoring. Several techniques have been developed for acquiring 3-D point coordinates of infrastructure, such as laser scanning. Although the method yields accurate results, the high device costs and human effort required render the process infeasible for generic applications in the construction industry. A quick and reliable approach, which is based on the principles of stereo vision, is proposed for generating a depth map of an infrastructure. Initially, two images are captured by two similar stereo cameras at the scene of the infrastructure. A Harris feature detector is used to extract feature points from the first view, and an innovative adaptive window-matching technique is used to compute feature point correspondences in the second view. A robust algorithm computes the nonfeature point correspondences. Thus, the correspondences of all the points in the scene are obtained. After all correspondences have been obtained, the geometric principles of stereo vision are used to generate a dense depth map of the scene. The proposed algorithm has been tested on several data sets, and results illustrate its potential for stereo correspondence and depth map generation.
Resumo:
We present a system for augmenting depth camera output using multispectral photometric stereo. The technique is demonstrated using a Kinect sensor and is able to produce geometry independently for each frame. Improved reconstruction is demonstrated using the Kinect's inbuilt RGB camera and further improvements are achieved by introducing an additional high resolution camera. As well as qualitative improvements in reconstruction a quantitative reduction in temporal noise is shown. As part of the system an approach is presented for relaxing the assumption of multispectral photometric stereo that scenes are of constant chromaticity to the assumption that scenes contain multiple piecewise constant chromaticities.
Resumo:
With luminance gratings, psychophysical thresholds for detecting a small increase in the contrast of a weak ‘pedestal’ grating are 2–3 times lower than for detection of a grating when the pedestal is absent. This is the ‘dipper effect’ – a reliable improvement whose interpretation remains controversial. Analogies between luminance and depth (disparity) processing have attracted interest in the existence of a ‘disparity dipper’. Are thresholds for disparity modulation (corrugated surfaces), facilitated by the presence of a weak disparity-modulated pedestal? We used a 14-bit greyscale to render small disparities accurately, and measured 2AFC discrimination thresholds for disparity modulation (0.3 or 0.6 c/deg) of a random texture at various pedestal levels. In the first experiment, a clear dipper was found. Thresholds were about 2× lower with weak pedestals than without. But here the phase of modulation (0 or 180 deg) was varied from trial to trial. In a noisy signal-detection framework, this creates uncertainty that is reduced by the pedestal, which thus improves performance. When the uncertainty was eliminated by keeping phase constant within sessions, the dipper effect was weak or absent. Monte Carlo simulations showed that the influence of uncertainty could account well for the results of both experiments. A corollary is that the visual depth response to small disparities is probably linear, with no threshold-like nonlinearity.
Resumo:
Measurement of detection and discrimination thresholds yields information about visual signal processing. For luminance contrast, we are 2 - 3 times more sensitive to a small increase in the contrast of a weak 'pedestal' grating, than when the pedestal is absent. This is the 'dipper effect' - a reliable improvement whose interpretation remains controversial. Analogies between luminance and depth (disparity) processing have attracted interest in the existence of a 'disparity dipper' - are thresholds for disparity, or disparity modulation (corrugated surfaces), facilitated by the presence of a weak pedestal? Lunn and Morgan (1997 Journal of the Optical Society of America A 14 360 - 371) found no dipper for disparity-modulated gratings, but technical limitations (8-bit greyscale) might have prevented the necessary measurement of very small disparity thresholds. We used a true 14-bit greyscale to render small disparities accurately, and measured 2AFC discrimination thresholds for disparity modulation (0.6 cycle deg-1) of a random texture at various pedestal levels. Which interval contained greater modulation of depth? In the first experiment, a clear dipper was found. Thresholds were about 2X1 lower with weak pedestals than without. But here the phase of modulation (0° or 180°) was randomised from trial to trial. In a noisy signal-detection framework, this creates uncertainty that is reduced by the pedestal, thus improving performance. When the uncertainty was eliminated by keeping phase constant within sessions, the dipper effect disappeared, confirming Lunn and Morgan's result. The absence of a dipper, coupled with shallow psychometric slopes, suggests that the visual response to small disparities is essentially linear, with no threshold-like nonlinearity.
Resumo:
Stereo vision is a method of depth perception, in which depth information is inferred from two (or more) images of a scene, taken from different perspectives. Practical applications for stereo vision include aerial photogrammetry, autonomous vehicle guidance, robotics and industrial automation. The initial motivation behind this work was to produce a stereo vision sensor for mining automation applications. For such applications, the input stereo images would consist of close range scenes of rocks. A fundamental problem faced by matching algorithms is the matching or correspondence problem. This problem involves locating corresponding points or features in two images. For this application, speed, reliability, and the ability to produce a dense depth map are of foremost importance. This work implemented a number of areabased matching algorithms to assess their suitability for this application. Area-based techniques were investigated because of their potential to yield dense depth maps, their amenability to fast hardware implementation, and their suitability to textured scenes such as rocks. In addition, two non-parametric transforms, the rank and census, were also compared. Both the rank and the census transforms were found to result in improved reliability of matching in the presence of radiometric distortion - significant since radiometric distortion is a problem which commonly arises in practice. In addition, they have low computational complexity, making them amenable to fast hardware implementation. Therefore, it was decided that matching algorithms using these transforms would be the subject of the remainder of the thesis. An analytic expression for the process of matching using the rank transform was derived from first principles. This work resulted in a number of important contributions. Firstly, the derivation process resulted in one constraint which must be satisfied for a correct match. This was termed the rank constraint. The theoretical derivation of this constraint is in contrast to the existing matching constraints which have little theoretical basis. Experimental work with actual and contrived stereo pairs has shown that the new constraint is capable of resolving ambiguous matches, thereby improving match reliability. Secondly, a novel matching algorithm incorporating the rank constraint has been proposed. This algorithm was tested using a number of stereo pairs. In all cases, the modified algorithm consistently resulted in an increased proportion of correct matches. Finally, the rank constraint was used to devise a new method for identifying regions of an image where the rank transform, and hence matching, are more susceptible to noise. The rank constraint was also incorporated into a new hybrid matching algorithm, where it was combined a number of other ideas. These included the use of an image pyramid for match prediction, and a method of edge localisation to improve match accuracy in the vicinity of edges. Experimental results obtained from the new algorithm showed that the algorithm is able to remove a large proportion of invalid matches, and improve match accuracy.
Resumo:
Stereo vision is a method of depth perception, in which depth information is inferred from two (or more) images of a scene, taken from different perspectives. Applications of stereo vision include aerial photogrammetry, autonomous vehicle guidance, robotics, industrial automation and stereomicroscopy. A key issue in stereo vision is that of image matching, or identifying corresponding points in a stereo pair. The difference in the positions of corresponding points in image coordinates is termed the parallax or disparity. When the orientation of the two cameras is known, corresponding points may be projected back to find the location of the original object point in world coordinates. Matching techniques are typically categorised according to the nature of the matching primitives they use and the matching strategy they employ. This report provides a detailed taxonomy of image matching techniques, including area based, transform based, feature based, phase based, hybrid, relaxation based, dynamic programming and object space methods. A number of area based matching metrics as well as the rank and census transforms were implemented, in order to investigate their suitability for a real-time stereo sensor for mining automation applications. The requirements of this sensor were speed, robustness, and the ability to produce a dense depth map. The Sum of Absolute Differences matching metric was the least computationally expensive; however, this metric was the most sensitive to radiometric distortion. Metrics such as the Zero Mean Sum of Absolute Differences and Normalised Cross Correlation were the most robust to this type of distortion but introduced additional computational complexity. The rank and census transforms were found to be robust to radiometric distortion, in addition to having low computational complexity. They are therefore prime candidates for a matching algorithm for a stereo sensor for real-time mining applications. A number of issues came to light during this investigation which may merit further work. These include devising a means to evaluate and compare disparity results of different matching algorithms, and finding a method of assigning a level of confidence to a match. Another issue of interest is the possibility of statistically combining the results of different matching algorithms, in order to improve robustness.
Resumo:
Gait energy images (GEIs) and its variants form the basis of many recent appearance-based gait recognition systems. The GEI combines good recognition performance with a simple implementation, though it suffers problems inherent to appearance-based approaches, such as being highly view dependent. In this paper, we extend the concept of the GEI to 3D, to create what we call the gait energy volume, or GEV. A basic GEV implementation is tested on the CMU MoBo database, showing improvements over both the GEI baseline and a fused multi-view GEI approach. We also demonstrate the efficacy of this approach on partial volume reconstructions created from frontal depth images, which can be more practically acquired, for example, in biometric portals implemented with stereo cameras, or other depth acquisition systems. Experiments on frontal depth images are evaluated on an in-house developed database captured using the Microsoft Kinect, and demonstrate the validity of the proposed approach.
Resumo:
We present an iterative hierarchical algorithm for multi-view stereo. The algorithm attempts to utilise as much contextual information as is available to compute highly accurate and robust depth maps. There are three novel aspects to the approach: 1) firstly we incrementally improve the depth fidelity as the algorithm progresses through the image pyramid; 2) secondly we show how to incorporate visual hull information (when available) to constrain depth searches; and 3) we show how to simultaneously enforce the consistency of the depth-map by continual comparison with neighbouring depth-maps. We show that this approach produces highly accurate depth-maps and, since it is essentially a local method, is both extremely fast and simple to implement.
Resumo:
The mining environment, being complex, irregular, and time-varying, presents a challenging prospect for stereo vision. For this application, speed, reliability, and the ability to produce a dense depth map are of foremost importance. This paper evaluates a number of matching techniques for possible use in a stereo vision sensor for mining automation applications. Area-based techniques have been investigated because they have the potential to yield dense maps, are amenable to fast hardware implementation, and are suited to textured scenes. In addition, two nonparametric transforms, namely, rank and census, have been investigated. Matching algorithms using these transforms were found to have a number of clear advantages, including reliability in the presence of radiometric distortion, low computational complexity, and amenability to hardware implementation.