944 resultados para stereo vision,stereo matching,cuda,lisp,connection machine
Resumo:
An image processing observational technique for the stereoscopic reconstruction of the wave form of oceanic sea states is developed. The technique incorporates the enforcement of any given statistical wave law modeling the quasi Gaussianity of oceanic waves observed in nature. The problem is posed in a variational optimization framework, where the desired wave form is obtained as the minimizer of a cost functional that combines image observations, smoothness priors and a weak statistical constraint. The minimizer is obtained combining gradient descent and multigrid methods on the necessary optimality equations of the cost functional. Robust photometric error criteria and a spatial intensity compensation model are also developed to improve the performance of the presented image matching strategy. The weak statistical constraint is thoroughly evaluated in combination with other elements presented to reconstruct and enforce constraints on experimental stereo data, demonstrating the improvement in the estimation of the observed ocean surface.
Resumo:
Although 3DTV has led the evolution of television market, its delivery by broadcast networks is still small. Now, 3DTV transmis-sions are usually done by combining both views into one common frame (side by side) to be able to use standard HDTV transmission equipment. Today, orthogonal subsampling is mostly used, but other alternatives will appear soon. Here, different subsampling schemes for both progressive and interlaced 3DTV are considered. For each possible scheme, its pre-served frequency content is analyzed and a simple interpolation filter is designed. The analysis is carried out for progressive and interlaced video and the designed filters are applied on different sequences, showing the advantages and disadvantages of the different options
Resumo:
We develop a novel remote sensing technique for the observation of waves on the ocean surface. Our method infers the 3-D waveform and radiance of oceanic sea states via a variational stereo imagery formulation. In this setting, the shape and radiance of the wave surface are given by minimizers of a composite energy functional that combines a photometric matching term along with regularization terms involving the smoothness of the unknowns. The desired ocean surface shape and radiance are the solution of a system of coupled partial differential equations derived from the optimality conditions of the energy functional. The proposed method is naturally extended to study the spatiotemporal dynamics of ocean waves and applied to three sets of stereo video data. Statistical and spectral analysis are carried out. Our results provide evidence that the observed omnidirectional wavenumber spectrum S(k) decays as k-2.5 is in agreement with Zakharov's theory (1999). Furthermore, the 3-D spectrum of the reconstructed wave surface is exploited to estimate wave dispersion and currents.
Resumo:
Validating modern oceanographic theories using models produced through stereo computer vision principles has recently emerged. Space-time (4-D) models of the ocean surface may be generated by stacking a series of 3-D reconstructions independently generated for each time instant or, in a more robust manner, by simultaneously processing several snapshots coherently in a true ?4-D reconstruction.? However, the accuracy of these computer-vision-generated models is subject to the estimations of camera parameters, which may be corrupted under the influence of natural factors such as wind and vibrations. Therefore, removing the unpredictable errors of the camera parameters is necessary for an accurate reconstruction. In this paper, we propose a novel algorithm that can jointly perform a 4-D reconstruction as well as correct the camera parameter errors introduced by external factors. The technique is founded upon variational optimization methods to benefit from their numerous advantages: continuity of the estimated surface in space and time, robustness, and accuracy. The performance of the proposed algorithm is tested using synthetic data produced through computer graphics techniques, based on which the errors of the camera parameters arising from natural factors can be simulated.
Resumo:
El principal objetivo de este trabajo es proporcionar una solución en tiempo real basada en visión estéreo o monocular precisa y robusta para que un vehículo aéreo no tripulado (UAV) sea autónomo en varios tipos de aplicaciones UAV, especialmente en entornos abarrotados sin señal GPS. Este trabajo principalmente consiste en tres temas de investigación de UAV basados en técnicas de visión por computador: (I) visual tracking, proporciona soluciones efectivas para localizar visualmente objetos de interés estáticos o en movimiento durante el tiempo que dura el vuelo del UAV mediante una aproximación adaptativa online y una estrategia de múltiple resolución, de este modo superamos los problemas generados por las diferentes situaciones desafiantes, tales como cambios significativos de aspecto, iluminación del entorno variante, fondo del tracking embarullado, oclusión parcial o total de objetos, variaciones rápidas de posición y vibraciones mecánicas a bordo. La solución ha sido utilizada en aterrizajes autónomos, inspección de plataformas mar adentro o tracking de aviones en pleno vuelo para su detección y evasión; (II) odometría visual: proporciona una solución eficiente al UAV para estimar la posición con 6 grados de libertad (6D) usando únicamente la entrada de una cámara estéreo a bordo del UAV. Un método Semi-Global Blocking Matching (SGBM) eficiente basado en una estrategia grueso-a-fino ha sido implementada para una rápida y profunda estimación del plano. Además, la solución toma provecho eficazmente de la información 2D y 3D para estimar la posición 6D, resolviendo de esta manera la limitación de un punto de referencia fijo en la cámara estéreo. Una robusta aproximación volumétrica de mapping basada en el framework Octomap ha sido utilizada para reconstruir entornos cerrados y al aire libre bastante abarrotados en 3D con memoria y errores correlacionados espacialmente o temporalmente; (III) visual control, ofrece soluciones de control prácticas para la navegación de un UAV usando Fuzzy Logic Controller (FLC) con la estimación visual. Y el framework de Cross-Entropy Optimization (CEO) ha sido usado para optimizar el factor de escala y la función de pertenencia en FLC. Todas las soluciones basadas en visión en este trabajo han sido probadas en test reales. Y los conjuntos de datos de imágenes reales grabados en estos test o disponibles para la comunidad pública han sido utilizados para evaluar el rendimiento de estas soluciones basadas en visión con ground truth. Además, las soluciones de visión presentadas han sido comparadas con algoritmos de visión del estado del arte. Los test reales y los resultados de evaluación muestran que las soluciones basadas en visión proporcionadas han obtenido rendimientos en tiempo real precisos y robustos, o han alcanzado un mejor rendimiento que aquellos algoritmos del estado del arte. La estimación basada en visión ha ganado un rol muy importante en controlar un UAV típico para alcanzar autonomía en aplicaciones UAV. ABSTRACT The main objective of this dissertation is providing real-time accurate robust monocular or stereo vision-based solution for Unmanned Aerial Vehicle (UAV) to achieve the autonomy in various types of UAV applications, especially in GPS-denied dynamic cluttered environments. This dissertation mainly consists of three UAV research topics based on computer vision technique: (I) visual tracking, it supplys effective solutions to visually locate interesting static or moving object over time during UAV flight with on-line adaptivity approach and multiple-resolution strategy, thereby overcoming the problems generated by the different challenging situations, such as significant appearance change, variant surrounding illumination, cluttered tracking background, partial or full object occlusion, rapid pose variation and onboard mechanical vibration. The solutions have been utilized in autonomous landing, offshore floating platform inspection and midair aircraft tracking for sense-and-avoid; (II) visual odometry: it provides the efficient solution for UAV to estimate the 6 Degree-of-freedom (6D) pose using only the input of stereo camera onboard UAV. An efficient Semi-Global Blocking Matching (SGBM) method based on a coarse-to-fine strategy has been implemented for fast depth map estimation. In addition, the solution effectively takes advantage of both 2D and 3D information to estimate the 6D pose, thereby solving the limitation of a fixed small baseline in the stereo camera. A robust volumetric occupancy mapping approach based on the Octomap framework has been utilized to reconstruct indoor and outdoor large-scale cluttered environments in 3D with less temporally or spatially correlated measurement errors and memory; (III) visual control, it offers practical control solutions to navigate UAV using Fuzzy Logic Controller (FLC) with the visual estimation. And the Cross-Entropy Optimization (CEO) framework has been used to optimize the scaling factor and the membership function in FLC. All the vision-based solutions in this dissertation have been tested in real tests. And the real image datasets recorded from these tests or available from public community have been utilized to evaluate the performance of these vision-based solutions with ground truth. Additionally, the presented vision solutions have compared with the state-of-art visual algorithms. Real tests and evaluation results show that the provided vision-based solutions have obtained real-time accurate robust performances, or gained better performance than those state-of-art visual algorithms. The vision-based estimation has played a critically important role for controlling a typical UAV to achieve autonomy in the UAV application.
Resumo:
With luminance gratings, psychophysical thresholds for detecting a small increase in the contrast of a weak ‘pedestal’ grating are 2–3 times lower than for detection of a grating when the pedestal is absent. This is the ‘dipper effect’ – a reliable improvement whose interpretation remains controversial. Analogies between luminance and depth (disparity) processing have attracted interest in the existence of a ‘disparity dipper’. Are thresholds for disparity modulation (corrugated surfaces), facilitated by the presence of a weak disparity-modulated pedestal? We used a 14-bit greyscale to render small disparities accurately, and measured 2AFC discrimination thresholds for disparity modulation (0.3 or 0.6 c/deg) of a random texture at various pedestal levels. In the first experiment, a clear dipper was found. Thresholds were about 2× lower with weak pedestals than without. But here the phase of modulation (0 or 180 deg) was varied from trial to trial. In a noisy signal-detection framework, this creates uncertainty that is reduced by the pedestal, which thus improves performance. When the uncertainty was eliminated by keeping phase constant within sessions, the dipper effect was weak or absent. Monte Carlo simulations showed that the influence of uncertainty could account well for the results of both experiments. A corollary is that the visual depth response to small disparities is probably linear, with no threshold-like nonlinearity.
Resumo:
Measurement of detection and discrimination thresholds yields information about visual signal processing. For luminance contrast, we are 2 - 3 times more sensitive to a small increase in the contrast of a weak 'pedestal' grating, than when the pedestal is absent. This is the 'dipper effect' - a reliable improvement whose interpretation remains controversial. Analogies between luminance and depth (disparity) processing have attracted interest in the existence of a 'disparity dipper' - are thresholds for disparity, or disparity modulation (corrugated surfaces), facilitated by the presence of a weak pedestal? Lunn and Morgan (1997 Journal of the Optical Society of America A 14 360 - 371) found no dipper for disparity-modulated gratings, but technical limitations (8-bit greyscale) might have prevented the necessary measurement of very small disparity thresholds. We used a true 14-bit greyscale to render small disparities accurately, and measured 2AFC discrimination thresholds for disparity modulation (0.6 cycle deg-1) of a random texture at various pedestal levels. Which interval contained greater modulation of depth? In the first experiment, a clear dipper was found. Thresholds were about 2X1 lower with weak pedestals than without. But here the phase of modulation (0° or 180°) was randomised from trial to trial. In a noisy signal-detection framework, this creates uncertainty that is reduced by the pedestal, thus improving performance. When the uncertainty was eliminated by keeping phase constant within sessions, the dipper effect disappeared, confirming Lunn and Morgan's result. The absence of a dipper, coupled with shallow psychometric slopes, suggests that the visual response to small disparities is essentially linear, with no threshold-like nonlinearity.
Resumo:
Light occlusions are one of the most significant difficulties of photometric stereo methods. When three or more images are available without occlusion, the local surface orientation is overdetermined so that shape can be computed and the shadowed pixels can be discarded. In this paper, we look at the challenging case when only two images are available without occlusion, leading to a one degree of freedom ambiguity per pixel in the local orientation. We show that, in the presence of noise, integrability alone cannot resolve this ambiguity and reconstruct the geometry in the shadowed regions. As the problem is ill-posed in the presence of noise, we describe two regularization schemes that improve the numerical performance of the algorithm while preserving the data. Finally, the paper describes how this theory applies in the framework of color photometric stereo where one is restricted to only three images and light occlusions are common. Experiments on synthetic and real image sequences are presented.
Resumo:
This paper addresses the problem of obtaining complete, detailed reconstructions of textureless shiny objects. We present an algorithm which uses silhouettes of the object, as well as images obtained under changing illumination conditions. In contrast with previous photometric stereo techniques, ours is not limited to a single viewpoint but produces accurate reconstructions in full 3D. A number of images of the object are obtained from multiple viewpoints, under varying lighting conditions. Starting from the silhouettes, the algorithm recovers camera motion and constructs the object's visual hull. This is then used to recover the illumination and initialize a multiview photometric stereo scheme to obtain a closed surface reconstruction. There are two main contributions in this paper: First, we describe a robust technique to estimate light directions and intensities and, second, we introduce a novel formulation of photometric stereo which combines multiple viewpoints and, hence, allows closed surface reconstructions. The algorithm has been implemented as a practical model acquisition system. Here, a quantitative evaluation of the algorithm on synthetic data is presented together with complete reconstructions of challenging real objects. Finally, we show experimentally how, even in the case of highly textured objects, this technique can greatly improve on correspondence-based multiview stereo results.
Resumo:
This paper addresses the problem of obtaining 3d detailed reconstructions of human faces in real-time and with inexpensive hardware. We present an algorithm based on a monocular multi-spectral photometric-stereo setup. This system is known to capture high-detailed deforming 3d surfaces at high frame rates and without having to use any expensive hardware or synchronized light stage. However, the main challenge of such a setup is the calibration stage, which depends on the lights setup and how they interact with the specific material being captured, in this case, human faces. For this purpose we develop a self-calibration technique where the person being captured is asked to perform a rigid motion in front of the camera, maintaining a neutral expression. Rigidity constrains are then used to compute the head's motion with a structure-from-motion algorithm. Once the motion is obtained, a multi-view stereo algorithm reconstructs a coarse 3d model of the face. This coarse model is then used to estimate the lighting parameters with a stratified approach: In the first step we use a RANSAC search to identify purely diffuse points on the face and to simultaneously estimate this diffuse reflectance model. In the second step we apply non-linear optimization to fit a non-Lambertian reflectance model to the outliers of the previous step. The calibration procedure is validated with synthetic and real data.
Resumo:
We investigate the problem of obtaining a dense reconstruction in real-time, from a live video stream. In recent years, multi-view stereo (MVS) has received considerable attention and a number of methods have been proposed. However, most methods operate under the assumption of a relatively sparse set of still images as input and unlimited computation time. Video based MVS has received less attention despite the fact that video sequences offer significant benefits in terms of usability of MVS systems. In this paper we propose a novel video based MVS algorithm that is suitable for real-time, interactive 3d modeling with a hand-held camera. The key idea is a per-pixel, probabilistic depth estimation scheme that updates posterior depth distributions with every new frame. The current implementation is capable of updating 15 million distributions/s. We evaluate the proposed method against the state-of-the-art real-time MVS method and show improvement in terms of accuracy. © 2011 Elsevier B.V. All rights reserved.
Resumo:
Acquiring 3D shape from images is a classic problem in Computer Vision occupying researchers for at least 20 years. Only recently however have these ideas matured enough to provide highly accurate results. We present a complete algorithm to reconstruct 3D objects from images using the stereo correspondence cue. The technique can be described as a pipeline of four basic building blocks: camera calibration, image segmentation, photo-consistency estimation from images, and surface extraction from photo-consistency. In this Chapter we will put more emphasis on the latter two: namely how to extract geometric information from a set of photographs without explicit camera visibility, and how to combine different geometry estimates in an optimal way. © 2010 Springer-Verlag Berlin Heidelberg.
Resumo:
Photometric Stereo is a powerful image based 3D reconstruction technique that has recently been used to obtain very high quality reconstructions. However, in its classic form, Photometric Stereo suffers from two main limitations: Firstly, one needs to obtain images of the 3D scene under multiple different illuminations. As a result the 3D scene needs to remain static during illumination changes, which prohibits the reconstruction of deforming objects. Secondly, the images obtained must be from a single viewpoint. This leads to depth-map based 2.5 reconstructions, instead of full 3D surfaces. The aim of this Chapter is to show how these limitations can be alleviated, leading to the derivation of two practical 3D acquisition systems: The first one, based on the powerful Coloured Light Photometric Stereo method can be used to reconstruct moving objects such as cloth or human faces. The second, permits the complete 3D reconstruction of challenging objects such as porcelain vases. In addition to algorithmic details, the Chapter pays attention to practical issues such as setup calibration, detection and correction of self and cast shadows. We provide several evaluation experiments as well as reconstruction results. © 2010 Springer-Verlag Berlin Heidelberg.
Resumo:
Ricavare informazioni dalla realtà circostante è un obiettivo molto importante dell'informatica moderna, in modo da poter progettare robot, veicoli a guida autonoma, sistemi di riconoscimento e tanto altro. La computer vision è la parte dell'informatica che se ne occupa e sta sempre più prendendo piede. Per raggiungere tale obiettivo si utilizza una pipeline di visione stereo i cui passi di rettificazione e generazione di mappa di disparità sono oggetto di questa tesi. In particolare visto che questi passi sono spesso affidati a dispositivi hardware dedicati (come le FPGA) allora si ha la necessità di utilizzare algoritmi che siano portabili su questo tipo di tecnologia, dove le risorse sono molto minori. Questa tesi mostra come sia possibile utilizzare tecniche di approssimazione di questi algoritmi in modo da risparmiare risorse ma che che garantiscano comunque ottimi risultati.
Resumo:
Estereopsia define-se como a perceção de profundidade baseada na disparidade retiniana. A estereopsia global depende do processamento de estímulos de pontos aleatórios e a estereopsia local depende da perceção de contornos. O objetivo deste estudo é correlacionar três testes de estereopsia: TNO®, StereoTAB® e Fly Stereo Acuity Test® e verificar a sensibilidade e correlação entre eles, tendo o TNO® como gold standard. Incluíram-se 49 estudantes da Escola Superior de Tecnologia da Saúde de Lisboa (ESTeSL) entre os 18 e 26 anos. As variáveis ponto próximo de convergência (ppc), vergências, sintomatologia e correção ótica foram correlacionadas com os três testes. Os valores médios (desvios-padrão) de estereopsia foram: TNO® = 87,04’’ ±84,09’’; FlyTest® = 38,18’’ ±34,59’’; StereoTAB® = 124,89’’ ±137,38’’. Coeficiente de determinação: TNO® e StereoTAB® com R2=0,6 e TNO® e FlyTest® com R2=0,2. O coeficiente de correlação de Pearson mostra uma correlação positiva de entre o TNO® e o StereoTAB® (r=0,784 com α=0,01). O coeficiente de associação de Phi mostrou uma relação positiva forte entre o TNO® e StereoTAB® (Φ=0,848 com α=0,01). Na curva ROC, o StereoTAB® possui uma área sob a curva maior que o FlyTest®, apresentando valor de sensibilidade de 92,3% para uma especificidade de 94,4%, tornando-o num teste sensível e com bom poder discriminativo.