980 resultados para Stereo Vision


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Humans can perceive three dimension, our world is three dimensional and it is becoming increasingly digital too. We have the need to capture and preserve our existence in digital means perhaps due to our own mortality. We have also the need to reproduce objects or create small identical objects to prototype, test or study them. Some objects have been lost through time and are only accessible through old photographs. With robust model generation from photographs we can use one of the biggest human data sets and reproduce real world objects digitally and physically with printers. What is the current state of development in three dimensional reconstruction through photographs both in the commercial world and in the open source world? And what tools are available for a developer to build his own reconstruction software? To answer these questions several pieces of software were tested, from full commercial software packages to open source small projects, including libraries aimed at computer vision. To bring to the real world the 3D models a 3D printer was built, tested and analyzed, its problems and weaknesses evaluated. Lastly using a computer vision library a small software with limited capabilities was developed.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Large efforts have been maden by the scientific community on tasks involving locomotion of mobile robots. To execute this kind of task, we must develop to the robot the ability of navigation through the environment in a safe way, that is, without collisions with the objects. In order to perform this, it is necessary to implement strategies that makes possible to detect obstacles. In this work, we deal with this problem by proposing a system that is able to collect sensory information and to estimate the possibility for obstacles to occur in the mobile robot path. Stereo cameras positioned in parallel to each other in a structure coupled to the robot are employed as the main sensory device, making possible the generation of a disparity map. Code optimizations and a strategy for data reduction and abstraction are applied to the images, resulting in a substantial gain in the execution time. This makes possible to the high level decision processes to execute obstacle deviation in real time. This system can be employed in situations where the robot is remotely operated, as well as in situations where it depends only on itself to generate trajectories (the autonomous case)

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This work introduces a new method for environment mapping with three-dimensional information from visual information for robotic accurate navigation. Many approaches of 3D mapping using occupancy grid typically requires high computacional effort to both build and store the map. We introduce an 2.5-D occupancy-elevation grid mapping, which is a discrete mapping approach, where each cell stores the occupancy probability, the height of the terrain at current place in the environment and the variance of this height. This 2.5-dimensional representation allows that a mobile robot to know whether a place in the environment is occupied by an obstacle and the height of this obstacle, thus, it can decide if is possible to traverse the obstacle. Sensorial informations necessary to construct the map is provided by a stereo vision system, which has been modeled with a robust probabilistic approach, considering the noise present in the stereo processing. The resulting maps favors the execution of tasks like decision making in the autonomous navigation, exploration, localization and path planning. Experiments carried out with a real mobile robots demonstrates that this proposed approach yields useful maps for robot autonomous navigation

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This work proposes a method to determine the depth of objects in a scene using a combination between stereo vision and self-calibration techniques. Determining the rel- ative distance between visualized objects and a robot, with a stereo head, it is possible to navigate in unknown environments. Stereo vision techniques supply a depth measure by the combination of two or more images from the same scene. To achieve a depth estimates of the in scene objects a reconstruction of this scene geometry is necessary. For such reconstruction the relationship between the three-dimensional world coordi- nates and the two-dimensional images coordinates is necessary. Through the achievement of the cameras intrinsic parameters it is possible to make this coordinates systems relationship. These parameters can be gotten through geometric camera calibration, which, generally is made by a correlation between image characteristics of a calibration pattern with know dimensions. The cameras self-calibration allows the achievement of their intrinsic parameters without using a known calibration pattern, being possible their calculation and alteration during the displacement of the robot in an unknown environment. In this work a self-calibration method based in the three-dimensional polar coordinates to represent image features is presented. This representation is determined by the relationship between images features and horizontal and vertical opening cameras angles. Using the polar coordinates it is possible to geometrically reconstruct the scene. Through the proposed techniques combination it is possible to calculate a scene objects depth estimate, allowing the robot navigation in an unknown environment

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This study aims to seek a more viable alternative for the calculation of differences in images of stereo vision, using a factor that reduces heel the amount of points that are considered on the captured image, and a network neural-based radial basis functions to interpolate the results. The objective to be achieved is to produce an approximate picture of disparities using algorithms with low computational cost, unlike the classical algorithms

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Pós-graduação em Ciência da Computação - IBILCE

Relevância:

60.00% 60.00%

Publicador:

Resumo:

[EN] In this paper we present some real problems which appear in computer vision which yields to nonlinear system of algebraic equations. We study the problem of camera calibration. Roughly speaking camera calibration consists in looking at the camera position in the 3- D world using as information the projection of a 3- D Scene in a 2-D plane (the photogram). The problem is quite different when we use a single view or several views (stereo vision) of the 3-D scene. We will show in this paper how these problems yields to nonlinear algebraic system of equations.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Visual correspondence is a key computer vision task that aims at identifying projections of the same 3D point into images taken either from different viewpoints or at different time instances. This task has been the subject of intense research activities in the last years in scenarios such as object recognition, motion detection, stereo vision, pattern matching, image registration. The approaches proposed in literature typically aim at improving the state of the art by increasing the reliability, the accuracy or the computational efficiency of visual correspondence algorithms. The research work carried out during the Ph.D. course and presented in this dissertation deals with three specific visual correspondence problems: fast pattern matching, stereo correspondence and robust image matching. The dissertation presents original contributions to the theory of visual correspondence, as well as applications dealing with 3D reconstruction and multi-view video surveillance.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present an algorithm for estimating dense image correspondences. Our versatile approach lends itself to various tasks typical for video post-processing, including image morphing, optical flow estimation, stereo rectification, disparity/depth reconstruction, and baseline adjustment. We incorporate recent advances in feature matching, energy minimization, stereo vision, and data clustering into our approach. At the core of our correspondence estimation we use Efficient Belief Propagation for energy minimization. While state-of-the-art algorithms only work on thumbnail-sized images, our novel feature downsampling scheme in combination with a simple, yet efficient data term compression, can cope with high-resolution data. The incorporation of SIFT (Scale-Invariant Feature Transform) features into data term computation further resolves matching ambiguities, making long-range correspondence estimation possible. We detect occluded areas by evaluating the correspondence symmetry, we further apply Geodesic matting to automatically determine plausible values in these regions.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Several recent works deal with 3D data in mobile robotic problems, e.g. mapping or egomotion. Data comes from any kind of sensor such as stereo vision systems, time of flight cameras or 3D lasers, providing a huge amount of unorganized 3D data. In this paper, we describe an efficient method to build complete 3D models from a Growing Neural Gas (GNG). The GNG is applied to the 3D raw data and it reduces both the subjacent error and the number of points, keeping the topology of the 3D data. The GNG output is then used in a 3D feature extraction method. We have performed a deep study in which we quantitatively show that the use of GNG improves the 3D feature extraction method. We also show that our method can be applied to any kind of 3D data. The 3D features obtained are used as input in an Iterative Closest Point (ICP)-like method to compute the 6DoF movement performed by a mobile robot. A comparison with standard ICP is performed, showing that the use of GNG improves the results. Final results of 3D mapping from the egomotion calculated are also shown.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A reliable perception of the real world is a key-feature for an autonomous vehicle and the Advanced Driver Assistance Systems (ADAS). Obstacles detection (OD) is one of the main components for the correct reconstruction of the dynamic world. Historical approaches based on stereo vision and other 3D perception technologies (e.g. LIDAR) have been adapted to the ADAS first and autonomous ground vehicles, after, providing excellent results. The obstacles detection is a very broad field and this domain counts a lot of works in the last years. In academic research, it has been clearly established the essential role of these systems to realize active safety systems for accident prevention, reflecting also the innovative systems introduced by industry. These systems need to accurately assess situational criticalities and simultaneously assess awareness of these criticalities by the driver; it requires that the obstacles detection algorithms must be reliable and accurate, providing: a real-time output, a stable and robust representation of the environment and an estimation independent from lighting and weather conditions. Initial systems relied on only one exteroceptive sensor (e.g. radar or laser for ACC and camera for LDW) in addition to proprioceptive sensors such as wheel speed and yaw rate sensors. But, current systems, such as ACC operating at the entire speed range or autonomous braking for collision avoidance, require the use of multiple sensors since individually they can not meet these requirements. It has led the community to move towards the use of a combination of them in order to exploit the benefits of each one. Pedestrians and vehicles detection are ones of the major thrusts in situational criticalities assessment, still remaining an active area of research. ADASs are the most prominent use case of pedestrians and vehicles detection. Vehicles should be equipped with sensing capabilities able to detect and act on objects in dangerous situations, where the driver would not be able to avoid a collision. A full ADAS or autonomous vehicle, with regard to pedestrians and vehicles, would not only include detection but also tracking, orientation, intent analysis, and collision prediction. The system detects obstacles using a probabilistic occupancy grid built from a multi-resolution disparity map. Obstacles classification is based on an AdaBoost SoftCascade trained on Aggregate Channel Features. A final stage of tracking and fusion guarantees stability and robustness to the result.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

L’automatisation de la détection et de l’identification des animaux est une tâche qui a de l’intérêt dans plusieurs domaines de recherche en biologie ainsi que dans le développement de systèmes de surveillance électronique. L’auteur présente un système de détection et d’identification basé sur la vision stéréo par ordinateur. Plusieurs critères sont utilisés pour identifier les animaux, mais l’accent a été mis sur l’analyse harmonique de la reconstruction en temps réel de la forme en 3D des animaux. Le résultat de l’analyse est comparé avec d’autres qui sont contenus dans une base évolutive de connaissances.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This thesis proposes a generic visual perception architecture for robotic clothes perception and manipulation. This proposed architecture is fully integrated with a stereo vision system and a dual-arm robot and is able to perform a number of autonomous laundering tasks. Clothes perception and manipulation is a novel research topic in robotics and has experienced rapid development in recent years. Compared to the task of perceiving and manipulating rigid objects, clothes perception and manipulation poses a greater challenge. This can be attributed to two reasons: firstly, deformable clothing requires precise (high-acuity) visual perception and dexterous manipulation; secondly, as clothing approximates a non-rigid 2-manifold in 3-space, that can adopt a quasi-infinite configuration space, the potential variability in the appearance of clothing items makes them difficult to understand, identify uniquely, and interact with by machine. From an applications perspective, and as part of EU CloPeMa project, the integrated visual perception architecture refines a pre-existing clothing manipulation pipeline by completing pre-wash clothes (category) sorting (using single-shot or interactive perception for garment categorisation and manipulation) and post-wash dual-arm flattening. To the best of the author’s knowledge, as investigated in this thesis, the autonomous clothing perception and manipulation solutions presented here were first proposed and reported by the author. All of the reported robot demonstrations in this work follow a perception-manipulation method- ology where visual and tactile feedback (in the form of surface wrinkledness captured by the high accuracy depth sensor i.e. CloPeMa stereo head or the predictive confidence modelled by Gaussian Processing) serve as the halting criteria in the flattening and sorting tasks, respectively. From scientific perspective, the proposed visual perception architecture addresses the above challenges by parsing and grouping 3D clothing configurations hierarchically from low-level curvatures, through mid-level surface shape representations (providing topological descriptions and 3D texture representations), to high-level semantic structures and statistical descriptions. A range of visual features such as Shape Index, Surface Topologies Analysis and Local Binary Patterns have been adapted within this work to parse clothing surfaces and textures and several novel features have been devised, including B-Spline Patches with Locality-Constrained Linear coding, and Topology Spatial Distance to describe and quantify generic landmarks (wrinkles and folds). The essence of this proposed architecture comprises 3D generic surface parsing and interpretation, which is critical to underpinning a number of laundering tasks and has the potential to be extended to other rigid and non-rigid object perception and manipulation tasks. The experimental results presented in this thesis demonstrate that: firstly, the proposed grasp- ing approach achieves on-average 84.7% accuracy; secondly, the proposed flattening approach is able to flatten towels, t-shirts and pants (shorts) within 9 iterations on-average; thirdly, the proposed clothes recognition pipeline can recognise clothes categories from highly wrinkled configurations and advances the state-of-the-art by 36% in terms of classification accuracy, achieving an 83.2% true-positive classification rate when discriminating between five categories of clothes; finally the Gaussian Process based interactive perception approach exhibits a substantial improvement over single-shot perception. Accordingly, this thesis has advanced the state-of-the-art of robot clothes perception and manipulation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Combination of signals from the two eyes is the gateway to stereo vision. To gain insight into binocular signal processing, we studied binocular summation for luminance-modulated gratings (L or LM) and contrast-modulated gratings (CM). We measured 2AFC detection thresholds for a signal grating (0.75 c/deg, 216msec) shown to one eye, both eyes, or both eyes out-of-phase. For LM and CM, the carrier noise was in both eyes, even when the signal was monocular. Mean binocular thresholds for luminance gratings (L) were 5.4dB better than monocular thresholds - close to perfect linear summation (6dB). For LM and CM the binocular advantage was again 5-6dB, even when the carrier noise was uncorrelated, anti-correlated, or at orthogonal orientations in the two eyes. Binocular combination for CM probably arises from summation of envelope responses, and not from summation of these conflicting carrier patterns. Antiphase signals produced no binocular advantage, but thresholds were about 1-3dB higher than monocular ones. This is not consistent with simple linear summation, which should give complete cancellation and unmeasurably high thresholds. We propose a three-channel model in which noisy monocular responses to the envelope are binocularly combined in a contrast-weighted sum, but also remain separately available to perception via a max operator. Vision selects the largest of the three responses. With in-phase gratings the binocular channel dominates, but antiphase gratings cancel in the binocular channel and the monocular channels mediate detection. The small antiphase disadvantage might be explained by a subtle influence of background responses on binocular and monocular detection.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

PURPOSE To investigate the cortical mechanisms that prevent diplopia in intermittent exotropia (X(T)) during binocular alignment (orthotropia). METHODS The authors studied 12 X(T) patients aged 5 to 22 years. Seventy-five percent had functional stereo vision with stereoacuity similar to that of 12 age-matched controls (0.2-3.7 min arc). Identical face images were presented to the two eyes for 400 ms. In one eye, the face was presented at the fovea; in the other, offset along the horizontal axis with up to 12° eccentricity. The task was to indicate whether one or two faces were perceived. RESULTS All X(T) patients showed normal diplopia when the nonfoveal face was presented to nasal hemiretina, though with a slightly larger fusional range than age-matched controls. However, 10 of 12 patients never experienced diplopia when the nonfoveal face was presented to temporal hemiretina (i.e., when the stimulus simulated exodeviation). Patients showed considerable variability when the single image was perceived. Some patients suppressed the temporal stimulus regardless of which eye viewed it, whereas others suppressed a particular eye even when it viewed the foveal stimulus. In two patients, the simulated exodeviation might have triggered a shift from normal to anomalous retinal correspondence. CONCLUSIONS Antidiplopic mechanisms in X(T) can be reliably triggered by purely retinal information during orthotropia, but the nature of these mechanisms varies between patients.