998 resultados para RGB-D data


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The recent emergence of low-cost RGB-D sensors has brought new opportunities for robotics by providing affordable devices that can provide synchronized images with both color and depth information. In this thesis, recent work on pose estimation utilizing RGBD sensors is reviewed. Also, a pose recognition system for rigid objects using RGB-D data is implemented. The implementation uses half-edge primitives extracted from the RGB-D images for pose estimation. The system is based on the probabilistic object representation framework by Detry et al., which utilizes Nonparametric Belief Propagation for pose inference. Experiments are performed on household objects to evaluate the performance and robustness of the system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis investigates interactive scene reconstruction and understanding using RGB-D data only. Indeed, we believe that depth cameras will still be in the near future a cheap and low-power 3D sensing alternative suitable for mobile devices too. Therefore, our contributions build on top of state-of-the-art approaches to achieve advances in three main challenging scenarios, namely mobile mapping, large scale surface reconstruction and semantic modeling. First, we will describe an effective approach dealing with Simultaneous Localization And Mapping (SLAM) on platforms with limited resources, such as a tablet device. Unlike previous methods, dense reconstruction is achieved by reprojection of RGB-D frames, while local consistency is maintained by deploying relative bundle adjustment principles. We will show quantitative results comparing our technique to the state-of-the-art as well as detailed reconstruction of various environments ranging from rooms to small apartments. Then, we will address large scale surface modeling from depth maps exploiting parallel GPU computing. We will develop a real-time camera tracking method based on the popular KinectFusion system and an online surface alignment technique capable of counteracting drift errors and closing small loops. We will show very high quality meshes outperforming existing methods on publicly available datasets as well as on data recorded with our RGB-D camera even in complete darkness. Finally, we will move to our Semantic Bundle Adjustment framework to effectively combine object detection and SLAM in a unified system. Though the mathematical framework we will describe does not restrict to a particular sensing technology, in the experimental section we will refer, again, only to RGB-D sensing. We will discuss successful implementations of our algorithm showing the benefit of a joint object detection, camera tracking and environment mapping.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this project, we propose the implementation of a 3D object recognition system which will be optimized to operate under demanding time constraints. The system must be robust so that objects can be recognized properly in poor light conditions and cluttered scenes with significant levels of occlusion. An important requirement must be met: the system must exhibit a reasonable performance running on a low power consumption mobile GPU computing platform (NVIDIA Jetson TK1) so that it can be integrated in mobile robotics systems, ambient intelligence or ambient assisted living applications. The acquisition system is based on the use of color and depth (RGB-D) data streams provided by low-cost 3D sensors like Microsoft Kinect or PrimeSense Carmine. The range of algorithms and applications to be implemented and integrated will be quite broad, ranging from the acquisition, outlier removal or filtering of the input data and the segmentation or characterization of regions of interest in the scene to the very object recognition and pose estimation. Furthermore, in order to validate the proposed system, we will create a 3D object dataset. It will be composed by a set of 3D models, reconstructed from common household objects, as well as a handful of test scenes in which those objects appear. The scenes will be characterized by different levels of occlusion, diverse distances from the elements to the sensor and variations on the pose of the target objects. The creation of this dataset implies the additional development of 3D data acquisition and 3D object reconstruction applications. The resulting system has many possible applications, ranging from mobile robot navigation and semantic scene labeling to human-computer interaction (HCI) systems based on visual information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Low cost RGB-D cameras such as the Microsoft’s Kinect or the Asus’s Xtion Pro are completely changing the computer vision world, as they are being successfully used in several applications and research areas. Depth data are particularly attractive and suitable for applications based on moving objects detection through foreground/background segmentation approaches; the RGB-D applications proposed in literature employ, in general, state of the art foreground/background segmentation techniques based on the depth information without taking into account the color information. The novel approach that we propose is based on a combination of classifiers that allows improving background subtraction accuracy with respect to state of the art algorithms by jointly considering color and depth data. In particular, the combination of classifiers is based on a weighted average that allows to adaptively modifying the support of each classifier in the ensemble by considering foreground detections in the previous frames and the depth and color edges. In this way, it is possible to reduce false detections due to critical issues that can not be tackled by the individual classifiers such as: shadows and illumination changes, color and depth camouflage, moved background objects and noisy depth measurements. Moreover, we propose, for the best of the author’s knowledge, the first publicly available RGB-D benchmark dataset with hand-labeled ground truth of several challenging scenarios to test background/foreground segmentation algorithms.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Image Based Visual Servoing (IBVS) is a robotic control scheme based on vision. This scheme uses only the visual information obtained from a camera to guide a robot from any robot pose to a desired one. However, IBVS requires the estimation of different parameters that cannot be obtained directly from the image. These parameters range from the intrinsic camera parameters (which can be obtained from a previous camera calibration), to the measured distance on the optical axis between the camera and visual features, it is the depth. This paper presents a comparative study of the performance of D-IBVS estimating the depth from three different ways using a low cost RGB-D sensor like Kinect. The visual servoing system has been developed over ROS (Robot Operating System), which is a meta-operating system for robots. The experiments prove that the computation of the depth value for each visual feature improves the system performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

For general home monitoring, a system should automatically interpret people’s actions. The system should be non-intrusive, and able to deal with a cluttered background, and loose clothes. An approach based on spatio-temporal local features and a Bag-of-Words (BoW) model is proposed for single-person action recognition from combined intensity and depth images. To restore the temporal structure lost in the traditional BoW method, a dynamic time alignment technique with temporal binning is applied in this work, which has not been previously implemented in the literature for human action recognition on depth imagery. A novel human action dataset with depth data has been created using two Microsoft Kinect sensors. The ReadingAct dataset contains 20 subjects and 19 actions for a total of 2340 videos. To investigate the effect of using depth images and the proposed method, testing was conducted on three depth datasets, and the proposed method was compared to traditional Bag-of-Words methods. Results showed that the proposed method improves recognition accuracy when adding depth to the conventional intensity data, and has advantages when dealing with long actions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An innovative background modeling technique that is able to accurately segment foreground regions in RGB-D imagery (RGB plus depth) has been presented in this paper. The technique is based on a Bayesian framework that efficiently fuses different sources of information to segment the foreground. In particular, the final segmentation is obtained by considering a prediction of the foreground regions, carried out by a novel Bayesian Network with a depth-based dynamic model, and, by considering two independent depth and color-based mixture of Gaussians background models. The efficient Bayesian combination of all these data reduces the noise and uncertainties introduced by the color and depth features and the corresponding models. As a result, more compact segmentations, and refined foreground object silhouettes are obtained. Experimental results with different databases suggest that the proposed technique outperforms existing state-of-the-art algorithms.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a method for the fast calculation of a robot’s egomotion using visual features. The method is part of a complete system for automatic map building and Simultaneous Location and Mapping (SLAM). The method uses optical flow to determine whether the robot has undergone a movement. If so, some visual features that do not satisfy several criteria are deleted, and then egomotion is calculated. Thus, the proposed method improves the efficiency of the whole process because not all the data is processed. We use a state-of-the-art algorithm (TORO) to rectify the map and solve the SLAM problem. Additionally, a study of different visual detectors and descriptors has been conducted to identify which of them are more suitable for the SLAM problem. Finally, a navigation method is described using the map obtained from the SLAM solution.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Registration of point clouds captured by depth sensors is an important task in 3D reconstruction applications based on computer vision. In many applications with strict performance requirements, the registration should be executed not only with precision, but also in the same frequency as data is acquired by the sensor. This thesis proposes theuse of the pyramidal sparse optical flow algorithm to incrementally register point clouds captured by RGB-D sensors (e.g. Microsoft Kinect) in real time. The accumulated errorinherent to the process is posteriorly minimized by utilizing a marker and pose graph optimization. Experimental results gathered by processing several RGB-D datasets validatethe system proposed by this thesis in visual odometry and simultaneous localization and mapping (SLAM) applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this work, image based estimation methods, also known as direct methods, are studied which avoid feature extraction and matching completely. Cost functions use raw pixels as measurements and the goal is to produce precise 3D pose and structure estimates. The cost functions presented minimize the sensor error, because measurements are not transformed or modified. In photometric camera pose estimation, 3D rotation and translation parameters are estimated by minimizing a sequence of image based cost functions, which are non-linear due to perspective projection and lens distortion. In image based structure refinement, on the other hand, 3D structure is refined using a number of additional views and an image based cost metric. Image based estimation methods are particularly useful in conditions where the Lambertian assumption holds, and the 3D points have constant color despite viewing angle. The goal is to improve image based estimation methods, and to produce computationally efficient methods which can be accomodated into real-time applications. The developed image-based 3D pose and structure estimation methods are finally demonstrated in practise in indoor 3D reconstruction use, and in a live augmented reality application.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[ES]Las tecnologías principales que se han utilizado son la visión por computador y los sensores de rango, es decir, las características visuales y la profundidad. Sin embargo, la aparición de sensores RGBD más asequibles, como Kinect, permite su aplicación en estos escenarios. Se aborda la utilización en entornos de interior de sensores RGBD para escenarios donde las condiciones de iluminación pueden ser variables. Se adopta una configuración cenital en el acceso a un espacio, para preservar la privacidad y facilitar la detección y seguimiento de los objetos salientes que aparecen en el escenario mediante técnicas de sustracción de fondo. Los objetos detectados son modelados, pudiendo ser descritos según las características de apariencia y geométricas como el área y volumen.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[EN]The re-identification problem has been commonly accomplished using appearance features based on salient points and color information. In this paper, we focus on the possibilities that simple geometric features obtained from depth images captured with RGB-D cameras may offer for the task, particularly working under severe illumination conditions. The results achieved for different sets of simple geometric features extracted in a top-view setup seem to provide useful descriptors for the re-identification task, which can be integrated in an ambient intelligent environment as part of a sensor network.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[EN]Re-identi fication is commonly accomplished using appearance features based on salient points and color information. In this paper, we make an study on the use of di fferent features exclusively obtained from depth images captured with RGB-D cameras. The results achieved, using simple geometric features extracted in a top-view setup, seem to provide useful descriptors for the re-identi fication task.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Viene proposto un porting su piattaforma mobile Android di un sistema SLAM (Simultaneous Localization And Mapping) chiamato SlamDunk. Il porting affronta problematiche di prestazioni e qualità delle ricostruzioni 3D ottenute, proponendo poi la soluzione ritenuta ottimale.