997 results for RGB-D image


Relevance:

80.00% 80.00%

Publisher:

Abstract:

RGB-D sensors have recently attracted a great deal of research in computer vision and robotics. Sensors of this kind, such as the Kinect, provide 3D data together with color information. However, their working range is limited to less than 10 meters, which makes them useless in some robotics applications, such as outdoor mapping. In these environments, 3D lasers, which work at ranges of 20-80 meters, are a better choice, but they do not usually provide color information. A simple 2D camera can be used to add color to the point cloud, but the camera and the laser must first be calibrated with respect to each other. In this paper we present a portable calibration system that calibrates any traditional camera with a 3D laser in order to assign color information to the 3D points obtained. We can thus exploit the precision of the laser while also using color information. Unlike other techniques that rely on a three-dimensional body of known dimensions in the calibration process, this system is highly portable because it uses small catadioptrics that can be easily placed in the environment. We use our calibration system in a 3D mapping system, including Simultaneous Localization and Mapping (SLAM), to obtain a colored 3D map that can be used in different tasks. We show that an additional problem arises: the information from 2D cameras changes with the lighting conditions, so when we merge 3D point clouds from two different views, several points in a given neighborhood can have different color information. A new method for color fusion is presented that yields correctly colored maps. The system is tested by applying it to 3D reconstruction.
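Once a camera and a laser are calibrated against each other, assigning color to a laser point typically amounts to projecting it into the image. The following is a minimal sketch of that step under a plain pinhole model, not the calibration system of the abstract. The function names, the intrinsics `fx`, `fy`, `cx`, `cy`, and the toy 3x3 image are all assumptions made for illustration.

```python
def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D laser point to pixel coordinates, or None if behind the camera."""
    # Rigid transform from the laser frame to the camera frame: p_cam = R * p + t.
    xc, yc, zc = (
        sum(r * p for r, p in zip(row, point)) + t
        for row, t in zip(rotation, translation)
    )
    if zc <= 0.0:
        return None
    # Pinhole projection with assumed intrinsics (no lens distortion).
    return fx * xc / zc + cx, fy * yc / zc + cy


def colorize(points, image, rotation, translation, fx, fy, cx, cy):
    """Pair each point with the color of the pixel it projects to (None if not visible)."""
    height, width = len(image), len(image[0])
    colored = []
    for point in points:
        pixel = project_point(point, rotation, translation, fx, fy, cx, cy)
        color = None
        if pixel is not None:
            u, v = round(pixel[0]), round(pixel[1])
            if 0 <= u < width and 0 <= v < height:
                color = image[v][u]
        colored.append((point, color))
    return colored


# Hypothetical setup: identity extrinsics and a 3x3 image whose pixel at
# row v, column u holds the color (v, u, 0).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image = [[(row, col, 0) for col in range(3)] for row in range(3)]
points = [(0.0, 0.0, 2.0), (1.0, 0.0, 1.0), (0.0, 0.0, -1.0), (5.0, 0.0, 1.0)]
colored = colorize(points, image, identity, (0.0, 0.0, 0.0), 1.0, 1.0, 1.0, 1.0)
```

Points behind the camera or outside the image keep a color of `None`, which is one simple way to mark laser points that the camera never observed.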

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Paper submitted to the 43rd International Symposium on Robotics (ISR2012), Taipei, Taiwan, Aug. 29-31, 2012.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Current RGB-D sensors provide a large amount of valuable information for mobile robotics tasks such as 3D map reconstruction, but storing and processing the incremental data provided by the different sensors over time quickly becomes unmanageable. In this work, we focus on 3D map representation and propose the use of the Growing Neural Gas (GNG) network as a model to represent 3D input data. The GNG method is able to represent the input data with a desired number of neurons, or resolution, while preserving the topology of the input space. Experiments show that the GNG method adapts to the input space better than other state-of-the-art 3D map representation methods.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Many applications, including object reconstruction, robot guidance, and scene mapping, require the registration of multiple views of a scene to generate a complete geometric and appearance model of it. In real situations, the transformations between views are unknown and must be estimated by expert inference. In the last few years, the emergence of low-cost depth-sensing cameras has strengthened research on this topic, motivating a plethora of new applications. Although these cameras have enough resolution and accuracy for many applications, some situations cannot be solved with general state-of-the-art registration methods because of the signal-to-noise ratio (SNR) and the resolution of the data provided. The problem of working with low-SNR data may, in general terms, appear in any 3D system, so novel solutions are needed. In this paper, we propose a method, μ-MAR, that performs both coarse and fine registration of sets of 3D points provided by low-cost depth-sensing cameras into a common coordinate system, although it is not restricted to these sensors. The method overcomes the noisy data problem by means of a model-based solution of multiplane registration. Specifically, it iteratively registers 3D markers composed of multiple planes extracted from points of multiple views of the scene. As the markers and the object of interest are static in the scene, the transformations obtained for the markers are applied to the object in order to reconstruct it. Experiments have been performed using synthetic and real data. The synthetic data allow a qualitative and quantitative evaluation by means of visual inspection and the Hausdorff distance, respectively. The real-data experiments show the performance of the proposal using data acquired by a Primesense Carmine RGB-D sensor. The method has been compared with several state-of-the-art methods.
The results show that μ-MAR registers objects with high accuracy in the presence of noisy data, outperforming the existing methods.
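For readers unfamiliar with the Hausdorff distance mentioned as the quantitative measure, the sketch below shows its standard definition on two small point sets. It is a brute-force illustration only. The point sets `reference` and `registered` are invented for the example and do not come from the paper.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest neighbour in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the worse of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical data: a reference model and a registered cloud with one
# slightly displaced point and one outlier two units away from the model.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
registered = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 2.0)]

distance = hausdorff(reference, registered)
```

Because the measure takes a maximum, a single outlier dominates it, which is why the two directed distances in this example differ so much.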

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This thesis studies a methodology for representing 3D subjects and their deformations in adverse situations. The study focuses on providing methods based on registration techniques to improve the data in situations where the sensor is working at the limit of its sensitivity. To this end, two methods are proposed to overcome the problems that can hinder the process under these conditions. First, a rigid registration based on model registration is presented, in which a model of 3D planar markers is used. This model is estimated using a proposed method that improves its quality by taking into account prior knowledge of the marker. To study the deformations, a framework is proposed that combines multiple spaces in a non-rigid registration technique. This proposal improves the quality of the alignment with a more robust matching process that makes use of all available input data. Moreover, this framework allows the registration of multiple spaces simultaneously, providing a more general technique. Specifically, it is instantiated using colour and location in the matching process for 3D location registration.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In recent years, graphics processing units (GPUs) have been increasingly used in general-purpose applications, setting aside the purpose for which they were created, which was none other than rendering computer graphics. This growth is due in part to the evolution these devices have undergone over this time, which has given them great computing power and extended their use from personal computers to large clusters. This fact, together with the proliferation of low-cost RGB-D sensors, has increased the number of vision applications that use this technology to solve problems, as well as to develop new applications. These improvements have been made not only in the hardware, that is, in the devices, but also in the software, with the appearance of new development tools that make these GPU devices easier to program. This new paradigm was coined General-Purpose computation on Graphics Processing Units (GPGPU). GPU devices are classified into different families according to their hardware characteristics. Each new family incorporates technological improvements that allow it to achieve better performance than the previous ones. However, to obtain optimal performance from a GPU device, it must be configured correctly before use. This configuration is determined by the values assigned to a set of device parameters. Therefore, many of the implementations that currently use GPU devices for dense registration of 3D point clouds could improve their performance with an optimal configuration of these parameters for the device in use.
For this reason, given the lack of a detailed study of how much GPU parameters affect the final performance of an implementation, carrying out this study was considered highly advisable. The study was performed not only with different GPU parameter configurations but also with different GPU device architectures. Its aim is to provide a decision tool that helps developers when implementing applications for GPU devices. One of the research fields in which the use of these technologies is most widespread is robotics, since traditionally in robotics, especially in mobile robotics, combinations of sensors of different kinds with a high economic cost, such as the laser, the sonar, or the contact sensor, were used to obtain data from the environment. These data were later used in computer vision applications with a very high computational cost. All of this cost, both the economic cost of the sensors and the computational cost, has been notably reduced thanks to these new technologies. One of the most widely used computer vision applications is point cloud registration. This process is, in general, the transformation of different point clouds into a known coordinate system. The data may come from photographs, from different sensors, and so on. It is used in fields such as computer vision, medical imaging, object recognition, and the analysis of satellite images and data. Registration is used to compare or integrate the data obtained in different measurements. This work reviews the state of the art of 3D registration methods. It also presents an in-depth study of the most widely used 3D registration method, Iterative Closest Point (ICP), and one of its best-known variants, Expectation-Maximization ICP (EMICP).
This study covers both their sequential implementation and their parallel implementation on GPU devices, focusing on how the different GPU parameter configurations affect their performance. As a result of this study, a proposal is also presented to improve the use of GPU device memory, allowing work with larger point clouds and reducing the problem of the memory limit imposed by the device. The behavior of the 3D registration methods used in this work depends to a great extent on the initialization of the problem. In this case, that initialization consists of correctly choosing the transformation matrix with which the algorithm starts. Because this aspect is very important in this type of algorithm, since it determines whether the solution is reached sooner or later or even never reached at all, this work presents a study of the space of transformations with the aim of characterizing it and making it easier to choose the initial transformation for these algorithms.
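The ICP loop discussed in this abstract alternates between matching each point to its closest neighbour and solving for the rigid transform that best aligns the matched pairs. The sketch below illustrates that loop in a deliberately simplified setting and is not the thesis implementation: it works in 2D rather than 3D, runs sequentially with brute-force matching, and uses a fixed iteration count. All function names and the test shape are assumptions for illustration.

```python
import math


def nearest(p, cloud):
    """Brute-force closest point of `cloud` to `p`."""
    return min(cloud, key=lambda q: math.dist(p, q))


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    cs = (sum(x for x, _ in src) / n, sum(y for _, y in src) / n)
    cd = (sum(x for x, _ in dst) / n, sum(y for _, y in dst) / n)
    dot = cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - cs[0], sy - cs[1]
        bx, by = dx - cd[0], dy - cd[1]
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)  # closed-form optimum in 2D
    c, s = math.cos(theta), math.sin(theta)
    tx = cd[0] - (c * cs[0] - s * cs[1])
    ty = cd[1] - (s * cs[0] + c * cs[1])
    return theta, tx, ty


def transform(points, theta, tx, ty):
    """Rotate by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(source, target, iterations=20):
    """Alternate closest-point matching and rigid alignment for a fixed number of steps."""
    current = list(source)
    for _ in range(iterations):
        matches = [nearest(p, target) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = transform(current, theta, tx, ty)
    return current


# Hypothetical test shape: an asymmetric "L", misaligned by a small motion.
target = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
source = transform(target, 0.1, 0.1, -0.05)
aligned = icp_2d(source, target)
residual = max(math.dist(p, q) for p, q in zip(aligned, target))
```

The example starts close to the solution, so the closest-point matches are already correct and the loop converges. With a poor starting pose the matches can be wrong and the loop may settle in a different alignment, which is the initialization dependence the abstract describes.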

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Since the beginnings of 3D computer vision, techniques that reduce the data to a tractable size while preserving the important aspects of the scene have been necessary. With the new low-cost RGB-D sensors, which provide a stream of color and 3D data at approximately 30 frames per second, this is becoming more relevant. Many applications that use these sensors need a preprocessing step to downsample the data in order to either reduce the processing time or improve the data (e.g., reducing noise or enhancing the important features). In this paper, we present a comparison of downsampling techniques based on different principles. Specifically, five downsampling methods are included: a bilinear-based method, a normal-based method, a color-based method, a combination of the normal- and color-based samplings, and a growing neural gas (GNG)-based approach. For the comparison, two different models acquired with the Blensor software have been used. Moreover, to evaluate the effect of downsampling in a real application, a 3D non-rigid registration is performed with the sampled data. From the experiments we can conclude that, depending on the purpose of the application, some kernels of the sampling methods can drastically improve the results. Bilinear- and GNG-based methods provide homogeneous point clouds, whereas color-based and normal-based methods provide datasets with a higher density of points in areas with specific features. In the non-rigid application, if a color-based sampled point cloud is used, it is possible to properly register two datasets in cases where intensity data are relevant in the model, and to outperform the results obtained when only a homogeneous sampling is used.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Autonomous piloting systems for quadcopters are currently being developed for navigation in outdoor spaces, where the GPS signal can be used to define navigation waypoints, position and altitude hold modes, returning home, among others. However, autonomous navigation in enclosed spaces without a global positioning system inside a room remains a challenging problem with no closed solution. Most solutions are based on expensive sensors, such as LIDAR, or on external positioning systems (e.g., Vicon, Optitrack). Some of these solutions offload the processing of sensor data and of the most demanding algorithms to computing systems outside the vehicle, which also removes the full autonomy intended for a vehicle of this kind. The aim of this thesis is therefore to prepare a small unmanned aerial system, namely a quadcopter, that integrates different modules enabling simultaneous localization and mapping in indoor spaces where the GPS signal is denied. To do so, it uses an RGB-D camera together with other internal and external sensors of the quadcopter, integrated into a system that processes vision-based positioning and that is intended, in the near future, to perform motion planning for navigation. The result of this work was an integrated architecture for the analysis of localization, mapping, and navigation modules, based on open, low-cost hardware and state-of-the-art frameworks available as open source. It was also possible to partially test some localization modules under certain test conditions and certain algorithm parameters. The mapping capability of the framework was also tested and approved. The resulting framework is ready for navigation, requiring only some adjustments and tests.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

To represent the local orientation and energy of a 1-D image signal, many models of early visual processing employ bandpass quadrature filters, formed by combining the original signal with its Hilbert transform. However, representations capable of estimating an image signal's 2-D phase have been largely ignored. Here, we consider 2-D phase representations using a method based upon the Riesz transform. For spatial images there exist two Riesz transformed signals and one original signal from which orientation, phase and energy may be represented as a vector in 3-D signal space. We show that these image properties may be represented by a Singular Value Decomposition (SVD) of the higher-order derivatives of the original and the Riesz transformed signals. We further show that the expected responses of even and odd symmetric filters from the Riesz transform may be represented by a single signal autocorrelation function, which is beneficial in simplifying Bayesian computations for spatial orientation. Importantly, the Riesz transform allows one to weight linearly across orientation using both symmetric and asymmetric filters to account for some perceptual phase distortions observed in image signals - notably one's perception of edge structure within plaid patterns whose component gratings are either equal or unequal in contrast. Finally, exploiting the benefits that arise from the Riesz definition of local energy as a scalar quantity, we demonstrate the utility of Riesz signal representations in estimating the spatial orientation of second-order image signals. We conclude that the Riesz transform may be employed as a general tool for 2-D visual pattern recognition by its virtue of representing phase, orientation and energy as orthogonal signal quantities.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Communication has become an essential function in our civilization. With the increasing demand for communication channels, it is now necessary to find ways to optimize the use of their bandwidth. One way to achieve this is by transforming the information before it is transmitted. This transformation can be performed by several techniques, one of the newest of which is the use of wavelets. Wavelet transformation refers to the act of breaking down a signal into components called details and trends by using small waveforms that have a zero average in the time domain. After this transformation, the data can be compressed by discarding the details and transmitting only the trends. At the receiving end, the trends are used to reconstruct the image. In this work, the wavelet used for the transformation of an image is selected from a library of available bases. The accuracy of the reconstruction after the details are discarded depends on the wavelets chosen from the wavelet basis library. The system developed in this thesis takes a 2-D image and decomposes it using a wavelet bank. A digital signal processor is used to achieve near real-time performance in this transformation task. A contribution of this thesis project is the development of a DSP-based test bed for the future development of new real-time wavelet transformation algorithms.
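The split into trends and details can be illustrated with the simplest wavelet, the Haar wavelet, applied to one row of pixel values. This is only a sketch of the general idea: the abstract does not say which bases its library contains, so the choice of Haar, the single decomposition level, and the sample row are all assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One level of the orthonormal Haar transform on an even-length sequence."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    trends = [(a + b) / SQRT2 for a, b in pairs]   # scaled local averages
    details = [(a - b) / SQRT2 for a, b in pairs]  # scaled local differences
    return trends, details


def haar_inverse(trends, details):
    """Rebuild the signal from its trends and details."""
    out = []
    for t, d in zip(trends, details):
        out.extend(((t + d) / SQRT2, (t - d) / SQRT2))
    return out


# Hypothetical row of pixel intensities.
row = [10.0, 12.0, 8.0, 8.0, 3.0, 5.0, 4.0, 4.0]
trends, details = haar_forward(row)

exact = haar_inverse(trends, details)                 # lossless round trip
approx = haar_inverse(trends, [0.0] * len(details))   # details discarded
```

Keeping only `trends` halves the amount of data sent. The reconstruction `approx` then replaces each pair of samples by its average, so it is exact wherever neighbouring samples were already equal and lossy elsewhere.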

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Retrieval, treatment, and disposal of high-level radioactive waste (HLW) is expected to cost between 100 and 300 billion dollars. The risks to workers, public health, and the environment are also a major area of concern for HLW. Visualization of the interface between settled solids and the optically opaque liquid is needed for retrieval of the waste from underground storage tanks. The profiling sonar selected for this research generates a 2-D image of the interface. Multiple experiments were performed to demonstrate the effectiveness of sonar in real-time monitoring of the interface inside HLW tanks. The first set of experiments demonstrated that object shapes could be identified even with 30% of solids entrained in the liquid, thereby mapping the interface. Simulation of the sonar system validated these results. The second set of experiments confirmed the sonar's ability to detect solids with a density similar to that of the immersion liquid. The third set of experiments determined the effects of nearby objects on image resolution. The final set of experiments proved the functional and chemical capabilities of the sonar in caustic solution.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Lung cancer is the leading cause of death among all cancers in Canada. The prognosis is generally poor, with a survival rate of around 15% after 5 years. Internal displacements of anatomical structures introduce uncertainty in the precision of radiation oncology treatments, which reduces their effectiveness. With this in mind, techniques such as radiosurgery and intensity-modulated radiotherapy (IMRT) aim to improve clinical outcomes by targeting the tumor more closely. This makes it possible to increase the dose received by cancerous tissue and to reduce the dose delivered to the surrounding healthy tissue. This project aims to better evaluate the actual dose received during a treatment when the anatomy is in motion. To this end, CyberKnife and IMRT plans are recalculated using a 4D Monte Carlo particle transport algorithm that allows dose accumulation in a deformable geometry. A simulation environment was developed to model these two modalities in order to compare standard and 4D dose distributions. The deformations in the patient are obtained using a deformable image registration (DIR) algorithm between the different respiratory phases generated by the 4D CT scan. This preserves a voxel-to-voxel correspondence between the reference geometry and the deformed ones. The DIR is computed using the ANTs ("Advanced Normalization Tools") suite and is based on diffeomorphisms. A modified version of DOSXYZnrc from the EGSnrc suite, defDOSXYZnrc, is used for 4D particle transport. The results are compared with standard planning in order to validate the current model, which is an approximation of true 4D dose accumulation.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper we present a convolutional neural network (CNN)-based model for human head pose estimation in low-resolution multi-modal RGB-D data. We pose the problem as one of classification of human gazing direction. We further fine-tune a regressor based on the learned deep classifier. Next we combine the two models (classification and regression) to estimate approximate regression confidence. We present state-of-the-art results in datasets that span the range of high-resolution human-robot interaction (close-up faces plus depth information) data to challenging low-resolution outdoor surveillance data. We build upon our robust head-pose estimation and further introduce a new visual attention model to recover interaction with the environment. Using this probabilistic model, we show that many higher-level scene understanding tasks, like human-human/scene interaction detection, can be achieved. Our solution runs in real time on commercial hardware.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Diffuse optical tomography imaging requires modeling the propagation of light in biological tissue for a given optical and geometric configuration. This is called the forward problem. A new approach based on the finite difference method is developed to numerically model, via the diffusion equation (DE), time-domain light propagation in a 3D inhomogeneous medium with irregular boundaries, for the case of intrinsic imaging, that is, imaging of the optical absorption and scattering parameters of a tissue. Finite elements, which are computationally heavy because they use unstructured meshes, are generally preferred, because finite differences cannot easily account for irregular boundaries. The use of the blocking-off method together with a 3D Sobel filter can in principle overcome these difficulties and yield equations that are fast to solve numerically with finite differences. An algorithm is developed in this work to implement this approach, apply it to various cases, and then validate it by comparing the results obtained with those of Monte Carlo simulations, which serve as the reference. The ultimate goal of the project is to image a small animal in three dimensions, which is why the propagation model is at the core of the image reconstruction algorithm. Obtaining images requires solving a large-scale inverse problem, and the algorithm is based on an objective function that is minimized iteratively using a gradient-based method. The objective function measures the discrepancy between the experimental measurements made on the subject and the predictions of these measurements obtained from the propagation model. One of the difficulties in this type of algorithm is obtaining the gradient. This is done using auxiliary (or adjoint) variables.
The goal is to develop and combine methods that allow the algorithm to converge as quickly as possible to optical properties that are as faithful as possible to reality, and that can exploit the temporal dependence of time-resolved measurements, which provide more information than any other type of measurement in diffuse optical tomography (DOT). Results illustrating the reconstruction of a complex medium such as a mouse are presented to demonstrate the potential of our approach.
