24 resultados para 3D point
Resumo:
3D sensors provides valuable information for mobile robotic tasks like scene classification or object recognition, but these sensors often produce noisy data that makes impossible applying classical keypoint detection and feature extraction techniques. Therefore, noise removal and downsampling have become essential steps in 3D data processing. In this work, we propose the use of a 3D filtering and down-sampling technique based on a Growing Neural Gas (GNG) network. GNG method is able to deal with outliers presents in the input data. These features allows to represent 3D spaces, obtaining an induced Delaunay Triangulation of the input space. Experiments show how the state-of-the-art keypoint detectors improve their performance using GNG output representation as input data. Descriptors extracted on improved keypoints perform better matching in robotics applications as 3D scene registration.
Resumo:
Objective: To evaluate two cases of intermittent exotropia (IX(T)) treated by vision therapy the efficacy of the treatment by complementing the clinical examination with a 3-D video-oculography to register and to evidence the potential applicability of this technology for such purpose. Methods: We report the binocular alignment changes occurring after vision therapy in a woman of 36 years with an IX(T) of 25 prism diopters (Δ) at far and 18 Δ at near and a child of 10 years with 8 Δ of IX(T) in primary position associated to 6 Δ of left eye hypotropia. Both patients presented good visual acuity with correction in both eyes. Instability of ocular deviation was evident by VOG analysis, revealing also the presence of vertical and torsional components. Binocular vision therapy was prescribed and performed including different types of vergence, accommodation, and consciousness of diplopia training. Results: After therapy, excellent ranges of fusional vergence and a “to-the-nose” near point of convergence were obtained. The 3-D VOG examination (Sensoro Motoric Instruments, Teltow, Germany) confirmed the compensation of the deviation with a high level of stability of binocular alignment. Significant improvement could be observed after therapy in the vertical and torsional components that were found to become more stable. Patients were very satisfied with the outcome obtained by vision therapy. Conclusion: 3D-VOG is a useful technique for providing an objective register of the compensation of the ocular deviation and the stability of the binocular alignment achieved after vision therapy in cases of IX(T), providing a detailed analysis of vertical and torsional improvements.
Resumo:
Objetivo: Evaluar la eficacia del tratamiento en 3 casos de exotropia intermitente (XT(i)) mediante ejercicios de terapia visual, completando la exploración clínica con Videooculografia-30 y evidenciar la potencial aplicabilidad de esta tecnología para dicho propósito. Métodos: Exponemos los cambios ocurridos tras ejercicios de terapia visual en una mujer de 36 años con XT(i) de -25 dioptrías prismáticas (dp) de lejos y 18 dp de cerca; Un niño de 10 años de edad con 8 dp de XT(i) en posición primaria, asociados a +6 dp de hipotropia izquierda; y un hombre de 63 años con XT(i) de 6 dp en posición primaria asociada a +7 dp de hipertropia derecha. Todos los pacientes presentaron buena agudeza visual corregida en ambos ojos. La inestabilidad de la desviación ocular se evidenció mediante análisis de VOG-30, revelando la presencia de components verticales y torsionales. Se realizaron ejercicios de terapia visual, incluyendo diferentes tipos de ejercicios de vergencias, acomodación y percepción de la diplopía. Resultados: Tras la terapia visual se obtuvieron excelentes rangos de vergencias fusionales y de punto próximo de convergencia («hasta la nariz»). El examen mediante VOG-3D (Sensoro Motoric lnstruments, Teltow, Germany) confirmó la compensación de la desviación con estabilidad del alineamiento ocular. Se observó una significativa mejora después de la terapia en los components verticals y torsionales, lo cuales se hicieron más estables. Los pacientes se mostraron muy satisfechos de los resultados obtenidos. Conclusión: La VOG-3D es una técnica útil para dotamos de un método objetivo de registro de la compensación y estabilidad de la desviación ocular después de realizar ejercicios de terapia visual en casos de XT(i), ofreciéndonos un detallado análisis de la mejoría de los components verticales y torsionales.
Resumo:
In this work, we propose the use of the neural gas (NG), a neural network that uses an unsupervised Competitive Hebbian Learning (CHL) rule, to develop a reverse engineering process. This is a simple and accurate method to reconstruct objects from point clouds obtained from multiple overlapping views using low-cost sensors. In contrast to other methods that may need several stages that include downsampling, noise filtering and many other tasks, the NG automatically obtains the 3D model of the scanned objects. To demonstrate the validity of our proposal we tested our method with several models and performed a study of the neural network parameterization computing the quality of representation and also comparing results with other neural methods like growing neural gas and Kohonen maps or classical methods like Voxel Grid. We also reconstructed models acquired by low cost sensors that can be used in virtual and augmented reality environments for redesign or manipulation purposes. Since the NG algorithm has a strong computational cost we propose its acceleration. We have redesigned and implemented the NG learning algorithm to fit it onto Graphics Processing Units using CUDA. A speed-up of 180× faster is obtained compared to the sequential CPU version.
Resumo:
Today, the requirement of professional skills to university students is constantly increasing in our society. In our opinion, the content offered in official degrees need to be nourished with different variables, enriching their global professional knowledge in a parallel way; that is why, in recent years, there is a great multiplicity of complementary courses at university. One of the most socially demanded technical requirements within the architectural, design or engineering field is the management of 3D drawing software, becoming an indispensable reality in these sectors. Thus, this specific training becomes essential over two-dimension traditional design, because the inclusion of great possibilities of spatial development that go beyond conventional orthographic projections (plans, sections or elevations), allowing modelling and rotation of the selected items from multiple angles and perspectives. Therefore, this paper analyzes the teaching methodology of a complementary course for those technicians in the construction industry interested in computer-aided design, using modelling (SketchupMake) and rendering programs (Kerkythea). The course is developed from the technician point of view, by learning computer management and its application to professional development from a more general to a more specific view through practical examples. The proposed methodology is based on the development of real examples in different professional environments such as rehabilitation, new constructions, opening projects or architectural design. This multidisciplinary contribution improves criticism of students in different areas, encouraging new learning strategies and the independent development of three-dimensional solutions. Thus, the practical implementation of new situations, even suggested by the students themselves, ensures active participation, saving time during the design process and the increase of effectiveness when generating elements which may be represented, moved or virtually tested. In conclusion, this teaching-learning methodology improves the skills and competencies of students to face the growing professional demands of society. After finishing the course, technicians not only improved their expertise in the field of drawing but they also enhanced their capacity for spatial vision; both essential qualities in these sectors that can be applied to their professional development with great success.
Resumo:
Durante los últimos años ha sido creciente el uso de las unidades de procesamiento gráfico, más conocidas como GPU (Graphic Processing Unit), en aplicaciones de propósito general, dejando a un lado el objetivo para el que fueron creadas y que no era otro que el renderizado de gráficos por computador. Este crecimiento se debe en parte a la evolución que han experimentado estos dispositivos durante este tiempo y que les ha dotado de gran potencia de cálculo, consiguiendo que su uso se extienda desde ordenadores personales a grandes cluster. Este hecho unido a la proliferación de sensores RGB-D de bajo coste ha hecho que crezca el número de aplicaciones de visión que hacen uso de esta tecnología para la resolución de problemas, así como también para el desarrollo de nuevas aplicaciones. Todas estas mejoras no solamente se han realizado en la parte hardware, es decir en los dispositivos, sino también en la parte software con la aparición de nuevas herramientas de desarrollo que facilitan la programación de estos dispositivos GPU. Este nuevo paradigma se acuñó como Computación de Propósito General sobre Unidades de Proceso Gráfico (General-Purpose computation on Graphics Processing Units, GPGPU). Los dispositivos GPU se clasifican en diferentes familias, en función de las distintas características hardware que poseen. Cada nueva familia que aparece incorpora nuevas mejoras tecnológicas que le permite conseguir mejor rendimiento que las anteriores. No obstante, para sacar un rendimiento óptimo a un dispositivo GPU es necesario configurarlo correctamente antes de usarlo. Esta configuración viene determinada por los valores asignados a una serie de parámetros del dispositivo. Por tanto, muchas de las implementaciones que hoy en día hacen uso de los dispositivos GPU para el registro denso de nubes de puntos 3D, podrían ver mejorado su rendimiento con una configuración óptima de dichos parámetros, en función del dispositivo utilizado. Es por ello que, ante la falta de un estudio detallado del grado de afectación de los parámetros GPU sobre el rendimiento final de una implementación, se consideró muy conveniente la realización de este estudio. Este estudio no sólo se realizó con distintas configuraciones de parámetros GPU, sino también con diferentes arquitecturas de dispositivos GPU. El objetivo de este estudio es proporcionar una herramienta de decisión que ayude a los desarrolladores a la hora implementar aplicaciones para dispositivos GPU. Uno de los campos de investigación en los que más prolifera el uso de estas tecnologías es el campo de la robótica ya que tradicionalmente en robótica, sobre todo en la robótica móvil, se utilizaban combinaciones de sensores de distinta naturaleza con un alto coste económico, como el láser, el sónar o el sensor de contacto, para obtener datos del entorno. Más tarde, estos datos eran utilizados en aplicaciones de visión por computador con un coste computacional muy alto. Todo este coste, tanto el económico de los sensores utilizados como el coste computacional, se ha visto reducido notablemente gracias a estas nuevas tecnologías. Dentro de las aplicaciones de visión por computador más utilizadas está el registro de nubes de puntos. Este proceso es, en general, la transformación de diferentes nubes de puntos a un sistema de coordenadas conocido. Los datos pueden proceder de fotografías, de diferentes sensores, etc. Se utiliza en diferentes campos como son la visión artificial, la imagen médica, el reconocimiento de objetos y el análisis de imágenes y datos de satélites. El registro se utiliza para poder comparar o integrar los datos obtenidos en diferentes mediciones. En este trabajo se realiza un repaso del estado del arte de los métodos de registro 3D. Al mismo tiempo, se presenta un profundo estudio sobre el método de registro 3D más utilizado, Iterative Closest Point (ICP), y una de sus variantes más conocidas, Expectation-Maximization ICP (EMICP). Este estudio contempla tanto su implementación secuencial como su implementación paralela en dispositivos GPU, centrándose en cómo afectan a su rendimiento las distintas configuraciones de parámetros GPU. Como consecuencia de este estudio, también se presenta una propuesta para mejorar el aprovechamiento de la memoria de los dispositivos GPU, permitiendo el trabajo con nubes de puntos más grandes, reduciendo el problema de la limitación de memoria impuesta por el dispositivo. El funcionamiento de los métodos de registro 3D utilizados en este trabajo depende en gran medida de la inicialización del problema. En este caso, esa inicialización del problema consiste en la correcta elección de la matriz de transformación con la que se iniciará el algoritmo. Debido a que este aspecto es muy importante en este tipo de algoritmos, ya que de él depende llegar antes o no a la solución o, incluso, no llegar nunca a la solución, en este trabajo se presenta un estudio sobre el espacio de transformaciones con el objetivo de caracterizarlo y facilitar la elección de la transformación inicial a utilizar en estos algoritmos.
Resumo:
Since the beginning of 3D computer vision problems, the use of techniques to reduce the data to make it treatable preserving the important aspects of the scene has been necessary. Currently, with the new low-cost RGB-D sensors, which provide a stream of color and 3D data of approximately 30 frames per second, this is getting more relevance. Many applications make use of these sensors and need a preprocessing to downsample the data in order to either reduce the processing time or improve the data (e.g., reducing noise or enhancing the important features). In this paper, we present a comparison of different downsampling techniques which are based on different principles. Concretely, five different downsampling methods are included: a bilinear-based method, a normal-based, a color-based, a combination of the normal and color-based samplings, and a growing neural gas (GNG)-based approach. For the comparison, two different models have been used acquired with the Blensor software. Moreover, to evaluate the effect of the downsampling in a real application, a 3D non-rigid registration is performed with the data sampled. From the experimentation we can conclude that depending on the purpose of the application some kernels of the sampling methods can improve drastically the results. Bilinear- and GNG-based methods provide homogeneous point clouds, but color-based and normal-based provide datasets with higher density of points in areas with specific features. In the non-rigid application, if a color-based sampled point cloud is used, it is possible to properly register two datasets for cases where intensity data are relevant in the model and outperform the results if only a homogeneous sampling is used.
Resumo:
The research described in this thesis was motivated by the need of a robust model capable of representing 3D data obtained with 3D sensors, which are inherently noisy. In addition, time constraints have to be considered as these sensors are capable of providing a 3D data stream in real time. This thesis proposed the use of Self-Organizing Maps (SOMs) as a 3D representation model. In particular, we proposed the use of the Growing Neural Gas (GNG) network, which has been successfully used for clustering, pattern recognition and topology representation of multi-dimensional data. Until now, Self-Organizing Maps have been primarily computed offline and their application in 3D data has mainly focused on free noise models, without considering time constraints. It is proposed a hardware implementation leveraging the computing power of modern GPUs, which takes advantage of a new paradigm coined as General-Purpose Computing on Graphics Processing Units (GPGPU). The proposed methods were applied to different problem and applications in the area of computer vision such as the recognition and localization of objects, visual surveillance or 3D reconstruction.
Resumo:
Nowadays, new computers generation provides a high performance that enables to build computationally expensive computer vision applications applied to mobile robotics. Building a map of the environment is a common task of a robot and is an essential part to allow the robots to move through these environments. Traditionally, mobile robots used a combination of several sensors from different technologies. Lasers, sonars and contact sensors have been typically used in any mobile robotic architecture, however color cameras are an important sensor due to we want the robots to use the same information that humans to sense and move through the different environments. Color cameras are cheap and flexible but a lot of work need to be done to give robots enough visual understanding of the scenes. Computer vision algorithms are computational complex problems but nowadays robots have access to different and powerful architectures that can be used for mobile robotics purposes. The advent of low-cost RGB-D sensors like Microsoft Kinect which provide 3D colored point clouds at high frame rates made the computer vision even more relevant in the mobile robotics field. The combination of visual and 3D data allows the systems to use both computer vision and 3D processing and therefore to be aware of more details of the surrounding environment. The research described in this thesis was motivated by the need of scene mapping. Being aware of the surrounding environment is a key feature in many mobile robotics applications from simple robotic navigation to complex surveillance applications. In addition, the acquisition of a 3D model of the scenes is useful in many areas as video games scene modeling where well-known places are reconstructed and added to game systems or advertising where once you get the 3D model of one room the system can add furniture pieces using augmented reality techniques. In this thesis we perform an experimental study of the state-of-the-art registration methods to find which one fits better to our scene mapping purposes. Different methods are tested and analyzed on different scene distributions of visual and geometry appearance. In addition, this thesis proposes two methods for 3d data compression and representation of 3D maps. Our 3D representation proposal is based on the use of Growing Neural Gas (GNG) method. This Self-Organizing Maps (SOMs) has been successfully used for clustering, pattern recognition and topology representation of various kind of data. Until now, Self-Organizing Maps have been primarily computed offline and their application in 3D data has mainly focused on free noise models without considering time constraints. Self-organising neural models have the ability to provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time consuming, specially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process in order to complete it in a predefined time. This thesis proposes a hardware implementation leveraging the computing power of modern GPUs which takes advantage of a new paradigm coined as General-Purpose Computing on Graphics Processing Units (GPGPU). Our proposed geometrical 3D compression method seeks to reduce the 3D information using plane detection as basic structure to compress the data. This is due to our target environments are man-made and therefore there are a lot of points that belong to a plane surface. Our proposed method is able to get good compression results in those man-made scenarios. The detected and compressed planes can be also used in other applications as surface reconstruction or plane-based registration algorithms. Finally, we have also demonstrated the goodness of the GPU technologies getting a high performance implementation of a CAD/CAM common technique called Virtual Digitizing.