956 results for 3D object recognition
Abstract:
In this thesis, several search methods for 3D data are analysed. A general overview is given of the field of Computer Vision, of the state of the art in acquisition sensors, and of some of the formats used to describe 3D data. 3D Object Recognition is then examined in more depth: the complete matching process between Local Features is described, with a focus on the detection phase for salient points. In particular, a Learned Keypoint detector based on machine learning techniques is analysed. The detector is illustrated with the implementation of two neighbour search algorithms: an exhaustive one (K-d tree) and an approximate one (Radial Search). Finally, experimental evaluations are reported in terms of the efficiency and speed of the detector implemented with the different search methods, showing an actual performance improvement without a considerable loss of accuracy when the approximate search is used.
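The two kinds of neighbour query named in the abstract can be illustrated with a minimal sketch, assuming a plain k-d tree over 3D points stored as tuples. The function names (`build_kdtree`, `nearest_neighbour`, `radius_search`) are hypothetical and this is not the thesis's implementation; in particular, the abstract does not say how the radius-based search is made approximate, so the radius query below is an ordinary exact fixed-radius lookup.

```python
import math


def build_kdtree(points, depth=0):
    """Build a k-d tree as nested tuples: (point, left_subtree, right_subtree)."""
    if not points:
        return None
    axis = depth % len(points[0])            # cycle through x, y, z
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                  # the median point becomes the node
    return (
        ordered[mid],
        build_kdtree(ordered[:mid], depth + 1),
        build_kdtree(ordered[mid + 1:], depth + 1),
    )


def nearest_neighbour(node, query, depth=0, best=None):
    """Return the stored point closest to `query` (exact search)."""
    if node is None:
        return best
    point, left, right = node
    axis = depth % len(query)
    if best is None or math.dist(point, query) < math.dist(best, query):
        best = point
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest_neighbour(near, query, depth + 1, best)
    # The far side can only help if the splitting plane is closer than the
    # best match found so far.
    if abs(diff) < math.dist(best, query):
        best = nearest_neighbour(far, query, depth + 1, best)
    return best


def radius_search(node, query, radius, depth=0, found=None):
    """Collect every stored point within `radius` of `query`."""
    if found is None:
        found = []
    if node is None:
        return found
    point, left, right = node
    axis = depth % len(query)
    if math.dist(point, query) <= radius:
        found.append(point)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    radius_search(near, query, radius, depth + 1, found)
    # Skip the far side when the splitting plane lies outside the ball.
    if abs(diff) <= radius:
        radius_search(far, query, radius, depth + 1, found)
    return found
```

Both queries prune subtrees whose splitting plane is too far from the query point, which is where the speed-up over a brute-force scan comes from; a keypoint detector would call one of them for every candidate point to gather its local neighbourhood.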
Abstract:
The suitable operation of mobile robots when providing Ambient Assisted Living (AAL) services calls for robust object recognition capabilities. Probabilistic Graphical Models (PGMs) have become the de facto choice in recognition systems aiming to efficiently exploit contextual relations among objects while also dealing with the uncertainty inherent to the robot workspace. However, these models can perform incoherently when operating over the long term outside the laboratory, e.g. while recognizing objects in peculiar configurations or belonging to new types. In this work we propose a recognition system that resorts to PGMs and common-sense knowledge, represented in the form of an ontology, to detect those inconsistencies and learn from them. The use of the ontology carries additional advantages, e.g. the possibility of verbalizing the robot's knowledge. A first demonstration of the system's capabilities has been carried out with very promising results.
Abstract:
We present a video-based system which interactively captures the geometry of a 3D object in the form of a point cloud, then recognizes and registers known objects in this point cloud in a matter of seconds (fig. 1). In order to achieve interactive speed, we exploit both efficient inference algorithms and parallel computation, often on a GPU. The system can be broken down into two distinct phases: geometry capture, and object inference. We now discuss these in further detail. © 2011 IEEE.
Abstract:
Existing 3D scanning cameras and microscopes on the market use digital or discrete sensors, such as CCD or CMOS sensors, for object detection applications. However, these combined systems are not fast enough for some application scenarios, since they require large data processing resources and can be cumbersome. There is therefore a clear interest in exploring the possibilities and performance of analogue sensors, such as arrays of position sensitive detectors (PSDs), with the final goal of integrating them into 3D scanning cameras or microscopes for object detection purposes. The work performed in this thesis deals with the implementation of prototype systems to explore object detection using amorphous silicon position sensors of 32 and 128 lines, which were produced in the clean room at CENIMAT-CEMOP. In the first phase of this work, the starting point was the fabrication of the sensors and the study of their static and dynamic specifications, as well as their conditioning, in relation to existing scientific and technological knowledge. Subsequently, relevant data acquisition and suitable signal processing electronics were assembled. Various prototypes were developed for the 32 and 128 array PSD sensors. Appropriate optical solutions were integrated with the constructed prototypes, allowing the required experiments to be carried out and the results presented in this thesis to be obtained. All control, data acquisition and 3D rendering platform software was implemented for the existing systems. These components were combined to form several integrated systems for the 32 and 128 line PSD 3D sensors. The performance of the 32 PSD array sensor and system was evaluated for machine vision applications, such as 3D object rendering, and for microscopy applications, such as micro object movement detection. Trials were also performed with the 128 array PSD sensor systems.
Sensor channel non-linearities of approximately 4 to 7% were obtained. Overall, the results show that a linear array of 32/128 1D line sensors based on amorphous silicon technology can be used to render 3D profiles of objects. The system and setup presented allow 3D rendering at high speeds and high frame rates. The minimum detail or gap that can be detected by the sensor system is approximately 350 μm with the current setup. It is also possible to render an object in 3D within a scanning angle range of 15° to 85° and to identify its real height as a function of the scanning angle and the image displacement distance on the sensor. Simple and less simple objects, such as a rubber and a plastic fork, can be rendered in 3D accurately and at high resolution using this sensor and system platform. The nip structure sensor system can detect primary and even derived colors of objects through proper adjustment of the integration time of the system and by combining white, red, green and blue (RGB) light sources. A mean colorimetric error of 25.7 was obtained. It is also possible to detect the movement of micrometer-scale objects using the 32 PSD sensor system. This kind of setup makes it possible to detect whether a micro object is moving, what its dimensions are and what its position is in two dimensions, even at high speeds. Results show a non-linearity of about 3% and a spatial resolution of < 2 µm.
Abstract:
Most psychophysical studies of object recognition have focussed on the recognition and representation of individual objects that subjects had previously been explicitly trained on. Correspondingly, modeling studies have often employed a 'grandmother'-type representation where the objects to be recognized were represented by individual units. However, objects in the natural world are commonly members of a class containing a number of visually similar objects, such as faces, for which physiology studies have provided support for a representation based on a sparse population code, which permits generalization from the learned exemplars to novel objects of that class. In this paper, we present results from psychophysical and modeling studies intended to investigate object recognition in natural ('continuous') object classes. In two experiments, subjects were trained to perform subordinate level discrimination in a continuous object class - images of computer-rendered cars - created using a 3D morphing system. By comparing the recognition performance of trained and untrained subjects we could estimate the effects of viewpoint-specific training and infer properties of the object class-specific representation learned as a result of training. We then compared the experimental findings to simulations, building on our recently presented HMAX model of object recognition in cortex, to investigate the computational properties of a population-based object class representation as outlined above. We find experimental evidence, supported by modeling results, that training builds a viewpoint- and class-specific representation that supplements a pre-existing representation with lower shape discriminability but possibly greater viewpoint invariance.
Abstract:
This paper describes a new method for reconstructing 3D surface points and a wireframe on the surface of a freeform object using a small number, e.g. 10, of 2D photographic images. The images are taken from different viewing directions by a perspective camera with full prior knowledge of the camera configurations. The reconstructed surface points are frontier points and the wireframe is a network of contour generators. Both are reconstructed by pairing apparent contours in the 2D images. Unlike previous works, we empirically demonstrate that if the viewing directions are uniformly distributed around the object's viewing sphere, then the reconstructed 3D points automatically cluster closely on highly curved parts of the surface and are widely spread on smooth or flat parts. The advantage of this property is that the reconstructed points along a surface or a contour generator are not under-sampled or under-represented, because surfaces or contours should be sampled or represented with more densely spaced points where their curvatures are high. The more complex the contour's shape, the greater the number of points required, and the greater the number of points automatically generated by the proposed method. Given that the viewing directions are uniformly distributed, the number and distribution of the reconstructed points depend on the shape or the curvature of the surface, regardless of the size of the surface or the size of the object. The unique pattern of the reconstructed points and contours may be used in 3D object recognition and measurement without computationally intensive full surface reconstruction. The results are obtained from both computer-generated and real objects. (C) 2007 Elsevier B.V. All rights reserved.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Adult monkeys (Macaca mulatta) with lesions of the hippocampal formation, perirhinal cortex, areas TH/TF, as well as controls were tested on tasks of object, spatial and contextual recognition memory. Using a visual paired-comparison (VPC) task, all experimental groups showed a lack of object recognition relative to controls, although this impairment emerged at 10 sec with perirhinal lesions, 30 sec with areas TH/TF lesions and 60 sec with hippocampal lesions. In contrast, only perirhinal lesions impaired performance on delayed nonmatching-to-sample (DNMS), another task of object recognition memory. All groups were tested on DNMS with distraction (dDNMS) to examine whether the use of active cognitive strategies during the delay period could enable good performance on DNMS in spite of impaired recognition memory (revealed by the VPC task). Distractors affected performance of animals with perirhinal lesions at the 10-sec delay (the only delay in which their DNMS performance was above chance). They did not affect performance of animals with areas TH/TF lesions. Hippocampectomized animals were impaired at the 600-sec delay (the only delay at which prevention of active strategies would likely affect their behavior). While lesions of areas TH/TF impaired spatial location memory and object-in-place memory, hippocampal lesions impaired only object-in-place memory. The pattern of results for perirhinal cortex lesions on the different task conditions indicated that this cortical area is not critical for spatial memory. Finally, all three lesions impaired contextual recognition memory processes. The pattern of impairment appeared to result from the formation of only a global representation of the object and background, and suggests that all three areas are recruited for associating information across sources.
These results support the view that (1) the perirhinal cortex maintains storage of information about an object and the context in which it is learned for a brief period of time, (2) areas TH/TF maintain information about spatial location and form associations between objects and their spatial relationship (a process that likely requires additional time) and (3) the hippocampal formation mediates associations between objects, their spatial relationship and the general context in which these associations are formed (an integrative function that requires additional time).
Abstract:
New low-cost sensors and free open libraries for 3D image processing are making important advances in robot vision applications possible, such as three-dimensional object recognition, semantic mapping, navigation and localization of robots, human detection and/or gesture recognition for human-machine interaction. In this paper, a novel method for recognizing and tracking the fingers of a human hand is presented. This method is based on point clouds from range images captured by an RGBD sensor. It works in real time and does not require visual markers, camera calibration or previous knowledge of the environment. Moreover, it works successfully even when multiple objects appear in the scene or when the ambient light changes. Furthermore, this method was designed to develop a human interface for remotely controlling domestic or industrial devices. In this paper, the method was tested by operating a robotic hand. First, the human hand was recognized and the fingers were detected. Second, the movement of the fingers was analysed and mapped so that it could be imitated by a robotic hand.
Abstract:
3D sensors provide valuable information for mobile robotic tasks like scene classification or object recognition, but these sensors often produce noisy data that make it impossible to apply classical keypoint detection and feature extraction techniques. Therefore, noise removal and downsampling have become essential steps in 3D data processing. In this work, we propose the use of a 3D filtering and down-sampling technique based on a Growing Neural Gas (GNG) network. The GNG method is able to deal with outliers present in the input data. These features allow 3D spaces to be represented, obtaining an induced Delaunay Triangulation of the input space. Experiments show how state-of-the-art keypoint detectors improve their performance when using the GNG output representation as input data. Descriptors extracted on the improved keypoints perform better matching in robotics applications such as 3D scene registration.
Abstract:
New low-cost sensors and new free open libraries for 3D image processing are enabling important advances in robot vision applications such as three-dimensional object recognition, semantic mapping, navigation and localization of robots, human detection and/or gesture recognition for human-machine interaction. In this paper, a method to recognize the human hand and to track the fingers is proposed. This new method is based on point clouds from RGBD range images. It does not require visual markers, camera calibration, environment knowledge or complex, expensive acquisition systems. Furthermore, this method has been implemented to create a human interface for moving a robot hand. The human hand is recognized and the movement of the fingers is analyzed. Afterwards, it is imitated by a Barrett hand, using communication events programmed in ROS.
Abstract:
Beyond the inherent technical challenges, current research into the three dimensional surface correspondence problem is hampered by a lack of uniform terminology, an abundance of application specific algorithms, and the absence of a consistent model for comparing existing approaches and developing new ones. This paper addresses these challenges by presenting a framework for analysing, comparing, developing, and implementing surface correspondence algorithms. The framework uses five distinct stages to establish correspondence between surfaces. It is general, encompassing a wide variety of existing techniques, and flexible, facilitating the synthesis of new correspondence algorithms. This paper presents a review of existing surface correspondence algorithms, and shows how they fit into the correspondence framework. It also shows how the framework can be used to analyse and compare existing algorithms and develop new algorithms using the framework's modular structure. Six algorithms, four existing and two new, are implemented using the framework. Each implemented algorithm is used to match a number of surface pairs. Results demonstrate that the correspondence framework implementations are faithful implementations of existing algorithms, and that powerful new surface correspondence algorithms can be created. (C) 2004 Elsevier Inc. All rights reserved.
Abstract:
In the visual perception literature, the recognition of faces has often been contrasted with that of non-face objects, in terms of differences with regard to the role of parts, part relations and holistic processing. However, recent evidence from developmental studies has begun to blur this sharp distinction. We review evidence for a protracted development of object recognition that is reminiscent of the well-documented slow maturation observed for faces. The prolonged development manifests itself in a retarded processing of metric part relations as opposed to that of individual parts and offers surprising parallels to developmental accounts of face recognition, even though the interpretation of the data is less clear with regard to holistic processing. We conclude that such results might indicate functional commonalities between the mechanisms underlying the recognition of faces and non-face objects, which are modulated by different task requirements in the two stimulus domains.
Abstract:
Biometrics is a field of study which pursues the association of a person's identity with his/her physiological or behavioral characteristics. As one aspect of biometrics, face recognition has attracted special attention because it is a natural and noninvasive means to identify individuals. Most of the previous studies in face recognition are based on two-dimensional (2D) intensity images. Face recognition based on 2D intensity images, however, is sensitive to changes in environment illumination and subject orientation, which affect the recognition results. With the development of three-dimensional (3D) scanners, 3D face recognition is being explored as an alternative to the traditional 2D methods for face recognition. This dissertation proposes a method in which the expression and the identity of a face are determined in an integrated fashion from 3D scans. In this framework, there is a front-end expression recognition module which sorts the incoming 3D face according to the expression detected in the 3D scans. Then, scans with neutral expressions are processed by a corresponding 3D neutral face recognition module. Alternatively, if a scan displays a non-neutral expression, e.g., a smiling expression, it will be routed to an appropriate specialized recognition module for smiling face recognition. The expression recognition method proposed in this dissertation is innovative in that it uses information from 3D scans to perform the classification task. A smiling face recognition module was developed, based on the statistical modeling of the variance between faces with a neutral expression and faces with a smiling expression. The proposed expression and face recognition framework was tested with a database containing 120 3D scans from 30 subjects (half are neutral faces and half are smiling faces). It is shown that the proposed framework achieves a recognition rate 10% higher than attempting the identification with only the neutral face recognition module.
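The routing structure described in this abstract, a front-end expression classifier followed by expression-specific recognition modules, can be sketched roughly as below. Everything here is a placeholder: `identify`, `classify_expression` and the `recognizers` mapping are hypothetical names, and the dissertation's actual classifiers and statistical models are not reproduced.

```python
def identify(scan, classify_expression, recognizers):
    """Route a 3D scan to the recognizer that matches its detected expression.

    scan                -- any object representing one 3D face scan
    classify_expression -- callable returning a label such as "neutral" or "smiling"
    recognizers         -- dict mapping an expression label to a recognition callable
    """
    expression = classify_expression(scan)       # front-end expression module
    try:
        recognizer = recognizers[expression]     # pick the specialised module
    except KeyError:
        raise ValueError(f"no recognition module for expression {expression!r}")
    return recognizer(scan)
```

Keeping the modules in a dictionary means a further expression could be supported by registering one more entry, without touching the dispatch logic.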
Abstract:
X-ray is a technology used for numerous applications in the medical field. The process of X-ray projection gives a two-dimensional (2D) grey-level texture from a three-dimensional (3D) object. Until now, no clear demonstration or correlation has positioned 2D texture analysis as a valid indirect evaluation of 3D microarchitecture. TBS is a new texture parameter based on the measure of the experimental variogram. TBS evaluates the variation between 2D image grey-levels. The aim of this study was to evaluate existing correlations between 3D bone microarchitecture parameters, evaluated from μCT reconstructions, and the TBS value, calculated on 2D projected images. 30 dried human cadaveric vertebrae were acquired on a micro-scanner (eXplorer Locus, GE) at an isotropic resolution of 93 μm. 3D vertebral body models were used. The following 3D microarchitecture parameters were used: bone volume fraction (BV/TV), trabecular thickness (TbTh), trabecular space (TbSp), trabecular number (TbN) and connectivity density (ConnD). 3D/2D projections were performed by taking into account the Beer-Lambert law at X-ray energies of 50, 100 and 150 keV. TBS was assessed on 2D projected images. Correlations between TBS and the 3D microarchitecture parameters were evaluated using linear regression analysis. A paired t-test was used to assess the X-ray energy effects on TBS. Multiple linear regressions (backward) were used to evaluate relationships between TBS and 3D microarchitecture parameters using a bootstrap process. BV/TV of the sample ranged from 18.5 to 37.6% with an average value of 28.8%. Correlation analysis showed that TBS was strongly correlated with ConnD (0.856≤r≤0.862; p<0.001) and with TbN (0.805≤r≤0.810; p<0.001), and negatively with TbSp (−0.714≤r≤−0.726; p<0.001), regardless of X-ray energy. Results show that lower TBS values are related to "degraded" microarchitecture, with low ConnD, low TbN and a high TbSp. The opposite is also true.
X-ray energy has no effect on TBS or on the correlations between TBS and the 3D microarchitecture parameters. In this study, we demonstrated that TBS was significantly correlated with the 3D microarchitecture parameters ConnD and TbN, and negatively with TbSp, regardless of the X-ray energy used. This article is part of a Special Issue entitled ECTS 2011. Disclosure of interest: None declared.
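To make the 3D-to-2D projection step concrete, here is a minimal sketch of a Beer-Lambert projection, I = I0 · exp(−μ · path length), applied along one axis of a voxel volume. It assumes a binary volume (1 = bone, 0 = air with negligible attenuation) and a single attenuation coefficient `mu_bone` per X-ray energy; the function name and the coefficient values are illustrative assumptions, and the abstract does not state how the study modelled attenuation in detail.

```python
import math


def project_beer_lambert(volume, mu_bone, voxel_cm, i0=1.0):
    """Project a binary 3D volume, indexed volume[z][y][x], along the z axis.

    Each output pixel is I = i0 * exp(-mu_bone * path_length), where
    path_length is the number of bone voxels on the ray times voxel_cm.
    mu_bone is a hypothetical linear attenuation coefficient in 1/cm.
    """
    depth = len(volume)
    rows = len(volume[0])
    cols = len(volume[0][0])
    image = []
    for y in range(rows):
        row = []
        for x in range(cols):
            # Count the bone voxels crossed by the ray through pixel (y, x).
            bone_voxels = sum(volume[z][y][x] for z in range(depth))
            row.append(i0 * math.exp(-mu_bone * bone_voxels * voxel_cm))
        image.append(row)
    return image
```

With the 93 μm isotropic voxels mentioned above, `voxel_cm` would be 0.0093; repeating the projection with a different `mu_bone` for each energy would give one grey-level image per energy on which a texture measure could then be computed.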