963 results for 3D object manipulation
Abstract:
A persistent issue of debate in the area of 3D object recognition concerns the nature of the experientially acquired object models in the primate visual system. One prominent proposal in this regard advocates the use of object-centered models, such as representations of the objects' 3D structures in a coordinate frame independent of the viewing parameters [Marr and Nishihara, 1978]. In contrast, another proposal suggests that the viewing parameters encountered during the learning phase might be inextricably linked to subsequent performance on a recognition task [Tarr and Pinker, 1989; Poggio and Edelman, 1990]. The 'object model', according to this idea, is simply a collection of the sample views encountered during training. Given that object-centered recognition strategies have the attractive feature of leading to viewpoint independence, they have garnered much of the research effort in the field of computational vision. Furthermore, since human recognition performance seems remarkably robust in the face of imaging variations [Ellis et al., 1989], it has often been implicitly assumed that the visual system employs an object-centered strategy. In the present study we examine this assumption more closely. Our experimental results with a class of novel 3D structures strongly suggest the use of a view-based strategy by the human visual system even when it has the opportunity to construct and use object-centered models. In fact, for our chosen class of objects, the results seem to support a stronger claim: 3D object recognition is 2D view-based.
Abstract:
Many 3D objects in the world around us are strongly constrained. For instance, not only cultural artifacts but also many natural objects are bilaterally symmetric. Theoretical arguments suggest, and psychophysical experiments confirm, that humans may be better at recognizing symmetric objects. The hypothesis of symmetry-induced virtual views, together with a network model that successfully accounts for human recognition of generic 3D objects, leads to predictions that we have verified with psychophysical experiments.
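The symmetry argument lends itself to a small sketch (illustrative only, not the authors' network model): for a bilaterally symmetric object, mirror-reflecting a 2D view of its feature points yields a legitimate "virtual view" of the same object from a different viewpoint, so a view-based recognizer can be trained on it at no extra acquisition cost.

    import numpy as np

    def virtual_view_from_symmetry(points_2d):
        """For 2D projections (N x 2) of feature points on a bilaterally
        symmetric object, the mirror-reflected view equals a projection of
        the same object from a 'virtual' viewpoint."""
        mirrored = points_2d.copy()
        mirrored[:, 0] = -mirrored[:, 0]   # reflect about the vertical image axis
        return mirrored

    # Hypothetical training view, augmented for a view-based recognizer.
    view = np.random.rand(10, 2)
    training_views = [view, virtual_view_from_symmetry(view)]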
Abstract:
Tabletop computers featuring multi-touch input and object tracking are a common platform for research on Tangible User Interfaces (also known as Tangible Interaction). However, such systems are confined to sensing activity on the tabletop surface, disregarding the rich and relatively unexplored interaction canvas above the tabletop. This dissertation contributes tCAD, a 3D modeling tool combining fiducial marker tracking, finger tracking and depth sensing in a single system. It presents the technical details of how these features were integrated, attesting to their viability through the design, development and early evaluation of the tCAD application. A key aspect of this work is a description of the interaction techniques enabled by merging tracked objects with direct user input on and above a table surface.
Abstract:
Somatosensory object discrimination has been shown to involve widespread cortical and subcortical structures in both cerebral hemispheres. In this study we aimed to identify the networks involved in tactile object manipulation by applying principal component analysis (PCA) to data from individual subjects. We expected to find more than one network.
Abstract:
This article presents a novel system and control strategy for the visual following of a 3D moving object by an Unmanned Aerial Vehicle (UAV). The strategy relies only on the visual information given by an adaptive tracking method based on color information, which, together with the dynamics of a camera fixed to a rotary-wing UAV, is used to develop an image-based visual servoing (IBVS) system. The system is focused on continuously following a 3D moving target object, keeping it at a fixed distance and centered on the image plane. The algorithm is validated in real flights in outdoor scenarios, showing the robustness of the proposed system against wind perturbations, illumination and weather changes, among others. The results obtained indicate that the proposed algorithm is suitable for complex control tasks, such as object following and pursuit or flying in formation, as well as for indoor navigation.
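As a rough illustration of the IBVS principle the system builds on (the generic textbook control law, not this paper's specific UAV controller): the commanded camera velocity is proportional to the error between current and desired image features, mapped through the pseudo-inverse of an interaction matrix.

    import numpy as np

    def ibvs_velocity(s, s_star, L, lam=0.5):
        """Classic IBVS law: camera velocity v = -lambda * pinv(L) @ (s - s*),
        where L is the interaction (image Jacobian) matrix."""
        return -lam * np.linalg.pinv(L) @ (s - s_star)

    # Toy example: one image point (x, y) in normalized coordinates at depth Z,
    # to be driven to the image center.
    x, y, Z = 0.2, -0.1, 1.0
    L = np.array([[-1/Z, 0, x/Z, x*y, -(1 + x**2), y],
                  [0, -1/Z, y/Z, 1 + y**2, -x*y, -x]])
    v = ibvs_velocity(np.array([x, y]), np.zeros(2), L)  # 6-DOF velocity command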
Abstract:
Multi-camera 3D tracking systems with overlapping cameras represent a powerful means for scene analysis, as they potentially allow greater robustness than monocular systems and provide useful 3D information about object location and movement. However, their performance relies on accurately calibrated camera networks, which is not a realistic assumption in real surveillance environments. Here, we introduce a multi-camera system for tracking the 3D position of a varying number of objects while simultaneously refining the calibration of the network of overlapping cameras. To this end, we propose a Bayesian framework that combines Particle Filtering for tracking with recursive Bayesian estimation methods by means of adapted transdimensional MCMC sampling. Additionally, the system has been designed to work on simple motion detection masks, making it suitable for camera networks with low transmission capabilities. Tests show that our approach performs successfully even when starting from clearly inaccurate camera calibrations, which would ruin conventional approaches.
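The tracking half of such a framework can be conveyed by a minimal bootstrap particle filter (an illustrative sketch; the paper's transdimensional MCMC machinery for a varying number of objects is not reproduced here):

    import numpy as np

    def particle_filter_step(particles, observe, motion_std=0.1):
        """One bootstrap-filter step for a 3D position: propagate particles
        with a random-walk motion model, weight them by an observation
        likelihood, and resample."""
        particles = particles + np.random.normal(0, motion_std, particles.shape)
        w = observe(particles)                       # likelihood of each particle
        w = w / w.sum()
        idx = np.random.choice(len(particles), len(particles), p=w)
        return particles[idx]

    # Toy likelihood: Gaussian around a hypothetical multi-view detection.
    detection = np.array([1.0, 2.0, 0.0])
    observe = lambda p: np.exp(-0.5 * np.sum((p - detection) ** 2, axis=1))
    cloud = particle_filter_step(np.random.rand(100, 3), observe)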
Abstract:
In this project, we propose the implementation of a 3D object recognition system optimized to operate under demanding time constraints. The system must be robust enough that objects can be recognized properly in poor lighting conditions and in cluttered scenes with significant levels of occlusion. An important requirement must be met: the system must exhibit reasonable performance running on a low-power mobile GPU computing platform (NVIDIA Jetson TK1) so that it can be integrated in mobile robotics systems, ambient intelligence or ambient assisted living applications. The acquisition system is based on the color and depth (RGB-D) data streams provided by low-cost 3D sensors such as the Microsoft Kinect or the PrimeSense Carmine. The algorithms and applications to be implemented and integrated span a broad range, from the acquisition, outlier removal and filtering of the input data, and the segmentation and characterization of regions of interest in the scene, to object recognition and pose estimation themselves. Furthermore, in order to validate the proposed system, we will create a 3D object dataset. It will be composed of a set of 3D models, reconstructed from common household objects, as well as a handful of test scenes in which those objects appear. The scenes will be characterized by different levels of occlusion, diverse distances from the elements to the sensor and variations in the pose of the target objects. The creation of this dataset implies the additional development of 3D data acquisition and 3D object reconstruction applications. The resulting system has many possible applications, ranging from mobile robot navigation and semantic scene labeling to human-computer interaction (HCI) systems based on visual information.
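One stage of the pipeline sketched above, filtering of the input data, can be illustrated with a generic voxel-grid downsampler (a common pre-processing step for RGB-D point clouds; not this project's actual implementation):

    import numpy as np

    def voxel_grid_downsample(points, leaf=0.01):
        """Reduce an (N x 3) point cloud to one centroid per voxel of side
        'leaf' -- a common filtering stage before segmentation and
        recognition."""
        keys = np.floor(points / leaf).astype(np.int64)
        _, inverse = np.unique(keys, axis=0, return_inverse=True)
        counts = np.bincount(inverse).astype(float)
        return np.stack(
            [np.bincount(inverse, weights=points[:, d]) / counts for d in range(3)],
            axis=1)

    cloud = np.random.rand(10000, 3)   # stand-in for a Kinect/Carmine depth frame
    print(voxel_grid_downsample(cloud).shape)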
Abstract:
There is evidence for the late development in humans of configural face and animal recognition. We show that the recognition of artificial three-dimensional (3D) objects from part configurations develops similarly late. We also demonstrate that the cross-modal integration of object information reinforces the development of configural recognition more than the intra-modal integration does. Multimodal object representations in the brain may therefore play a role in configural object recognition.
Abstract:
This work presents the design of a real-time system to model visual objects using self-organising networks. The architecture of the system addresses multiple computer vision tasks such as image segmentation, optimal parameter estimation and object representation. We first develop a framework for building non-rigid shapes using the growth mechanism of self-organising maps, and then define an optimal number of nodes, avoiding overfitting or underfitting the network, based on information-theoretic considerations. We present experimental results for hands and faces, and quantitatively evaluate the matching capabilities of the proposed method with the topographic product. The proposed method is easily extensible to 3D objects, as it offers similar features for efficient mesh reconstruction.
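The growth mechanism referred to above can be sketched roughly as follows (a generic step in the spirit of growing self-organising networks, not the authors' exact architecture): nodes move toward each input sample, and new nodes are periodically inserted where the accumulated quantization error is largest.

    import numpy as np

    def adapt(nodes, errors, sample, eps_b=0.2, eps_n=0.006):
        """Move the best-matching node strongly toward the sample (and all
        others weakly, standing in for topological neighbours), accumulating
        the winner's quantization error."""
        d = np.linalg.norm(nodes - sample, axis=1)
        w = int(np.argmin(d))
        errors[w] += d[w] ** 2
        step = eps_n * (sample - nodes)
        step[w] = eps_b * (sample - nodes[w])
        return nodes + step, errors

    def grow(nodes, errors):
        """Insert a new node beside the highest-error node, adding resolution
        where the shape is currently approximated worst."""
        q = int(np.argmax(errors))
        nodes = np.vstack([nodes, nodes[q] + 0.01 * np.random.randn(nodes.shape[1])])
        return nodes, np.append(errors, errors[q] / 2)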
Abstract:
During grasping and intelligent robotic manipulation tasks, the camera position relative to the scene changes dramatically because the camera is mounted on the robot's end effector, which moves to adapt its path and correctly grasp objects. For this reason, in this type of environment, a visual recognition system must be implemented that recognizes objects and obtains their positions in the scene automatically and autonomously. Furthermore, in industrial environments, all objects manipulated by robots are made of the same material and cannot be differentiated by features such as texture or color. In this work, first, a study and analysis of 3D recognition descriptors was carried out for application in these environments. Second, a visual recognition system built on a specific distributed client-server architecture is proposed for recognizing industrial objects that lack these appearance features. Our system has been implemented to overcome recognition problems that arise when objects can be distinguished only by their geometric shape and the simplicity of those shapes could create ambiguity. Finally, real tests are performed and illustrated to verify the satisfactory performance of the proposed system.
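To give a flavor of the purely geometric descriptors such a system must rely on, here is a classic shape-distribution descriptor (D2, Osada et al.; used for illustration, not necessarily among the descriptors the paper analyzes): it histograms distances between random surface point pairs, capturing shape with no texture or color at all.

    import numpy as np

    def d2_descriptor(points, n_pairs=5000, bins=32):
        """D2 shape distribution: a normalized histogram of Euclidean
        distances between random point pairs -- geometry only."""
        i = np.random.randint(len(points), size=n_pairs)
        j = np.random.randint(len(points), size=n_pairs)
        d = np.linalg.norm(points[i] - points[j], axis=1)
        hist, _ = np.histogram(d / d.max(), bins=bins, range=(0, 1))
        return hist / hist.sum()

    # Two texture-less parts could then be compared by histogram distance.
    a = d2_descriptor(np.random.rand(2000, 3))
    b = d2_descriptor(np.random.rand(2000, 3))
    similarity = 1 - 0.5 * np.abs(a - b).sum()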
Abstract:
Sensing techniques are important for solving the problems of uncertainty inherent in intelligent grasping tasks. The main goal here is to present a visual sensing system based on range imaging technology for robot manipulation of non-rigid objects. Our proposal provides a suitable visual perception system for complex grasping tasks, supporting a robot controller when other sensor systems, such as tactile and force, are not able to obtain data relevant to the grasping manipulation task. In particular, a new visual approach based on RGBD data was implemented to help a robot controller carry out intelligent manipulation tasks with flexible objects. The proposed method supervises the interaction between the grasped object and the robot hand in order to avoid poor contact between the fingertips and the object when neither force nor pressure data are available. The approach is also used to measure changes to the shape of an object's surfaces, allowing us to find deformations caused by inappropriate pressure applied by the hand's fingers. Tests were carried out for grasping tasks involving several flexible household objects with a multi-fingered robot hand working in real time. Our approach generates pulses from the deformation detection method and sends an event message to the robot controller when surface deformation is detected. In comparison with other methods, the results reveal that our visual pipeline requires no deformation models of objects or materials, and that the approach works well with both planar and 3D household objects in real time. In addition, our method does not depend on the pose of the robot hand, because the location of the reference system is computed by recognizing a pattern placed on the robot forearm. The presented experiments demonstrate that the proposed method achieves good monitoring of grasping tasks with several objects and different grasping configurations in indoor environments.
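The deformation-event idea reduces to a simple comparison (an illustrative sketch on depth maps; the paper's pipeline works on richer RGBD data): compare the object's current surface against a reference captured at grasp time and notify the controller when the deviation exceeds a threshold.

    import numpy as np

    def deformation_event(depth_ref, depth_now, mask, thresh_m=0.005):
        """Return True if the mean absolute depth change over the object's
        surface (given by 'mask') exceeds a deformation threshold in meters."""
        return np.abs(depth_now - depth_ref)[mask].mean() > thresh_m

    # Hypothetical frames: the controller would receive this as an event message.
    ref = np.ones((480, 640))
    cur = ref.copy()
    cur[190:250, 290:350] -= 0.02            # simulated dent in the surface
    mask = np.zeros_like(ref, dtype=bool)
    mask[180:260, 280:360] = True            # object's surface region
    if deformation_event(ref, cur, mask):
        print("deformation detected -> notify robot controller")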
Abstract:
Nowadays, the 3D scanning cameras and microscopes on the market use digital or discrete sensors, such as CCDs or CMOS, for object detection applications. However, these combined systems are not fast enough for some application scenarios, since they require large data processing resources and can be cumbersome. There is therefore a clear interest in exploring the possibilities and performance of analogue sensors such as arrays of position sensitive detectors, with the final goal of integrating them in 3D scanning cameras or microscopes for object detection purposes. The work performed in this thesis deals with the implementation of prototype systems to explore object detection using amorphous silicon position sensors of 32 and 128 lines, produced in the clean room at CENIMAT-CEMOP. The first phase of this work took as its starting point the fabrication of the sensors, the study of their static and dynamic specifications, and their conditioning in light of existing scientific and technological knowledge. Subsequently, the relevant data acquisition and suitable signal processing electronics were assembled. Various prototypes were developed for the 32 and 128 line PSD array sensors. Appropriate optical solutions were integrated to work together with the constructed prototypes, allowing the required experiments to be carried out and the results presented in this thesis to be achieved. All control, data acquisition and 3D rendering platform software was implemented for the existing systems. All these components were combined to form several integrated systems for the 32 and 128 line PSD 3D sensors. The performance of the 32 line PSD array sensor and system was evaluated for machine vision applications, such as 3D object rendering, as well as for microscopy applications, such as micro object movement detection. Trials were also performed with the 128 line PSD sensor systems. Sensor channel non-linearities of approximately 4 to 7% were obtained. The overall results show the possibility of using a linear array of 32/128 1D line sensors based on amorphous silicon technology to render 3D profiles of objects. The system and setup presented allow 3D rendering at high speeds and high frame rates. The minimum detail or gap that can be detected by the sensor system is approximately 350 μm with the current setup. It is also possible to render an object in 3D within a scanning angle range of 15º to 85º and to identify its real height as a function of the scanning angle and the image displacement distance on the sensor. Simple and not so simple objects, such as a rubber eraser and a plastic fork, can be rendered in 3D properly, accurately and at high resolution using this sensor and system platform. The n-i-p structure sensor system can detect primary and even derived colors of objects by proper adjustment of the system's integration time and by combining white, red, green and blue (RGB) light sources. A mean colorimetric error of 25.7 was obtained. It is also possible to detect the movement of micrometer-sized objects using the 32 line PSD sensor system. This kind of setup makes it possible to detect whether a micro object is moving, its dimensions, and its position in two dimensions, even at high speeds. Results show a non-linearity of about 3% and a spatial resolution of < 2 µm.
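The height recovery mentioned above follows, in its generic form, the standard sheet-of-light triangulation relation (a textbook approximation, not the thesis' calibrated geometry): a surface step of height h displaces the imaged light line by roughly d = M * h * tan(theta), so h can be recovered from the measured displacement d, the optical magnification M and the scanning angle theta.

    import math

    def height_from_displacement(d_mm, magnification, angle_deg):
        """Invert the generic sheet-of-light triangulation relation
        d = M * h * tan(theta) to recover the surface height h (mm)."""
        return d_mm / (magnification * math.tan(math.radians(angle_deg)))

    # Hypothetical numbers: 0.7 mm image displacement, 2x magnification, 45 deg.
    print(height_from_displacement(0.7, 2.0, 45.0))  # -> 0.35 mm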
Abstract:
According to much evidence, observing objects activates two types of information: structural properties, i.e., the visual information about the structural features of objects, and function knowledge, i.e., the conceptual information about their skilful use. Many studies so far have focused on the role played by these two kinds of information during object recognition and on their neural underpinnings. However, to the best of our knowledge, no study so far has focused on the differential activation of this information (structural vs. function) during object manipulation and conceptualization, depending on the age of participants and on the level of object familiarity (familiar vs. non-familiar). Therefore, the main aim of this dissertation was to investigate how actions and concepts related to familiar and non-familiar objects may vary across development. To pursue this aim, four studies were carried out. A first study led to the creation of the Familiar and Non-Familiar Stimuli Database, a set of everyday objects classified by Italian pre-schoolers, schoolchildren, and adults, useful for verifying how object knowledge is modulated by age and frequency of use. A parallel study demonstrated that factors such as sociocultural dynamics may affect the perception of objects. Specifically, data on familiarity, naming, function, use and frequency of use of the objects in the Familiar and Non-Familiar Stimuli Database were collected with Dutch and Croatian children and adults. The last two studies, on object interaction and language, provide further evidence in support of the literature on affordances and on the link between affordances and the cognitive process of language from a developmental point of view, supporting the perspective of situated cognition and emphasizing the crucial role of human experience.
Abstract:
X-ray is a technology used for numerous applications in the medical field. The process of X-ray projection gives a 2-dimensional (2D) grey-level texture from a 3-dimensional (3D) object. Until now, no clear demonstration has positioned 2D texture analysis as a valid indirect evaluation of 3D microarchitecture. TBS is a new texture parameter based on the measurement of the experimental variogram, evaluating the variation between the grey levels of a 2D image. The aim of this study was to evaluate the correlations between 3D bone microarchitecture parameters, evaluated from μCT reconstructions, and the TBS value, calculated on 2D projected images. 30 dried human cadaveric vertebrae were acquired on a micro-scanner (eXplore Locus, GE) at an isotropic resolution of 93 μm. 3D vertebral body models were used, with the following 3D microarchitecture parameters: bone volume fraction (BV/TV), trabecular thickness (TbTh), trabecular separation (TbSp), trabecular number (TbN) and connectivity density (ConnD). 3D-to-2D projection was performed taking into account the Beer-Lambert law at X-ray energies of 50, 100 and 150 keV. TBS was assessed on the 2D projected images. Correlations between TBS and the 3D microarchitecture parameters were evaluated using linear regression analysis; a paired t-test was used to assess the effect of X-ray energy on TBS; and multiple (backward) linear regressions were used to evaluate relationships between TBS and the 3D microarchitecture parameters using a bootstrap process. The BV/TV of the sample ranged from 18.5 to 37.6%, with an average value of 28.8%. Correlation analysis showed that TBS was strongly correlated with ConnD (0.856 ≤ r ≤ 0.862; p < 0.001) and with TbN (0.805 ≤ r ≤ 0.810; p < 0.001), and negatively with TbSp (−0.714 ≤ r ≤ −0.726; p < 0.001), regardless of X-ray energy. The results show that lower TBS values are related to "degraded" microarchitecture, with low ConnD, low TbN and high TbSp; the opposite is also true. X-ray energy had no effect on TBS nor on the correlations between TBS and the 3D microarchitecture parameters. In this study, we demonstrated that TBS was significantly correlated with the 3D microarchitecture parameters ConnD and TbN, and negatively with TbSp, no matter which X-ray energy was used. This article is part of a Special Issue entitled ECTS 2011. Disclosure of interest: None declared.
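The experimental variogram behind TBS can be sketched as follows (a generic one-directional formulation for illustration; the actual TBS algorithm is more elaborate): V(k) is the mean squared grey-level difference between pixels at offset k, and the texture score is derived from the slope of log V(k) versus log k at small offsets.

    import numpy as np

    def log_log_variogram_slope(img, max_offset=5):
        """Experimental variogram V(k) = mean squared grey-level difference
        at pixel offset k (horizontal only, for brevity); return the slope
        of log V(k) vs. log k at small offsets."""
        ks = np.arange(1, max_offset + 1)
        v = np.array([np.mean((img[:, k:] - img[:, :-k]) ** 2) for k in ks])
        slope, _ = np.polyfit(np.log(ks), np.log(v), 1)
        return slope

    img = np.random.rand(128, 128)       # stand-in for a 2D projected image
    print(log_log_variogram_slope(img))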
Abstract:
We develop a method for obtaining 3D polarimetric integral images from elemental images recorded in low-light illumination conditions. Since photon-counting images are very sparse, calculation of the Stokes parameters and the degree of polarization must be handled carefully. In our approach, polarimetric 3D integral images are generated using Maximum Likelihood Estimation and subsequently reconstructed by means of a Total Variation Denoising filter. In this way, the polarimetric results are comparable to those obtained in conventional illumination conditions. We also show that polarimetric information retrieved from photon-starved images can be used in 3D object recognition problems. To the best of our knowledge, this is the first report on 3D polarimetric photon-counting integral imaging.
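The reconstruction chain can be caricatured in a few lines (a simplified sketch assuming a Poisson photon-counting model; scikit-image's Chambolle filter stands in for the paper's Total Variation denoiser):

    import numpy as np
    from skimage.restoration import denoise_tv_chambolle

    def photon_counting_mle(irradiance, n_photons=1e4):
        """Simulate a photon-starved capture and its ML estimate: counts are
        Poisson with mean proportional to the normalized irradiance, and the
        MLE of the scene is counts / expected photon budget."""
        p = irradiance / irradiance.sum()
        counts = np.random.poisson(n_photons * p)
        return counts / n_photons            # ML estimate of p

    scene = np.random.rand(64, 64)           # stand-in for an elemental image
    estimate = photon_counting_mle(scene)
    denoised = denoise_tv_chambolle(estimate, weight=0.1)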