989 results for 3D camera
Abstract:
New low-cost sensors and free, open libraries for 3D image processing are making important advances in robot vision applications possible, such as three-dimensional object recognition, semantic mapping, navigation and localization of robots, and human detection and/or gesture recognition for human-machine interaction. In this paper, a novel method for recognizing and tracking the fingers of a human hand is presented. The method is based on point clouds from range images captured by an RGBD sensor. It works in real time and requires no visual markers, camera calibration, or prior knowledge of the environment. Moreover, it works successfully even when multiple objects appear in the scene or the ambient light changes. The method was designed as a human interface for remotely controlling domestic or industrial devices; in this paper it is tested by operating a robotic hand. First, the human hand is recognized and the fingers are detected. Second, the movement of the fingers is analysed and mapped so that it can be imitated by a robotic hand.
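The full pipeline is not reproduced here; a minimal sketch of a plausible first stage, isolating a hand-candidate cluster from an RGBD point cloud with Open3D, might look as follows (the file name, depth cutoff, and clustering parameters are illustrative assumptions):

```python
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("rgbd_frame.pcd")  # hypothetical RGBD capture

# Keep only points within arm's reach of the sensor (z is depth here).
pts = np.asarray(pcd.points)
pcd = pcd.select_by_index(np.where(pts[:, 2] < 0.8)[0])

# Cluster the remaining points; the hand is taken to be the cluster
# whose centroid is closest to the camera (an illustrative heuristic).
labels = np.array(pcd.cluster_dbscan(eps=0.02, min_points=50))
pts = np.asarray(pcd.points)
best = min(set(labels[labels >= 0]),
           key=lambda l: pts[labels == l][:, 2].mean())
hand = pcd.select_by_index(np.where(labels == best)[0])
```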
Abstract:
The use of 3D data in mobile robotics provides valuable information about the robot's environment. Traditionally, stereo cameras have been used as low-cost 3D sensors. However, their limited precision and the lack of texture on some surfaces suggest that other 3D sensors could be more suitable. In this work, we examine the use of two sensors: an infrared SR4000 and a Kinect camera. We combine the 3D data obtained by these cameras with features extracted from the corresponding 2D images, applying a Growing Neural Gas (GNG) network to the 3D data. The goal is to obtain a robust egomotion technique; the GNG network is used to reduce the camera error. To calculate the egomotion, we test two methods for 3D registration: one based on an iterative closest point algorithm, the other on random sample consensus. Finally, a simultaneous localization and mapping method is applied to the complete sequence to reduce the global error. The error from each sensor and the mapping results of the proposed method are examined.
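A minimal sketch of the frame-to-frame registration step using Open3D's ICP; a plain voxel-grid downsample stands in here for the paper's GNG-based data reduction, and the file names and parameters are illustrative:

```python
import open3d as o3d

source = o3d.io.read_point_cloud("frame_t0.pcd")  # hypothetical captures
target = o3d.io.read_point_cloud("frame_t1.pcd")

# The paper reduces/denoises the clouds with a GNG network; a plain
# voxel-grid downsample stands in for that step in this sketch.
source = source.voxel_down_sample(voxel_size=0.02)
target = target.voxel_down_sample(voxel_size=0.02)

# Point-to-point ICP with a 5 cm correspondence threshold.
reg = o3d.pipelines.registration.registration_icp(source, target, 0.05)
print(reg.transformation)  # 4x4 rigid transform: the egomotion estimate
```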
Abstract:
This paper presents a method for fast calculation of the egomotion performed by a robot, using visual features. The method is part of a complete system for automatic map building and Simultaneous Localization and Mapping (SLAM). It uses optical flow to determine whether the robot has moved. If so, visual features that do not satisfy several criteria (such as intersection and uniqueness) are discarded, and the egomotion is then calculated. We use a state-of-the-art algorithm (TORO) to rectify the map and solve the SLAM problem. The proposed method is more efficient than other current methods.
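A minimal sketch of the movement test, assuming OpenCV's dense Farneback flow and an illustrative threshold (the paper's feature-pruning criteria are not reproduced):

```python
import cv2
import numpy as np

prev_gray = cv2.cvtColor(cv2.imread("frame0.png"), cv2.COLOR_BGR2GRAY)
curr_gray = cv2.cvtColor(cv2.imread("frame1.png"), cv2.COLOR_BGR2GRAY)

# Dense optical flow between consecutive frames (Farneback).
flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

# Declare a movement when the mean flow magnitude exceeds a threshold
# (0.5 px is an illustrative value, not the paper's).
mean_magnitude = np.linalg.norm(flow, axis=2).mean()
robot_moved = mean_magnitude > 0.5
```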
Abstract:
During grasping and intelligent robotic manipulation tasks, the camera position relative to the scene changes dramatically because the camera is mounted on the robot's end effector, which moves to adapt its path and correctly grasp objects. For this reason, in this type of environment, a visual recognition system must be implemented that recognizes objects and obtains their positions in the scene automatically and autonomously. Furthermore, in industrial environments, all objects manipulated by robots are made of the same material and cannot be differentiated by features such as texture or color. In this work, first, a study and analysis of 3D recognition descriptors is carried out for application in these environments. Second, a visual recognition system built on a specific distributed client-server architecture is proposed for recognizing industrial objects that lack these appearance features. Our system is designed to overcome the recognition problems that arise when objects can be distinguished only by geometric shape and the simplicity of those shapes can create ambiguity. Finally, real tests are performed and illustrated to verify the satisfactory performance of the proposed system.
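The descriptors studied are not named in the abstract; as one concrete example of a geometry-only 3D descriptor of the kind such a system could use, FPFH features can be computed with Open3D (the file name and radii are illustrative):

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("industrial_part.pcd")  # hypothetical scan
pcd.estimate_normals(
    o3d.geometry.KDTreeSearchParamHybrid(radius=0.01, max_nn=30))

# FPFH encodes local surface geometry only (no texture or color),
# which suits the uniform, textureless parts described above.
fpfh = o3d.pipelines.registration.compute_fpfh_feature(
    pcd, o3d.geometry.KDTreeSearchParamHybrid(radius=0.025, max_nn=100))
print(fpfh.data.shape)  # a 33-dimensional histogram per point
```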
Abstract:
The ability to view and interact with 3D models has existed for a long time. However, vision-based 3D modeling has seen only limited success in applications, as it faces many technical challenges. Hand-held mobile devices have changed the way we interact with virtual reality environments. Their high mobility and technical features, such as inertial sensors, cameras and fast processors, are especially attractive for advancing the state of the art in virtual reality systems. Their ubiquity and fast Internet connections also open a path to distributed and collaborative development. However, this path has not been fully explored in many domains. VR systems for real-world engineering contexts are still difficult to use, especially when geographically dispersed engineering teams need to collaboratively visualize and review 3D CAD models. Another challenge is rendering these environments at the required interactive rates and with high fidelity. This document presents a mobile virtual reality system for visualizing, navigating and reviewing large-scale 3D CAD models, developed under the CEDAR (Collaborative Engineering Design and Review) project. It focuses on interaction using different navigation modes. The system uses the mobile device's inertial sensors and camera to allow users to navigate through large-scale models. IT professionals, architects, civil engineers and oil-industry experts took part in a qualitative assessment of the CEDAR system, in the form of direct user interaction with the prototypes and audio-recorded interviews about them. The lessons learned are valuable and are presented in this document. Subsequently, a quantitative study of the different navigation modes was carried out to determine which mode is best suited to a given situation.
Abstract:
Underwater video transects have become a common tool for quantitative analysis of the seafloor. However, a major difficulty remains in accurately determining the area surveyed, as underwater navigation can be unreliable and image scaling does not always compensate for distortions due to perspective and topography. Depending on the camera set-up and available instruments, different methods of surface measurement are applied, which makes it difficult to compare data obtained by different vehicles. 3-D modelling of the seafloor based on 2-D video data and a reference scale can be used to compute subtransect dimensions. Focussing on the length of the subtransect, the data obtained from 3-D models created with the software PhotoModeler Scanner are compared with those determined from underwater acoustic positioning (ultra-short baseline, USBL) and bottom tracking (Doppler velocity log, DVL). 3-D model building and scaling were successfully conducted on all three tested set-ups, and the distortion of the reference scales due to substrate roughness was identified as the main source of imprecision. Acoustic positioning was generally inaccurate, and bottom tracking was unreliable on rough terrain. Subtransect lengths assessed with PhotoModeler were on average 20% longer than those derived from acoustic positioning, owing to the higher spatial resolution and the inclusion of slope. On a high-relief wall, bottom tracking and 3-D modelling yielded similar results. At present, 3-D modelling is the most powerful, albeit the most time-consuming, method for accurately determining video subtransect dimensions.
Abstract:
This paper addresses the problem of obtaining detailed 3D reconstructions of human faces in real time and with inexpensive hardware. We present an algorithm based on a monocular multi-spectral photometric-stereo setup. Such a system is known to capture highly detailed deforming 3D surfaces at high frame rates without expensive hardware or a synchronized light stage. However, the main challenge of such a setup is the calibration stage, which depends on the lighting setup and how the lights interact with the specific material being captured, in this case human faces. For this purpose we develop a self-calibration technique in which the person being captured is asked to perform a rigid motion in front of the camera while maintaining a neutral expression. Rigidity constraints are then used to compute the head's motion with a structure-from-motion algorithm. Once the motion is obtained, a multi-view stereo algorithm reconstructs a coarse 3D model of the face. This coarse model is then used to estimate the lighting parameters with a stratified approach: in the first step we use a RANSAC search to identify purely diffuse points on the face and simultaneously estimate the diffuse reflectance model; in the second step we apply non-linear optimization to fit a non-Lambertian reflectance model to the outliers of the previous step. The calibration procedure is validated with synthetic and real data.
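A minimal sketch of the RANSAC step, under a Lambertian model I ≈ n·v in which v folds together the light direction and diffuse albedo for one spectral channel; the normals come from the coarse model, and the tolerance and iteration count are illustrative:

```python
import numpy as np

def ransac_diffuse_light(normals, intensities, iters=500, tol=0.05):
    """Fit I ~= n.v robustly; returns v and an inlier (diffuse) mask.

    normals: (N, 3) unit normals from the coarse mesh
    intensities: (N,) observed intensities in one spectral channel
    """
    best_v, best_inliers = None, np.zeros(len(normals), bool)
    rng = np.random.default_rng(0)
    for _ in range(iters):
        idx = rng.choice(len(normals), 3, replace=False)
        try:
            v = np.linalg.solve(normals[idx], intensities[idx])
        except np.linalg.LinAlgError:
            continue  # degenerate sample of normals
        inliers = np.abs(normals @ v - intensities) < tol
        if inliers.sum() > best_inliers.sum():
            best_v, best_inliers = v, inliers
    # Outliers of the best model feed the non-Lambertian fit described above.
    return best_v, best_inliers
```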
Abstract:
This study extends previous research on intervertebral motion registration by means of 2D dynamic fluoroscopy to obtain a more comprehensive 3D description of vertebral kinematics. The problem of estimating the 3D rigid pose of a CT volume of a vertebra from its 2D X-ray fluoroscopy projection is addressed. 2D-3D registration is obtained by maximising a measure of similarity between Digitally Reconstructed Radiographs (obtained from the CT volume) and the real fluoroscopic projection. X-ray energy correction was performed. To assess the method, a calibration model was realised: a dry sheep vertebra was rigidly fixed to a frame of reference including metallic markers. An accurate measurement of the 3D orientation was obtained via single-camera calibration of the markers and held as the true 3D vertebra position; the vertebra's 3D pose was then estimated and the results compared. Error analysis revealed accuracy of the order of 0.1 degree for the rotation angles, of about 1 mm for displacements parallel to the fluoroscopic plane, and of the order of 10 mm for the orthogonal displacement.
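The registration loop can be sketched as a derivative-free optimization of a similarity measure; render_drr, ct_volume and fluoro below are hypothetical stand-ins for the DRR renderer, the CT data and the fluoroscopic image, none of which come from the paper:

```python
import numpy as np
from scipy.optimize import minimize

def ncc(a, b):
    """Normalized cross-correlation between two images."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return (a * b).mean()

def cost(pose, ct_volume, fluoro):
    # render_drr is a hypothetical DRR renderer taking a 6-vector pose
    # (3 rotations, 3 translations) and returning a projection image.
    drr = render_drr(ct_volume, pose)
    return -ncc(drr, fluoro)

# With ct_volume and fluoro loaded, a derivative-free optimizer suits
# the rendered, non-smooth cost:
# best = minimize(cost, x0=np.zeros(6), args=(ct_volume, fluoro),
#                 method="Powell")
```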
Abstract:
This thesis concerns the extension of a software framework for detecting and tracking people in a scene captured by a stereoscopic camera. First, the need for offline manual calibration of the system is removed by exploiting algorithms that identify, from a frame acquired by the camera, the plane on which the tracked subjects move. In addition, a deep-learning-based software module is introduced to improve tracking precision. This component, which can detect the heads present in a frame, makes it possible to restrict the analysed data to the neighbourhood of a person's actual position, excluding objects that the tracking algorithm would otherwise tend to identify as people.
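A minimal sketch of the floor-detection idea using Open3D's RANSAC plane segmentation (the thesis's own algorithm may differ; the file name and thresholds are illustrative):

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("stereo_frame.pcd")  # hypothetical frame

# RANSAC plane fit; the dominant plane is taken to be the floor the
# tracked subjects move on, removing the need for manual calibration.
plane_model, inliers = pcd.segment_plane(distance_threshold=0.02,
                                         ransac_n=3,
                                         num_iterations=1000)
a, b, c, d = plane_model  # plane equation: ax + by + cz + d = 0
```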
Abstract:
This work explores the use of statistical methods in describing and estimating camera poses, as well as the information feedback loop between camera pose and object detection. Surging development in robotics and computer vision has pushed the need for algorithms that infer, understand, and utilize information about the position and orientation of the sensor platforms when observing and/or interacting with their environment.
The first contribution of this thesis is the development of a set of statistical tools for representing and estimating the uncertainty in object poses. A distribution for representing the joint uncertainty over multiple object positions and orientations is described, called the mirrored normal-Bingham distribution. This distribution generalizes both the normal distribution in Euclidean space, and the Bingham distribution on the unit hypersphere. It is shown to inherit many of the convenient properties of these special cases: it is the maximum-entropy distribution with fixed second moment, and there is a generalized Laplace approximation whose result is the mirrored normal-Bingham distribution. This distribution and approximation method are demonstrated by deriving the analytical approximation to the wrapped-normal distribution. Further, it is shown how these tools can be used to represent the uncertainty in the result of a bundle adjustment problem.
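For orientation, the standard Bingham density that this distribution generalizes has the following form; the mirrored, normal-coupled extension itself is defined in the thesis:

```latex
% Bingham density on the unit hypersphere S^{d-1}:
% M is orthogonal, Z = \mathrm{diag}(z_1, \dots, z_d) holds the
% concentrations, and F(Z) is the normalizing constant.
p(x \mid M, Z) = \frac{1}{F(Z)} \exp\!\left( x^\top M Z M^\top x \right),
\qquad x \in S^{d-1}.
```

Its antipodal symmetry, p(x) = p(-x), is what makes the Bingham family natural for unit-quaternion orientations.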
Another application of these methods is illustrated as part of a novel camera pose estimation algorithm based on object detections. The autocalibration task is formulated as a bundle adjustment problem using prior distributions over the 3D points to enforce the objects' structure and their relationship with the scene geometry. This framework is very flexible and enables the use of off-the-shelf computational tools to solve specialized autocalibration problems. Its performance is evaluated using a pedestrian detector to provide head and foot location observations, and it proves much faster and potentially more accurate than existing methods.
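A hedged sketch of that formulation with SciPy, where project is a hypothetical pinhole projector and the prior terms encode the objects' structure; none of these names come from the thesis:

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, observations, prior_mean, prior_info_sqrt):
    """Reprojection residuals stacked with Gaussian-prior residuals."""
    pose, points = params[:6], params[6:].reshape(-1, 3)
    # project() is a hypothetical projector returning Nx2 pixel locations.
    reproj = (project(pose, points) - observations).ravel()
    # Prior over the flattened 3D points enforces object structure.
    prior = prior_info_sqrt @ (params[6:] - prior_mean)
    return np.concatenate([reproj, prior])

# With observations obs, prior mean mu, square-root information L, and an
# initial guess x0 (pose stacked with points):
# sol = least_squares(residuals, x0, args=(obs, mu, L))
```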
Finally, the information feedback loop between object detection and camera pose estimation is closed by utilizing camera pose information to improve object detection in scenarios with significant perspective warping. Methods are presented that allow the inverse perspective mapping traditionally applied to images to be applied instead to features computed from those images. For the special case of HOG-like features, which are used by many modern object detection systems, these methods are shown to provide substantial performance benefits over unadapted detectors while achieving real-time frame rates, orders of magnitude faster than comparable image warping methods.
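The thesis's contribution is to warp features rather than images; for reference, the classic image-space inverse perspective mapping it builds on is a single homography warp (the correspondences below are hypothetical):

```python
import cv2
import numpy as np

img = cv2.imread("surveillance_frame.png")  # hypothetical input

# Four ground-plane points in the image and their bird's-eye targets
# (hand-picked here; in practice derived from the camera pose).
src = np.float32([[420, 560], [860, 560], [1100, 700], [180, 700]])
dst = np.float32([[0, 0], [400, 0], [400, 300], [0, 300]])

H = cv2.getPerspectiveTransform(src, dst)
birdseye = cv2.warpPerspective(img, H, (400, 300))
```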
The statistical tools and algorithms presented here are especially promising for mobile cameras, providing the ability to autocalibrate and adapt to the camera pose in real time. In addition, these methods have wide-ranging potential applications in diverse areas of computer vision, robotics, and imaging.
Abstract:
As complex radiotherapy techniques become more readily practiced, comprehensive 3D dosimetry is a growing necessity for advanced quality assurance. However, clinical implementation has been impeded by a wide variety of factors, including the expense of dedicated optical dosimeter readout tools, high operational costs, and the overall difficulty of use. To address these issues, a novel dry-tank optical CT scanner was designed for PRESAGE 3D dosimeter readout, relying on 3D-printed components and omitting costly parts found in preceding optical scanners. This work details the design, prototyping, and basic commissioning of the Duke Integrated-lens Optical Scanner (DIOS).
The convex scanning geometry was designed in ScanSim, an in-house Monte Carlo optical ray-tracing simulation. ScanSim parameters were used to build a 3D rendering of a convex ‘solid tank’ for optical CT, capable of collimating a point light source into a telecentric geometry without significant quantities of refractive-index-matched fluid. The model was 3D printed, processed, and converted into a negative mold via rubber casting to produce a transparent polyurethane scanning tank. The DIOS was assembled with the solid tank, a 3 W red LED light source, a computer-controlled rotation stage, and a 12-bit CCD camera. Initial optical phantom studies show negligible spatial inaccuracies in 2D projection images and 3D tomographic reconstructions. A PRESAGE 3D dose measurement for a 4-field box treatment plan from Eclipse shows 95% of voxels passing gamma analysis at 3%/3mm criteria. Gamma analysis between tomographic images of the same dosimeter in the DIOS and DLOS systems shows 93.1% agreement at 5%/1mm criteria. From this initial study, the DIOS has demonstrated promise as an economically viable optical-CT scanner. However, further improvements will be necessary to fully develop this system into an accurate and reliable tool for advanced QA.
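The pass-rate figures above come from gamma analysis; a minimal brute-force sketch of the 1D gamma index with global dose normalization is shown below (real 3D dosimetry evaluates the same criterion over volumes; the arrays are hypothetical):

```python
import numpy as np

def gamma_1d(x_ref, d_ref, x_eval, d_eval, dd_pct=3.0, dta_mm=3.0):
    """Brute-force gamma index, global dose normalization."""
    dd = dd_pct / 100.0 * d_ref.max()  # dose-difference criterion
    gammas = np.empty_like(d_ref, dtype=float)
    for i in range(len(x_ref)):
        dose_term = ((d_eval - d_ref[i]) / dd) ** 2
        dist_term = ((x_eval - x_ref[i]) / dta_mm) ** 2
        gammas[i] = np.sqrt((dose_term + dist_term).min())
    return gammas

# Pass rate at, e.g., 3%/3mm: the fraction of points with gamma <= 1.
# pass_rate = (gamma_1d(x, d_measured, x, d_planned) <= 1).mean()
```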
Pre-clinical animal studies are a conventional means of translational research, a midpoint between in-vitro cell studies and clinical implementation. However, modern small-animal radiotherapy platforms are primitive in comparison with conventional linear accelerators. This work also investigates a series of 3D-printed tools to expand the treatment capabilities of the X-RAD 225Cx orthovoltage irradiator and applies them to a feasibility study of hippocampal avoidance in rodent whole-brain radiotherapy.
As an alternative material to lead, a novel 3D-printable tungsten-composite ABS plastic, GMASS, was tested to create precisely shaped blocks. Film studies show that virtually all primary radiation at 225 kVp can be attenuated by GMASS blocks of 0.5 cm thickness. State-of-the-art software, BlockGen, was used to create custom hippocampus-shaped blocks from medical image data for any possible axial treatment field arrangement. A custom 3D-printed bite block was developed to immobilize and position a supine rat for optimal hippocampal conformity. A CT of the immobilized rat with digitally inserted blocks was imported into the SmART-Plan Monte Carlo simulation software to determine the optimal beam arrangement. Protocols with 4 and 7 equally spaced fields were considered viable treatment options, featuring improved hippocampal conformity and whole-brain coverage compared with prior lateral-opposed protocols. Custom rodent-morphic PRESAGE dosimeters were developed to accurately reflect these treatment scenarios, and a 3D dosimetry study was performed to confirm the SmART-Plan simulations. Measured doses indicate significant hippocampal sparing and moderate whole-brain coverage.
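The film result is consistent with the standard exponential attenuation law, sketched below; the effective attenuation coefficient of the tungsten composite at 225 kVp is not given in the source, so \mu here is a placeholder:

```latex
% Beer-Lambert law for primary-beam attenuation through thickness x:
I(x) = I_0 \, e^{-\mu x}
% "Virtually all" primary attenuation at x = 0.5\,\mathrm{cm}
% corresponds to \mu x \gg 1 for the GMASS composite.
```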
Abstract:
Cigar Lake is a high-grade uranium deposit, located in northern Saskatchewan, Canada. In order to extract the uranium ore remotely, thus ensuring minimal radiation dose to workers and also to access the ore from stable ground, the Jet Boring System (JBS) was developed by Cameco Corporation. This system uses a high-powered water jet to remotely excavate cavities. Survey data is required to determine the final shape, volume, and location of the cavity for mine planning purposes and construction. This paper provides an overview of the challenges involved in remotely surveying a JBS-mined cavity and studies the potential use of a time-of-flight (ToF) camera for remote cavity surveying. It reports on data collected and analyzed from inside an experimental environment as well as on real data acquired on site from the Cigar Lake and Rabbit Lake mines.
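Turning such a scan into mine-planning numbers ultimately requires surface reconstruction; as a hedged illustration, a convex hull gives a quick upper-bound volume from a ToF point cloud (the file name is hypothetical, and mined cavities are concave, so this is only a sanity check):

```python
import numpy as np
from scipy.spatial import ConvexHull

points = np.loadtxt("cavity_scan.xyz")  # hypothetical N x 3 ToF point cloud
hull = ConvexHull(points)

# Convex-hull volume overestimates a concave cavity, but gives a fast
# first check on scan completeness and scale.
print(f"approximate volume: {hull.volume:.1f} m^3")
```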
Abstract:
Mid Sweden University is currently researching how to capture more of a scene with a camera and how to create 3D images that do not require extra equipment for the viewer. In the course of this research, they have started looking into simulating some of the tests they wish to conduct. The goal of this project is to investigate whether, and to what degree, the 3D graphics engine Unity3D could be used to simulate these tests. To test this, a simulation was designed and implemented. The simulation used a split-display system in which each camera is directly connected to a part of the screen, and the position of the viewer determines which part of the camera feed is shown. Some literature study was also done on how current 3D technology works. The simulation was successfully implemented and shows that simple simulations can be done in Unity3D; however, some problems were encountered in the process. The conclusion of the project is that much work remains before such simulation is viable, but that the technology has potential and the research team should continue to investigate it.
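As a loose sketch of the split-display selection logic described above (the actual simulation is in Unity3D; this fragment only illustrates the assumed geometry, with all names hypothetical):

```python
def feed_offset(viewer_x, screen_width, feed_width, strip_width):
    """Horizontal offset into one camera's feed for its screen strip.

    Hypothetical simplification: the viewer's normalized horizontal
    position slides a strip-wide window across that strip's camera feed,
    so moving sideways reveals a different part of the scene.
    """
    t = min(max(viewer_x / screen_width, 0.0), 1.0)
    return int(t * (feed_width - strip_width))
```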
Abstract:
Experimental analysis and design of mechanisms for moving prosthetic devices inside a PVD deposition chamber. Description of the currently existing PVD deposition techniques. Derivation and design from scratch, with the aid of Autodesk Inventor (3D CAD), of handling mechanisms for particular medical devices of complex geometry inside a special deposition chamber.
Abstract:
Thesis (Master's)--University of Washington, 2016-01