847 resultados para pose estimation
Resumo:
The success of dental implant-supported prosthesis is directly linked to the accuracy obtained during implant’s pose estimation (position and orientation). Although traditional impression techniques and recent digital acquisition methods are acceptably accurate, a simultaneously fast, accurate and operator-independent methodology is still lacking. Hereto, an image-based framework is proposed to estimate the patient-specific implant’s pose using cone-beam computed tomography (CBCT) and prior knowledge of implanted model. The pose estimation is accomplished in a threestep approach: (1) a region-of-interest is extracted from the CBCT data using 2 operator-defined points at the implant’s main axis; (2) a simulated CBCT volume of the known implanted model is generated through Feldkamp-Davis-Kress reconstruction and coarsely aligned to the defined axis; and (3) a voxel-based rigid registration is performed to optimally align both patient and simulated CBCT data, extracting the implant’s pose from the optimal transformation. Three experiments were performed to evaluate the framework: (1) an in silico study using 48 implants distributed through 12 tridimensional synthetic mandibular models; (2) an in vitro study using an artificial mandible with 2 dental implants acquired with an i-CAT system; and (3) two clinical case studies. The results shown positional errors of 67±34μm and 108μm, and angular misfits of 0.15±0.08º and 1.4º, for experiment 1 and 2, respectively. Moreover, in experiment 3, visual assessment of clinical data results shown a coherent alignment of the reference implant. Overall, a novel image-based framework for implants’ pose estimation from CBCT data was proposed, showing accurate results in agreement with dental prosthesis modelling requirements.
Resumo:
Os métodos clínicos que são realizados com recurso a tecnologias de imagiologia têm registado um aumento de popularidade nas últimas duas décadas. Os procedimentos tradicionais usados em cirurgia têm sido substituídos por métodos minimamente invasivos de forma a conseguir diminuir os custos associados e aperfeiçoar factores relacionados com a produtividade. Procedimentos clínicos modernos como a broncoscopia e a cardiologia são caracterizados por se focarem na minimização de acções invasivas, com os arcos em ‘C’ a adoptarem um papel relevante nesta área. Apesar de o arco em ‘C’ ser uma tecnologia amplamente utilizada no auxílio da navegação em intervenções minimamente invasivas, este falha na qualidade da informação fornecida ao cirurgião. A informação obtida em duas dimensões não é suficiente para proporcionar uma compreensão total da localização tridimensional da região de interesse, revelando-se como uma tarefa essencial o estabelecimento de um método que permita a aquisição de informação tridimensional. O primeiro passo para alcançar este objectivo foi dado ao definir um método que permite a estimativa da posição e orientação de um objecto em relação ao arco em ‘C’. De forma a realizar os testes com o arco em ‘C’, a geometria deste teve que ser inicialmente definida e a calibração do sistema feita. O trabalho desenvolvido e apresentado nesta tese foca-se num método que provou ser suficientemente sustentável e eficiente para se estabelecer como um ponto de partida no caminho para alcançar o objectivo principal: o desenvolvimento de uma técnica que permita o aperfeiçoamento da qualidade da informação adquirida com o arco em ‘C’ durante uma intervenção clínica.
Resumo:
NOGUEIRA, Marcelo B. ; MEDEIROS, Adelardo A. D. ; ALSINA, Pablo J. Pose Estimation of a Humanoid Robot Using Images from an Mobile Extern Camera. In: IFAC WORKSHOP ON MULTIVEHICLE SYSTEMS, 2006, Salvador, BA. Anais... Salvador: MVS 2006, 2006.
Resumo:
In this paper, we seek to expand the use of direct methods in real-time applications by proposing a vision-based strategy for pose estimation of aerial vehicles. The vast majority of approaches make use of features to estimate motion. Conversely, the strategy we propose is based on a MR (Multi-Resolution) implementation of an image registration technique (Inverse Compositional Image Alignment ICIA) using direct methods. An on-board camera in a downwards-looking configuration, and the assumption of planar scenes, are the bases of the algorithm. The motion between frames (rotation and translation) is recovered by decomposing the frame-to-frame homography obtained by the ICIA algorithm applied to a patch that covers around the 80% of the image. When the visual estimation is required (e.g. GPS drop-out), this motion is integrated with the previous known estimation of the vehicles' state, obtained from the on-board sensors (GPS/IMU), and the subsequent estimations are based only on the vision-based motion estimations. The proposed strategy is tested with real flight data in representative stages of a flight: cruise, landing, and take-off, being two of those stages considered critical: take-off and landing. The performance of the pose estimation strategy is analyzed by comparing it with the GPS/IMU estimations. Results show correlation between the visual estimation obtained with the MR-ICIA and the GPS/IMU data, that demonstrate that the visual estimation can be used to provide a good approximation of the vehicle's state when it is required (e.g. GPS drop-outs). In terms of performance, the proposed strategy is able to maintain an estimation of the vehicle's state for more than one minute, at real-time frame rates based, only on visual information.
Resumo:
In this paper, two techniques to control UAVs (Unmanned Aerial Vehicles), based on visual information are presented. The first one is based on the detection and tracking of planar structures from an on-board camera, while the second one is based on the detection and 3D reconstruction of the position of the UAV based on an external camera system. Both strategies are tested with a VTOL (Vertical take-off and landing) UAV, and results show good behavior of the visual systems (precision in the estimation and frame rate) when estimating the helicopter¿s position and using the extracted information to control the UAV.
Resumo:
El principal objetivo de esta tesis es dotar a los vehículos aéreos no tripulados (UAVs, por sus siglas en inglés) de una fuente de información adicional basada en visión. Esta fuente de información proviene de cámaras ubicadas a bordo de los vehículos o en el suelo. Con ella se busca que los UAVs realicen tareas de aterrizaje o inspección guiados por visión, especialmente en aquellas situaciones en las que no haya disponibilidad de estimar la posición del vehículo con base en GPS, cuando las estimaciones de GPS no tengan la suficiente precisión requerida por las tareas a realizar, o cuando restricciones de carga de pago impidan añadir sensores a bordo de los vehículos. Esta tesis trata con tres de las principales áreas de la visión por computador: seguimiento visual y estimación visual de la pose (posición y orientación), que a su vez constituyen la base de la tercera, denominada control servo visual, que en nuestra aplicación se enfoca en el empleo de información visual para controlar los UAVs. Al respecto, esta tesis se ocupa de presentar propuestas novedosas que permitan solucionar problemas relativos al seguimiento de objetos mediante cámaras ubicadas a bordo de los UAVs, se ocupa de la estimación de la pose de los UAVs basada en información visual obtenida por cámaras ubicadas en el suelo o a bordo, y también se ocupa de la aplicación de las técnicas propuestas para solucionar diferentes problemas, como aquellos concernientes al seguimiento visual para tareas de reabastecimiento autónomo en vuelo o al aterrizaje basado en visión, entre otros. Las diversas técnicas de visión por computador presentadas en esta tesis se proponen con el fin de solucionar dificultades que suelen presentarse cuando se realizan tareas basadas en visión con UAVs, como las relativas a la obtención, en tiempo real, de estimaciones robustas, o como problemas generados por vibraciones. Los algoritmos propuestos en esta tesis han sido probados con información de imágenes reales obtenidas realizando pruebas on-line y off-line. Diversos mecanismos de evaluación han sido empleados con el propósito de analizar el desempeño de los algoritmos propuestos, entre los que se incluyen datos simulados, imágenes de vuelos reales, estimaciones precisas de posición empleando el sistema VICON y comparaciones con algoritmos del estado del arte. Los resultados obtenidos indican que los algoritmos de visión por computador propuestos tienen un desempeño que es comparable e incluso mejor al de algoritmos que se encuentran en el estado del arte. Los algoritmos propuestos permiten la obtención de estimaciones robustas en tiempo real, lo cual permite su uso en tareas de control visual. El desempeño de estos algoritmos es apropiado para las exigencias de las distintas aplicaciones examinadas: reabastecimiento autónomo en vuelo, aterrizaje y estimación del estado del UAV. Abstract The main objective of this thesis is to provide Unmanned Aerial Vehicles (UAVs) with an additional vision-based source of information extracted by cameras located either on-board or on the ground, in order to allow UAVs to develop visually guided tasks, such as landing or inspection, especially in situations where GPS information is not available, where GPS-based position estimation is not accurate enough for the task to develop, or where payload restrictions do not allow the incorporation of additional sensors on-board. This thesis covers three of the main computer vision areas: visual tracking and visual pose estimation, which are the bases the third one called visual servoing, which, in this work, focuses on using visual information to control UAVs. In this sense, the thesis focuses on presenting novel solutions for solving the tracking problem of objects when using cameras on-board UAVs, on estimating the pose of the UAVs based on the visual information collected by cameras located either on the ground or on-board, and also focuses on applying these proposed techniques for solving different problems, such as visual tracking for aerial refuelling or vision-based landing, among others. The different computer vision techniques presented in this thesis are proposed to solve some of the frequently problems found when addressing vision-based tasks in UAVs, such as obtaining robust vision-based estimations at real-time frame rates, and problems caused by vibrations, or 3D motion. All the proposed algorithms have been tested with real-image data in on-line and off-line tests. Different evaluation mechanisms have been used to analyze the performance of the proposed algorithms, such as simulated data, images from real-flight tests, publicly available datasets, manually generated ground truth data, accurate position estimations using a VICON system and a robotic cell, and comparison with state of the art algorithms. Results show that the proposed computer vision algorithms obtain performances that are comparable to, or even better than, state of the art algorithms, obtaining robust estimations at real-time frame rates. This proves that the proposed techniques are fast enough for vision-based control tasks. Therefore, the performance of the proposed vision algorithms has shown to be of a standard appropriate to the different explored applications: aerial refuelling and landing, and state estimation. It is noteworthy that they have low computational overheads for vision systems.
Resumo:
Markerless video-based human pose estimation algorithms face a high-dimensional problem that is frequently broken down into several lower-dimensional ones by estimating the pose of each limb separately. However, in order to do so they need to reliably locate the torso, for which they typically rely on time coherence and tracking algorithms. Their losing track usually results in catastrophic failure of the process, requiring human intervention and thus precluding their usage in real-time applications. We propose a very fast rough pose estimation scheme based on global shape descriptors built on 3D Zernike moments. Using an articulated model that we configure in many poses, a large database of descriptor/pose pairs can be computed off-line. Thus, the only steps that must be done on-line are the extraction of the descriptors for each input volume and a search against the database to get the most likely poses. While the result of such process is not a fine pose estimation, it can be useful to help more sophisticated algorithms to regain track or make more educated guesses when creating new particles in particle-filter-based tracking schemes. We have achieved a performance of about ten fps on a single computer using a database of about one million entries.
Resumo:
We propose the design of a real-time system to recognize and interprethand gestures. The acquisition devices are low cost 3D sensors. 3D hand pose will be segmented, characterized and track using growing neural gas (GNG) structure. The capacity of the system to obtain information with a high degree of freedom allows the encoding of many gestures and a very accurate motion capture. The use of hand pose models combined with motion information provide with GNG permits to deal with the problem of the hand motion representation. A natural interface applied to a virtual mirrorwriting system and to a system to estimate hand pose will be designed to demonstrate the validity of the system.
Resumo:
NOGUEIRA, Marcelo B. ; MEDEIROS, Adelardo A. D. ; ALSINA, Pablo J. Pose Estimation of a Humanoid Robot Using Images from an Mobile Extern Camera. In: IFAC WORKSHOP ON MULTIVEHICLE SYSTEMS, 2006, Salvador, BA. Anais... Salvador: MVS 2006, 2006.
Resumo:
The estimating of the relative orientation and position of a camera is one of the integral topics in the field of computer vision. The accuracy of a certain Finnish technology company’s traffic sign inventory and localization process can be improved by utilizing the aforementioned concept. The company’s localization process uses video data produced by a vehicle installed camera. The accuracy of estimated traffic sign locations depends on the relative orientation between the camera and the vehicle. This thesis proposes a computer vision based software solution which can estimate a camera’s orientation relative to the movement direction of the vehicle by utilizing video data. The task was solved by using feature-based methods and open source software. When using simulated data sets, the camera orientation estimates had an absolute error of 0.31 degrees on average. The software solution can be integrated to be a part of the traffic sign localization pipeline of the company in question.
Resumo:
NOGUEIRA, Marcelo B. ; MEDEIROS, Adelardo A. D. ; ALSINA, Pablo J. Pose Estimation of a Humanoid Robot Using Images from an Mobile Extern Camera. In: IFAC WORKSHOP ON MULTIVEHICLE SYSTEMS, 2006, Salvador, BA. Anais... Salvador: MVS 2006, 2006.
Resumo:
A camera maps 3-dimensional (3D) world space to a 2-dimensional (2D) image space. In the process it loses the depth information, i.e., the distance from the camera focal point to the imaged objects. It is impossible to recover this information from a single image. However, by using two or more images from different viewing angles this information can be recovered, which in turn can be used to obtain the pose (position and orientation) of the camera. Using this pose, a 3D reconstruction of imaged objects in the world can be computed. Numerous algorithms have been proposed and implemented to solve the above problem; these algorithms are commonly called Structure from Motion (SfM). State-of-the-art SfM techniques have been shown to give promising results. However, unlike a Global Positioning System (GPS) or an Inertial Measurement Unit (IMU) which directly give the position and orientation respectively, the camera system estimates it after implementing SfM as mentioned above. This makes the pose obtained from a camera highly sensitive to the images captured and other effects, such as low lighting conditions, poor focus or improper viewing angles. In some applications, for example, an Unmanned Aerial Vehicle (UAV) inspecting a bridge or a robot mapping an environment using Simultaneous Localization and Mapping (SLAM), it is often difficult to capture images with ideal conditions. This report examines the use of SfM methods in such applications and the role of combining multiple sensors, viz., sensor fusion, to achieve more accurate and usable position and reconstruction information. This project investigates the role of sensor fusion in accurately estimating the pose of a camera for the application of 3D reconstruction of a scene. The first set of experiments is conducted in a motion capture room. These results are assumed as ground truth in order to evaluate the strengths and weaknesses of each sensor and to map their coordinate systems. Then a number of scenarios are targeted where SfM fails. The pose estimates obtained from SfM are replaced by those obtained from other sensors and the 3D reconstruction is completed. Quantitative and qualitative comparisons are made between the 3D reconstruction obtained by using only a camera versus that obtained by using the camera along with a LIDAR and/or an IMU. Additionally, the project also works towards the performance issue faced while handling large data sets of high-resolution images by implementing the system on the Superior high performance computing cluster at Michigan Technological University.
Resumo:
In this paper we present a convolutional neuralnetwork (CNN)-based model for human head pose estimation inlow-resolution multi-modal RGB-D data. We pose the problemas one of classification of human gazing direction. We furtherfine-tune a regressor based on the learned deep classifier. Next wecombine the two models (classification and regression) to estimateapproximate regression confidence. We present state-of-the-artresults in datasets that span the range of high-resolution humanrobot interaction (close up faces plus depth information) data tochallenging low resolution outdoor surveillance data. We buildupon our robust head-pose estimation and further introduce anew visual attention model to recover interaction with theenvironment. Using this probabilistic model, we show thatmany higher level scene understanding like human-human/sceneinteraction detection can be achieved. Our solution runs inreal-time on commercial hardware