18 results for 6DoF pose registration
at Universidad Politécnica de Madrid
Abstract:
In this paper, we seek to expand the use of direct methods in real-time applications by proposing a vision-based strategy for pose estimation of aerial vehicles. The vast majority of approaches use features to estimate motion. In contrast, the strategy we propose is based on a multi-resolution (MR) implementation of an image registration technique, Inverse Compositional Image Alignment (ICIA), using direct methods. The algorithm relies on an on-board camera in a downward-looking configuration and on the assumption of planar scenes. The motion between frames (rotation and translation) is recovered by decomposing the frame-to-frame homography obtained by the ICIA algorithm applied to a patch that covers around 80% of the image. When the visual estimation is required (e.g. GPS drop-out), this motion is integrated with the last known estimate of the vehicle's state, obtained from the on-board sensors (GPS/IMU), and the subsequent estimates are based only on the vision-based motion estimates. The proposed strategy is tested with real flight data in representative stages of a flight: cruise, landing, and take-off, the last two being considered critical. The performance of the pose estimation strategy is analyzed by comparing it with the GPS/IMU estimates. Results show correlation between the visual estimation obtained with the MR-ICIA and the GPS/IMU data, which demonstrates that the visual estimation can provide a good approximation of the vehicle's state when required (e.g. GPS drop-outs). In terms of performance, the proposed strategy is able to maintain an estimate of the vehicle's state for more than one minute, at real-time frame rates, based only on visual information.
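The integration step described in this abstract can be illustrated with a minimal sketch. It assumes that each frame-to-frame motion is available as a 4x4 rigid transform expressed in the previous camera frame and that the last GPS/IMU pose maps the camera frame to the world frame; the function names `make_transform` and `integrate` are hypothetical and not taken from the paper, and the homography decomposition itself is not shown.

```python
import numpy as np

def make_transform(yaw_rad, translation):
    """Build a 4x4 rigid transform from a yaw rotation and a 3D translation."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def integrate(last_known_pose, frame_motions):
    """Chain frame-to-frame motions onto the last known pose (dead reckoning).

    Assumption: each motion is expressed in the previous camera frame,
    so it is composed on the right of the accumulated pose.
    """
    pose = last_known_pose.copy()
    for motion in frame_motions:
        pose = pose @ motion
    return pose

# Last pose from GPS/IMU before the drop-out (illustrative values).
start = make_transform(0.0, [10.0, 5.0, 30.0])
# Two identical visual motion estimates: 1 m forward, then a 90 degree yaw.
step = make_transform(np.pi / 2, [1.0, 0.0, 0.0])
final = integrate(start, [step, step])
print(final[:3, 3])
```

Because every estimate after the drop-out is chained onto the previous one, errors accumulate over time, which is consistent with the abstract reporting how long a usable estimate can be maintained.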
Abstract:
The aim of this work is to provide the methods needed to register and fuse the endo-epicardial signal intensity (SI) maps extracted from contrast-enhanced magnetic resonance imaging (ceMRI) with X-ray coronary angiograms using an intrinsic registration-based algorithm, to help pre-planning and guidance of catheterization procedures. Fusion of angiograms with SI maps was treated as a 2D-3D pose estimation problem, where each image point is projected to a Plücker line, and the screw representation for rigid motions is minimized using a gradient descent method. The resultant transformation is applied to the SI map, which is then projected and fused on each angiogram. The proposed method was tested on clinical datasets from 6 patients with prior myocardial infarction. The registration procedure is optionally combined with an iterative closest point (ICP) algorithm that aligns the ventricular contours segmented from two ventriculograms.
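As a rough illustration of the Plücker-line formulation mentioned in this abstract, the sketch below builds the line through a camera centre and a back-projected image point, and measures the distance from a 3D model point to that line. This is only the geometric error term under assumed conventions (unit direction `d`, moment `m = c × d`); the screw parameterization and the gradient descent used by the authors are not reproduced, and the function names are hypothetical.

```python
import numpy as np

def plucker_line(camera_center, point_on_ray):
    """Return the Plücker coordinates (d, m) of the ray through two points.

    d is the unit direction and m = c x d is the moment of the line.
    """
    d = point_on_ray - camera_center
    d = d / np.linalg.norm(d)
    m = np.cross(camera_center, d)
    return d, m

def point_line_distance(x, line):
    """Distance from a 3D point x to a Plücker line with unit direction."""
    d, m = line
    return np.linalg.norm(np.cross(x, d) - m)

# Illustrative values: a ray parallel to the z axis through (1, 1, 0).
line = plucker_line(np.array([1.0, 1.0, 0.0]), np.array([1.0, 1.0, 1.0]))
residual = point_line_distance(np.array([4.0, 5.0, 10.0]), line)
print(residual)
```

In a 2D-3D registration of this kind, the sum of such residuals over all point/line pairs would be the quantity to minimize with respect to the rigid motion applied to the 3D points.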
Abstract:
This thesis deals with the problem of efficiently tracking 3D objects in sequences of images. We tackle the efficient 3D tracking problem by using direct image registration. This problem is posed as an iterative optimization procedure that minimizes a brightness error norm. We review the most popular iterative methods for image registration in the literature, turning our attention to those algorithms that use efficient optimization techniques. Two forms of efficient registration algorithms are investigated. The first type comprises the additive registration algorithms: these algorithms incrementally compute the motion parameters by linearly approximating the brightness error function. We centre our attention on Hager and Belhumeur's factorization-based algorithm for image registration. We propose a fundamental requirement that factorization-based algorithms must satisfy to guarantee good convergence, and introduce a systematic procedure that automatically computes the factorization. Finally, we also introduce two warp functions that satisfy the requirement, for registering rigid and nonrigid 3D targets. The second type comprises the compositional registration algorithms, where the brightness error function is written using function composition. We study the current approaches to compositional image alignment, and we emphasize the importance of the Inverse Compositional method, which is known to be the most efficient image registration algorithm. We introduce a new algorithm, the Efficient Forward Compositional image registration: this algorithm avoids the need to invert the warping function, and provides a new interpretation of the working mechanisms of inverse compositional alignment. Using this information, we propose two fundamental requirements that guarantee the convergence of compositional image registration methods. Finally, we support our claims with extensive experimental testing on synthetic and real-world data.
We propose a distinction between image registration and tracking when using efficient algorithms. We show that, depending on whether the fundamental requirements hold, some efficient algorithms are eligible for image registration but not for tracking.
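The idea of additive registration, where motion parameters are updated incrementally by linearizing the brightness error, can be shown on a toy problem. The sketch below is a plain Gauss-Newton iteration for a 1D signal with a single translation parameter; it is not Hager and Belhumeur's factorization-based algorithm nor any of the compositional methods studied in the thesis, and the signals and the function name `register_shift` are illustrative assumptions.

```python
import numpy as np

def register_shift(template, image, iterations=20):
    """Estimate p such that image(x + p) matches template(x).

    Each iteration linearizes the warped image around the current p and
    adds the least-squares increment to p (additive update).
    """
    x = np.arange(template.size, dtype=float)
    p = 0.0
    for _ in range(iterations):
        warped = np.interp(x + p, x, image)   # image sampled at x + p
        grad = np.gradient(warped)            # derivative with respect to p
        error = template - warped             # brightness error
        p += np.sum(grad * error) / np.sum(grad * grad)
    return p

# Synthetic data: the image is the template moved 3 samples to the right.
x = np.arange(100, dtype=float)
template = np.exp(-((x - 50.0) ** 2) / (2 * 8.0 ** 2))
image = np.exp(-((x - 53.0) ** 2) / (2 * 8.0 ** 2))
shift = register_shift(template, image)
print(shift)
```

Note that the gradient is recomputed on the warped image at every iteration; avoiding that repeated cost is the motivation behind the efficient algorithms the abstract refers to.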
Abstract:
In this paper, two techniques to control UAVs (Unmanned Aerial Vehicles) based on visual information are presented. The first one is based on the detection and tracking of planar structures from an on-board camera, while the second one is based on the detection and 3D reconstruction of the position of the UAV using an external camera system. Both strategies are tested with a VTOL (Vertical Take-Off and Landing) UAV, and results show good behavior of the visual systems (precision in the estimation and frame rate) when estimating the helicopter's position and using the extracted information to control the UAV.
Abstract:
In order to properly understand and model the gene regulatory networks in animal development, it is crucial to obtain detailed measurements, both in time and space, of their gene expression domains. In this paper, we propose a complete computational framework to fulfill this task and create a 3D Atlas of early zebrafish embryogenesis annotated with both the cellular localizations and the level of expression of different genes at different developmental stages. The strategy to construct such an Atlas is described here with the expression pattern of 5 different genes at 6 hours post fertilization.
Abstract:
Pre-operative planning has become an essential task in complex surgeries and therapies, especially those affecting soft tissue. One example where soft tissue preoperative planning is of high interest is liver surgery. It involves the accurate detection and identification of individual liver lesions and vessels as well as proper functional liver segmentation and volume estimation. This process is very important because it determines whether the patient is a suitable candidate for surgical therapy and the type of procedure. Soft tissue radiation therapy is a second example where planning is required, for both conventional external and intraoperative radiotherapy. It involves the segmentation of the tumor target and vulnerable organs and the estimation of the planned dose. Functional liver segmentations and volume estimations for surgery planning are commonly obtained from computed tomography (CT) images. Similarly, in radiation therapy planning, targets to be irradiated and healthy and vulnerable tissues to be protected from irradiation are commonly delineated on CT scans.
However, developments in magnetic resonance imaging (MRI) technology are progressively offering advantages. For instance, the hepatic metastasis detection rate has been found to be significantly higher in Gd–EOB–DTPA-enhanced MRI than in CT. Therefore, recent studies highlight the importance of combining the information from CT and MRI to achieve the highest level of accuracy in radiotherapy and to facilitate accurate liver lesion description. In order to improve those two soft tissue preoperative planning scenarios, an accurate nonrigid image registration algorithm is clearly required. However, the vast majority of commercial systems only provide rigid registration. Voxel intensity measures have been shown to be robust measures of image similarity, and among them, Mutual Information (MI) is always the first candidate in multimodal registrations. However, one of the main drawbacks of Mutual Information is the absence of spatial information and the assumption that statistical relationships between images are the same over the whole domain of the image. The hypothesis of the present thesis is that incorporating spatial organ information into the registration process may improve the registration robustness and quality, taking advantage of the availability of clinical segmentations. In this work, a multimodal nonrigid 3D registration framework using a new Organ-Focused Mutual Information metric (OF-MI) is proposed, validated and compared to the classical formulation of Mutual Information (MI). It improves registration results in problematic areas by adding regional information to the similarity criterion, taking advantage of the actual availability of segmentations in standard clinical protocols and allowing the statistical dependence between the two modalities to differ among organs or regions.
The proposed method is applied to the registration of CT and T1-weighted delayed Gd–EOB–DTPA-enhanced MRI, as well as to the registration of CT and MRI images in rectal intraoperative radiotherapy planning. Additionally, a 3D support segmentation algorithm based on Level-Sets has been developed for the incorporation of the organ information into the registration. The segmentation algorithm has been specifically designed for healthy and functional liver volume estimation, and demonstrated good performance on a set of abdominal CT studies. Results show a statistically significant improvement of registration quality measures with OF-MI compared to MI, both with simulated data (p<0.001) and with real data, in liver applications registering CT and Gd–EOB–DTPA-enhanced MRI and in registration for rectal radiotherapy planning using multi-organ OF-MI (p<0.05). Additionally, OF-MI presents more stable results with smaller dispersion than MI and a more robust behavior with respect to SNR changes and parameter variation. The proposed OF-MI always presents equal or better accuracy than the classical MI and consequently can be a very convenient alternative in applications where the robustness of the method and the ease of choosing the parameters are particularly important.
Abstract:
Purpose: Accurate delineation of the rectum is of high importance in off-line adaptive radiation therapy since it is a major dose-limiting organ in prostate cancer radiotherapy. Intensity-based deformable image registration (DIR) methods cannot create a correct spatial transformation if there is no correspondence between the template and the target images. The variation of rectal filling (gas or feces) creates a noncorrespondence in image intensities that becomes a great obstacle for intensity-based DIR. Methods: In this study the authors have designed and implemented a semiautomatic method to create a rectum mask in pelvic computed tomography (CT) images. The method, which includes a DIR based on the demons algorithm, has been tested in 13 prostate cancer cases, each comprising two CT scans, for a total of 26 CT scans. Results: The use of the manual segmentation in the planning image and the proposed rectum mask method (RMM) in the daily image leads to an improvement in the DIR performance in pelvic CT images, obtaining a mean overlap volume index of 0.89, close to the values obtained using the manual segmentations in both images. Conclusions: The application of the RMM in the daily image and the manual segmentations in the planning image during prostate cancer treatments increases the performance of the registration in the presence of rectal fillings, obtaining very good agreement with a physician's manual contours.
Abstract:
Laparoscopic instrument tracking systems are an essential component in image-guided interventions and offer new possibilities to improve and automate objective assessment methods of surgical skills. In this study we present our system design to apply a third-generation optical pose tracker (MicronTracker®) to laparoscopic practice. A technical evaluation of this design is performed in order to analyze its accuracy in computing the laparoscopic instrument tip position. Results show a stable fluctuation error over the entire analyzed workspace. The relative position errors are 1.776±1.675 mm, 1.817±1.762 mm, 1.854±1.740 mm, 2.455±2.164 mm, 2.545±2.496 mm, 2.764±2.342 mm, and 2.512±2.493 mm for distances of 50, 100, 150, 200, 250, 300, and 350 mm, respectively. The accumulated distance error increases with the measured distance. The instrument inclination covered by the system is high, from 90 to 7.5 degrees. The system reports a low positional accuracy for the instrument tip.
Abstract:
The main objective of this thesis is to provide Unmanned Aerial Vehicles (UAVs) with an additional vision-based source of information extracted by cameras located either on-board or on the ground, in order to allow UAVs to carry out visually guided tasks, such as landing or inspection, especially in situations where GPS information is not available, where GPS-based position estimation is not accurate enough for the task at hand, or where payload restrictions do not allow the incorporation of additional sensors on-board. This thesis covers three of the main computer vision areas: visual tracking and visual pose estimation (position and orientation), which are the bases of the third one, visual servoing, which in this work focuses on using visual information to control UAVs.
In this sense, the thesis focuses on presenting novel solutions to the problem of tracking objects with cameras on-board UAVs, on estimating the pose of the UAVs based on the visual information collected by cameras located either on the ground or on-board, and on applying the proposed techniques to solve different problems, such as visual tracking for aerial refuelling or vision-based landing, among others. The different computer vision techniques presented in this thesis are proposed to solve some of the problems frequently found when addressing vision-based tasks in UAVs, such as obtaining robust vision-based estimations at real-time frame rates, and problems caused by vibrations or 3D motion. All the proposed algorithms have been tested with real-image data in on-line and off-line tests. Different evaluation mechanisms have been used to analyze the performance of the proposed algorithms, such as simulated data, images from real-flight tests, publicly available datasets, manually generated ground truth data, accurate position estimations using a VICON system and a robotic cell, and comparison with state-of-the-art algorithms. Results show that the proposed computer vision algorithms obtain performances that are comparable to, or even better than, state-of-the-art algorithms, obtaining robust estimations at real-time frame rates. This shows that the proposed techniques are fast enough for vision-based control tasks. Therefore, the performance of the proposed vision algorithms has been shown to be appropriate for the different applications explored: aerial refuelling, landing, and state estimation. It is noteworthy that they have low computational overheads for vision systems.
Abstract:
Accurate detection of liver lesions is of great importance in hepatic surgery planning. Recent studies have shown that the detection rate of liver lesions is significantly higher in gadoxetic acid-enhanced magnetic resonance imaging (Gd–EOB–DTPA-enhanced MRI) than in contrast-enhanced portal-phase computed tomography (CT); however, the latter remains essential because of its high specificity, good performance in estimating liver volumes and better vessel visibility. To characterize liver lesions using both the above image modalities, we propose a multimodal nonrigid registration framework using organ-focused mutual information (OF-MI). This proposal tries to improve mutual information (MI) based registration by adding spatial information, benefiting from the availability of expert liver segmentation in clinical protocols. The incorporation of an additional information channel containing liver segmentation information was studied. A dataset of real clinical images and simulated images was used in the validation process. A Gd–EOB–DTPA-enhanced MRI simulation framework is presented. To evaluate results, warping index errors were calculated for the simulated data, and landmark-based and surface-based errors were calculated for the real data. An improvement of the registration accuracy for OF-MI as compared with MI was found for both simulated and real datasets. Statistical significance of the difference was tested and confirmed in the simulated dataset (p < 0.01).
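For readers unfamiliar with the baseline metric, the following is a minimal sketch of classical mutual information computed from a joint intensity histogram. It is not the OF-MI metric proposed by the authors, whose exact formulation is not given in this abstract; the bin count, the synthetic images, and the function name `mutual_information` are assumptions made for illustration.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Classical MI (in nats) between two images of equal shape."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()                 # joint probability
    px = pxy.sum(axis=1, keepdims=True)       # marginal of a
    py = pxy.sum(axis=0, keepdims=True)       # marginal of b
    nz = pxy > 0                              # skip empty bins (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
image = rng.random((64, 64))
inverted = 1.0 - image            # same structure, different intensity mapping
unrelated = rng.random((64, 64))  # statistically independent image

mi_aligned = mutual_information(image, inverted)
mi_unrelated = mutual_information(image, unrelated)
print(mi_aligned, mi_unrelated)
```

The histogram is pooled over the whole image, so spatial position plays no role in the score; this is the limitation that motivates adding segmentation information to the metric.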
Abstract:
In the context of aerial imagery, one of the first steps toward a coherent processing of the information contained in multiple images is geo-registration, which consists in assigning geographic 3D coordinates to the pixels of the image. This enables accurate alignment and geo-positioning of multiple images, detection of moving objects and fusion of data acquired from multiple sensors. To solve this problem there are different approaches that require, in addition to a precise characterization of the camera sensor, high resolution referenced images or terrain elevation models, which are usually not publicly available or out of date. Building upon the idea of developing technology that does not need a reference terrain elevation model, we propose a geo-registration technique that applies variational methods to obtain a dense and coherent surface elevation model that is used to replace the reference model. The surface elevation model is built by interpolation of scattered 3D points, which are obtained in a two-step process following a classical stereo pipeline: first, coherent disparity maps between image pairs of a video sequence are estimated and then image point correspondences are back-projected. The proposed variational method enforces continuity of the disparity map not only along epipolar lines (as done by previous geo-registration techniques) but also across them, in the full 2D image domain. In the experiments, aerial images from synthetic video sequences have been used to validate the proposed technique.
Abstract:
Subtraction of Ictal SPECT Co-registered to MRI (SISCOM) is an imaging technique used to localize the epileptogenic focus in patients with intractable partial epilepsy. The aim of this study was to determine the accuracy of registration algorithms involved in SISCOM analysis using FocusDET, a new user-friendly application. To this end, Monte Carlo simulation was employed to generate realistic SPECT studies. Simulated sinograms were reconstructed using the Filtered BackProjection (FBP) algorithm and an Ordered Subsets Expectation Maximization (OSEM) reconstruction method that included compensation for all degradations. Registration errors in SPECT-SPECT and SPECT-MRI registration were evaluated by comparing the theoretical and actual transforms. Patient studies with well-localized epilepsy were also included in the registration assessment. Global registration errors, including SPECT-SPECT and SPECT-MRI registration errors, were less than 1.2 mm on average and in no case exceeded the voxel size (3.32 mm) of the SPECT studies. Although images reconstructed using OSEM led to lower registration errors than images reconstructed with FBP, the differences between OSEM and FBP reconstruction were less than 0.2 mm on average. This indicates that correction for degradations does not play a major role in the SISCOM process, thereby facilitating the application of the methodology in centers where OSEM is not implemented with correction of all degradations. These findings, together with those obtained by clinicians from patients via MRI, interictal and ictal SPECT and video-EEG, show that FocusDET is a robust application for performing SISCOM analysis in clinical practice.
Abstract:
Markerless video-based human pose estimation algorithms face a high-dimensional problem that is frequently broken down into several lower-dimensional ones by estimating the pose of each limb separately. However, in order to do so they need to reliably locate the torso, for which they typically rely on time coherence and tracking algorithms. Losing track usually results in catastrophic failure of the process, requiring human intervention and thus precluding their use in real-time applications. We propose a very fast rough pose estimation scheme based on global shape descriptors built on 3D Zernike moments. Using an articulated model that we configure in many poses, a large database of descriptor/pose pairs can be computed off-line. Thus, the only steps that must be done on-line are the extraction of the descriptors for each input volume and a search against the database to get the most likely poses. While the result of such a process is not a fine pose estimation, it can help more sophisticated algorithms to regain track or make more educated guesses when creating new particles in particle-filter-based tracking schemes. We have achieved a performance of about ten fps on a single computer using a database of about one million entries.
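The on-line lookup step described in this abstract can be sketched as a nearest-neighbour search over precomputed descriptors. The example below uses random vectors as stand-ins for 3D Zernike descriptors, a brute-force Euclidean search, and a hypothetical function name `nearest_poses`; the descriptor length, database size, distance measure, and search structure used by the authors are not stated here and are assumptions.

```python
import numpy as np

def nearest_poses(query, descriptors, k=3):
    """Return indices and distances of the k database entries closest to query."""
    dists = np.linalg.norm(descriptors - query, axis=1)
    order = np.argsort(dists)[:k]
    return order, dists[order]

rng = np.random.default_rng(0)
# Off-line stage (stand-in): one descriptor per stored pose of the model.
descriptors = rng.normal(size=(1000, 32))
# On-line stage: a noisy observation of the descriptor of entry 123.
query = descriptors[123] + 0.01 * rng.normal(size=32)

idx, dist = nearest_poses(query, descriptors)
print(idx, dist)
```

Returning several candidates instead of a single best match fits the use the abstract suggests, namely seeding new particles in a particle-filter-based tracker.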
Abstract:
This work presents an approach that uses multisensor remote sensing techniques to recognize potential remains and recreate the original landscape of three archaeological sites. We investigate the spectral characteristics of reflectance and emissivity in the pattern recognition of archaeological materials in several hyperspectral scenes of the prehispanic site in Palmar Sur (Costa Rica), the Jarama Valley site and the Celtiberian city of Segeda in Spain. Spectral ranges of the visible-near infrared (VNIR), shortwave infrared (SWIR) and thermal infrared (TIR) from hyperspectral data cubes of HyMAP, AHS, MASTER and ATM have been used. Several experiments on natural scenarios of different complexity in Costa Rica and Spain have been designed. Spectral patterns and thermal anomalies have been calculated as evidence of buried remains and for change detection. First results, land cover change analyses and their consequences for digital heritage registration are discussed.