993 resultados para visual odometry algorithms


Relevância:

80.00% 80.00%

Publicador:

Resumo:

La percepción de profundidad se hace imprescindible en muchas tareas de manipulación, control visual y navegación de robots. Las cámaras de tiempo de vuelo (ToF: Time of Flight) generan imágenes de rango que proporcionan medidas de profundidad en tiempo real. No obstante, el parámetro distancia que calculan estas cámaras es fuertemente dependiente del tiempo de integración que se configura en el sensor y de la frecuencia de modulación empleada por el sistema de iluminación que integran. En este artículo, se presenta una metodología para el ajuste adaptativo del tiempo de integración y un análisis experimental del comportamiento de una cámara ToF cuando se modifica la frecuencia de modulación. Este método ha sido probado con éxito en algoritmos de control visual con arquitectura ‘eye-in-hand’ donde el sistema sensorial está compuesto por una cámara ToF. Además, la misma metodología puede ser aplicada en otros escenarios de trabajo.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Registration of point clouds captured by depth sensors is an important task in 3D reconstruction applications based on computer vision. In many applications with strict performance requirements, the registration should be executed not only with precision, but also in the same frequency as data is acquired by the sensor. This thesis proposes theuse of the pyramidal sparse optical flow algorithm to incrementally register point clouds captured by RGB-D sensors (e.g. Microsoft Kinect) in real time. The accumulated errorinherent to the process is posteriorly minimized by utilizing a marker and pose graph optimization. Experimental results gathered by processing several RGB-D datasets validatethe system proposed by this thesis in visual odometry and simultaneous localization and mapping (SLAM) applications.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

With the world of professional sports shifting towards employing better sport analytics, the demand for vision-based performance analysis is growing increasingly in recent years. In addition, the nature of many sports does not allow the use of any kind of sensors or other wearable markers attached to players for monitoring their performances during competitions. This provides a potential application of systematic observations such as tracking information of the players to help coaches to develop their visual skills and perceptual awareness needed to make decisions about team strategy or training plans. My PhD project is part of a bigger ongoing project between sport scientists and computer scientists involving also industry partners and sports organisations. The overall idea is to investigate the contribution technology can make to the analysis of sports performance on the example of team sports such as rugby, football or hockey. A particular focus is on vision-based tracking, so that information about the location and dynamics of the players can be gained without any additional sensors on the players. To start with, prior approaches on visual tracking are extensively reviewed and analysed. In this thesis, methods to deal with the difficulties in visual tracking to handle the target appearance changes caused by intrinsic (e.g. pose variation) and extrinsic factors, such as occlusion, are proposed. This analysis highlights the importance of the proposed visual tracking algorithms, which reflect these challenges and suggest robust and accurate frameworks to estimate the target state in a complex tracking scenario such as a sports scene, thereby facilitating the tracking process. Next, a framework for continuously tracking multiple targets is proposed. Compared to single target tracking, multi-target tracking such as tracking the players on a sports field, poses additional difficulties, namely data association, which needs to be addressed. Here, the aim is to locate all targets of interest, inferring their trajectories and deciding which observation corresponds to which target trajectory is. In this thesis, an efficient framework is proposed to handle this particular problem, especially in sport scenes, where the players of the same team tend to look similar and exhibit complex interactions and unpredictable movements resulting in matching ambiguity between the players. The presented approach is also evaluated on different sports datasets and shows promising results. Finally, information from the proposed tracking system is utilised as the basic input for further higher level performance analysis such as tactics and team formations, which can help coaches to design a better training plan. Due to the continuous nature of many team sports (e.g. soccer, hockey), it is not straightforward to infer the high-level team behaviours, such as players’ interaction. The proposed framework relies on two distinct levels of performance analysis: low-level performance analysis, such as identifying players positions on the play field, as well as a high-level analysis, where the aim is to estimate the density of player locations or detecting their possible interaction group. The related experiments show the proposed approach can effectively explore this high-level information, which has many potential applications.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We introduce a flexible visual data mining framework which combines advanced projection algorithms from the machine learning domain and visual techniques developed in the information visualization domain. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection algorithms, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates and billboarding, to provide a visual data mining framework. Results on a real-life chemoinformatics dataset using GTM are promising and have been analytically compared with the results from the traditional projection methods. It is also shown that the HGTM algorithm provides additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework. Copyright 2006 ACM.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Diabetic Retinopathy (DR) is a complication of diabetes that can lead to blindness if not readily discovered. Automated screening algorithms have the potential to improve identification of patients who need further medical attention. However, the identification of lesions must be accurate to be useful for clinical application. The bag-of-visual-words (BoVW) algorithm employs a maximum-margin classifier in a flexible framework that is able to detect the most common DR-related lesions such as microaneurysms, cotton-wool spots and hard exudates. BoVW allows to bypass the need for pre- and post-processing of the retinographic images, as well as the need of specific ad hoc techniques for identification of each type of lesion. An extensive evaluation of the BoVW model, using three large retinograph datasets (DR1, DR2 and Messidor) with different resolution and collected by different healthcare personnel, was performed. The results demonstrate that the BoVW classification approach can identify different lesions within an image without having to utilize different algorithms for each lesion reducing processing time and providing a more flexible diagnostic system. Our BoVW scheme is based on sparse low-level feature detection with a Speeded-Up Robust Features (SURF) local descriptor, and mid-level features based on semi-soft coding with max pooling. The best BoVW representation for retinal image classification was an area under the receiver operating characteristic curve (AUC-ROC) of 97.8% (exudates) and 93.5% (red lesions), applying a cross-dataset validation protocol. To assess the accuracy for detecting cases that require referral within one year, the sparse extraction technique associated with semi-soft coding and max pooling obtained an AUC of 94.2 ± 2.0%, outperforming current methods. Those results indicate that, for retinal image classification tasks in clinical practice, BoVW is equal and, in some instances, surpasses results obtained using dense detection (widely believed to be the best choice in many vision problems) for the low-level descriptors.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper delineates the development of a prototype hybrid knowledge-based system for the optimum design of liquid retaining structures by coupling the blackboard architecture, an expert system shell VISUAL RULE STUDIO and genetic algorithm (GA). Through custom-built interactive graphical user interfaces under a user-friendly environment, the user is directed throughout the design process, which includes preliminary design, load specification, model generation, finite element analysis, code compliance checking, and member sizing optimization. For structural optimization, GA is applied to the minimum cost design of structural systems with discrete reinforced concrete sections. The design of a typical example of the liquid retaining structure is illustrated. The results demonstrate extraordinarily converging speed as near-optimal solutions are acquired after merely exploration of a small portion of the search space. This system can act as a consultant to assist novice designers in the design of liquid retaining structures.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Introduction: Image resizing is a normal feature incorporated into the Nuclear Medicine digital imaging. Upsampling is done by manufacturers to adequately fit more the acquired images on the display screen and it is applied when there is a need to increase - or decrease - the total number of pixels. This paper pretends to compare the “hqnx” and the “nxSaI” magnification algorithms with two interpolation algorithms – “nearest neighbor” and “bicubic interpolation” – in the image upsampling operations. Material and Methods: Three distinct Nuclear Medicine images were enlarged 2 and 4 times with the different digital image resizing algorithms (nearest neighbor, bicubic interpolation nxSaI and hqnx). To evaluate the pixel’s changes between the different output images, 3D whole image plot profiles and surface plots were used as an addition to the visual approach in the 4x upsampled images. Results: In the 2x enlarged images the visual differences were not so noteworthy. Although, it was clearly noticed that bicubic interpolation presented the best results. In the 4x enlarged images the differences were significant, with the bicubic interpolated images presenting the best results. Hqnx resized images presented better quality than 4xSaI and nearest neighbor interpolated images, however, its intense “halo effect” affects greatly the definition and boundaries of the image contents. Conclusion: The hqnx and the nxSaI algorithms were designed for images with clear edges and so its use in Nuclear Medicine images is obviously inadequate. Bicubic interpolation seems, from the algorithms studied, the most suitable and its each day wider applications seem to show it, being assumed as a multi-image type efficient algorithm.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Este trabalho visa contribuir para o desenvolvimento de um sistema de visão multi-câmara para determinação da localização, atitude e seguimento de múltiplos objectos, para ser utilizado na unidade de robótica do INESCTEC, e resulta da necessidade de ter informação externa exacta que sirva de referência no estudo, caracterização e desenvolvimento de algoritmos de localização, navegação e controlo de vários sistemas autónomos. Com base na caracterização dos veículos autónomos existentes na unidade de robótica do INESCTEC e na análise dos seus cenários de operação, foi efectuado o levantamento de requisitos para o sistema a desenvolver. Foram estudados os fundamentos teóricos, necessários ao desenvolvimento do sistema, em temas relacionados com visão computacional, métodos de estimação e associação de dados para problemas de seguimento de múltiplos objectos . Foi proposta uma arquitectura para o sistema global que endereça os vários requisitos identi cados, permitindo a utilização de múltiplas câmaras e suportando o seguimento de múltiplos objectos, com ou sem marcadores. Foram implementados e validados componentes da arquitectura proposta e integrados num sistema para validação, focando na localização e seguimento de múltiplos objectos com marcadores luminosos à base de Light-Emitting Diodes (LEDs). Nomeadamente, os módulos para a identi cação dos pontos de interesse na imagem, técnicas para agrupar os vários pontos de interesse de cada objecto e efectuar a correspondência das medidas obtidas pelas várias câmaras, método para a determinação da posição e atitude dos objectos, ltro para seguimento de múltiplos objectos. Foram realizados testes para validação e a nação do sistema implementado que demonstram que a solução encontrada vai de encontro aos requisitos, e foram identi cadas as linhas de trabalho para a continuação do desenvolvimento do sistema global.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Os sistemas de perceção existentes nos robôs autónomos, hoje em dia, são bastante complexos. A informação dos vários sensores, existentes em diferentes partes do robôs, necessitam de estar relacionados entre si face ao referencial do robô ou do mundo. Para isso, o conhecimento da atitude (posição e rotação) entre os referenciais dos sensores e o referencial do robô é um fator critico para o desempenho do mesmo. O processo de calibração dessas posições e translações é chamado calibração dos parâmetros extrínsecos. Esta dissertação propõe o desenvolvimento de um método de calibração autónomo para robôs como câmaras direcionais, como é o caso dos robôs da equipa ISePorto. A solução proposta consiste na aquisição de dados da visão, giroscópio e odometria durante uma manobra efetuada pelo robô em torno de um alvo com um padrão conhecido. Esta informação é então processada em conjunto através de um Extended Kalman Filter (EKF) onde são estimados necessários para relacionar os sensores existentes no robô em relação ao referencial do mesmo. Esta solução foi avaliada com recurso a vários testes e os resultados obtidos foram bastante similares aos obtidos pelo método manual, anteriormente utilizado, com um aumento significativo em rapidez e consistência.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The underground scenarios are one of the most challenging environments for accurate and precise 3d mapping where hostile conditions like absence of Global Positioning Systems, extreme lighting variations and geometrically smooth surfaces may be expected. So far, the state-of-the-art methods in underground modelling remain restricted to environments in which pronounced geometric features are abundant. This limitation is a consequence of the scan matching algorithms used to solve the localization and registration problems. This paper contributes to the expansion of the modelling capabilities to structures characterized by uniform geometry and smooth surfaces, as is the case of road and train tunnels. To achieve that, we combine some state of the art techniques from mobile robotics, and propose a method for 6DOF platform positioning in such scenarios, that is latter used for the environment modelling. A visual monocular Simultaneous Localization and Mapping (MonoSLAM) approach based on the Extended Kalman Filter (EKF), complemented by the introduction of inertial measurements in the prediction step, allows our system to localize himself over long distances, using exclusively sensors carried on board a mobile platform. By feeding the Extended Kalman Filter with inertial data we were able to overcome the major problem related with MonoSLAM implementations, known as scale factor ambiguity. Despite extreme lighting variations, reliable visual features were extracted through the SIFT algorithm, and inserted directly in the EKF mechanism according to the Inverse Depth Parametrization. Through the 1-Point RANSAC (Random Sample Consensus) wrong frame-to-frame feature matches were rejected. The developed method was tested based on a dataset acquired inside a road tunnel and the navigation results compared with a ground truth obtained by post-processing a high grade Inertial Navigation System and L1/L2 RTK-GPS measurements acquired outside the tunnel. Results from the localization strategy are presented and analyzed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Biomédica

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We analyzed the coherence of electroencephalographic (EEG) signals recorded symmetrically from the two hemispheres, while subjects (n = 9) were viewing visual stimuli. Considering the many common features of the callosal connectivity in mammals, we expected that, as in our animal studies, interhemispheric coherence (ICoh) would increase only with bilateral iso-oriented gratings located close to the vertical meridian of the visual field, or extending across it. Indeed, a single grating that extended across the vertical meridian significantly increased the EEG ICoh in normal adult subjects. These ICoh responses were obtained from occipital and parietal derivations and were restricted to the gamma frequency band. They were detectable with different EEG references and were robust across and within subjects. Other unilateral and bilateral stimuli, including identical gratings that were effective in anesthetized animals, did not affect ICoh in humans. This fact suggests the existence of regulatory influences, possibly of a top-down kind, on the pattern of callosal activation in conscious human subjects. In addition to establishing the validity of EEG coherence analysis for assaying cortico-cortical connectivity, this study extends to the human brain the finding that visual stimuli cause interhemispheric synchronization, particularly in frequencies of the gamma band. It also indicates that the synchronization is carried out by cortico-cortical connection and suggests similarities in the organization of visual callosal connections in animals and in man.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

When underwater vehicles navigate close to the ocean floor, computer vision techniques can be applied to obtain motion estimates. A complete system to create visual mosaics of the seabed is described in this paper. Unfortunately, the accuracy of the constructed mosaic is difficult to evaluate. The use of a laboratory setup to obtain an accurate error measurement is proposed. The system consists on a robot arm carrying a downward looking camera. A pattern formed by a white background and a matrix of black dots uniformly distributed along the surveyed scene is used to find the exact image registration parameters. When the robot executes a trajectory (simulating the motion of a submersible), an image sequence is acquired by the camera. The estimated motion computed from the encoders of the robot is refined by detecting, to subpixel accuracy, the black dots of the image sequence, and computing the 2D projective transform which relates two consecutive images. The pattern is then substituted by a poster of the sea floor and the trajectory is executed again, acquiring the image sequence used to test the accuracy of the mosaicking system

Relevância:

30.00% 30.00%

Publicador:

Resumo:

BACKGROUND: In the context of population aging, visual impairment has emerged as a growing concern in public health. However, there is a need for further research into the relationship between visual impairment and chronic medical conditions in the elderly. The aim of our study was to examine the relationship between visual impairment and three main types of co-morbidity: chronic physical conditions (both at an independent and additive level), mental health and cognitive functioning. METHODS: Data were collected from the COURAGE in Europe project, a cross-sectional study. A total of 4,583 participants from Spain were included. Diagnosis of chronic medical conditions included self-reported medical diagnosis and symptomatic algorithms. Depression and anxiety were assessed using CIDI algorithms. Visual assessment included objective distance/near visual acuity and subjective visual performance. Descriptive analyses included the whole sample (n = 4,583). Statistical analyses included participants aged over 50 years (n = 3,625; mean age = 66.45 years) since they have a significant prevalence of chronic conditions and visual impairment. Crude and adjusted binary logistic regressions were performed to identify independent associations between visual impairment and chronic medical conditions, physical multimorbidity and mental conditions. Covariates included age, gender, marital status, education level, employment status and urbanicity. RESULTS: The number of chronic physical conditions was found to be associated with poorer results in both distance and near visual acuity [OR 1.75 (CI 1.38-2.23); OR 1.69 (CI 1.27-2.24)]. At an independent level, arthritis, stroke and diabetes were associated with poorer distance visual acuity results after adjusting for covariates [OR 1.79 (CI 1.46-2.21); OR 1.59 (CI 1.05-2.42); OR 1.27 (1.01-1.60)]. Only stroke was associated with near visual impairment [OR 3.01 (CI 1.86-4.87)]. With regard to mental health, poor subjective visual acuity was associated with depression [OR 1.61 (CI 1.14-2.27); OR 1.48 (CI 1.03-2.13)]. Both objective and subjective poor distance and near visual acuity were associated with worse cognitive functioning. CONCLUSIONS: Arthritis, stroke and the co-occurrence of various chronic physical diseases are associated with higher prevalence of visual impairment. Visual impairment is associated with higher prevalence of depression and poorer cognitive function results. There is a need to implement patient-centered care involving special visual assessment in these cases.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model, where ridges of the density estimated from the data are considered as relevant features. Finding ridges, that are generalized maxima, necessitates development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications, where most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data are also demonstrated. Other contributions of the thesis include development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space. The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but has also potential applications in graph theory and various areas of physics, chemistry and engineering. Asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.