932 resultados para Vision-based
Resumo:
This paper describes the improvements achieved in our mosaicking system to assist unmanned underwater vehicle navigation. A major advance has been attained in the processing of images of the ocean floor when light absorption effects are evident. Due to the absorption of natural light, underwater vehicles often require artificial light sources attached to them to provide the adequate illumination for processing underwater images. Unfortunately, these flashlights tend to illuminate the scene in a nonuniform fashion. In this paper a technique to correct non-uniform lighting is proposed. The acquired frames are compensated through a point-by-point division of the image by an estimation of the illumination field. Then, the gray-levels of the obtained image remapped to enhance image contrast. Experiments with real images are presented
Resumo:
This paper deals with the problem of navigation for an unmanned underwater vehicle (UUV) through image mosaicking. It represents a first step towards a real-time vision-based navigation system for a small-class low-cost UUV. We propose a navigation system composed by: (i) an image mosaicking module which provides velocity estimates; and (ii) an extended Kalman filter based on the hydrodynamic equation of motion, previously identified for this particular UUV. The obtained system is able to estimate the position and velocity of the robot. Moreover, it is able to deal with visual occlusions that usually appear when the sea bottom does not have enough visual features to solve the correspondence problem in a certain area of the trajectory
Resumo:
[EN]This paper describes a low-cost system that allows the user to visualize different glasses models in live video. The user can also move the glasses to adjust its position on the face. The system, which runs at 9.5 frames/s on general-purpose hardware, has a homeostatic module that keeps image parameters controlled. This is achieved by using a camera with motorized zoom, iris, white balance, etc. This feature can be specially useful in environments with changing illumination and shadows, like in an optical shop. The system also includes a face and eye detection module and a glasses management module.
Resumo:
Background: Individuals with type 1 diabetes (T1D) have to count the carbohydrates (CHOs) of their meal to estimate the prandial insulin dose needed to compensate for the meal’s effect on blood glucose levels. CHO counting is very challenging but also crucial, since an error of 20 grams can substantially impair postprandial control. Method: The GoCARB system is a smartphone application designed to support T1D patients with CHO counting of nonpacked foods. In a typical scenario, the user places a reference card next to the dish and acquires 2 images with his/her smartphone. From these images, the plate is detected and the different food items on the plate are automatically segmented and recognized, while their 3D shape is reconstructed. Finally, the food volumes are calculated and the CHO content is estimated by combining the previous results and using the USDA nutritional database. Results: To evaluate the proposed system, a set of 24 multi-food dishes was used. For each dish, 3 pairs of images were taken and for each pair, the system was applied 4 times. The mean absolute percentage error in CHO estimation was 10 ± 12%, which led to a mean absolute error of 6 ± 8 CHO grams for normal-sized dishes. Conclusion: The laboratory experiments demonstrated the feasibility of the GoCARB prototype system since the error was below the initial goal of 20 grams. However, further improvements and evaluation are needed prior launching a system able to meet the inter- and intracultural eating habits.
Resumo:
In retinal surgery, surgeons face difficulties such as indirect visualization of surgical targets, physiological tremor, and lack of tactile feedback, which increase the risk of retinal damage caused by incorrect surgical gestures. In this context, intraocular proximity sensing has the potential to overcome current technical limitations and increase surgical safety. In this paper, we present a system for detecting unintentional collisions between surgical tools and the retina using the visual feedback provided by the opthalmic stereo microscope. Using stereo images, proximity between surgical tools and the retinal surface can be detected when their relative stereo disparity is small. For this purpose, we developed a system comprised of two modules. The first is a module for tracking the surgical tool position on both stereo images. The second is a disparity tracking module for estimating a stereo disparity map of the retinal surface. Both modules were specially tailored for coping with the challenging visualization conditions in retinal surgery. The potential clinical value of the proposed method is demonstrated by extensive testing using a silicon phantom eye and recorded rabbit in vivo data.
Resumo:
En esta tesis se aborda la detección y el seguimiento automático de vehículos mediante técnicas de visión artificial con una cámara monocular embarcada. Este problema ha suscitado un gran interés por parte de la industria automovilística y de la comunidad científica ya que supone el primer paso en aras de la ayuda a la conducción, la prevención de accidentes y, en última instancia, la conducción automática. A pesar de que se le ha dedicado mucho esfuerzo en los últimos años, de momento no se ha encontrado ninguna solución completamente satisfactoria y por lo tanto continúa siendo un tema de investigación abierto. Los principales problemas que plantean la detección y seguimiento mediante visión artificial son la gran variabilidad entre vehículos, un fondo que cambia dinámicamente debido al movimiento de la cámara, y la necesidad de operar en tiempo real. En este contexto, esta tesis propone un marco unificado para la detección y seguimiento de vehículos que afronta los problemas descritos mediante un enfoque estadístico. El marco se compone de tres grandes bloques, i.e., generación de hipótesis, verificación de hipótesis, y seguimiento de vehículos, que se llevan a cabo de manera secuencial. No obstante, se potencia el intercambio de información entre los diferentes bloques con objeto de obtener el máximo grado posible de adaptación a cambios en el entorno y de reducir el coste computacional. Para abordar la primera tarea de generación de hipótesis, se proponen dos métodos complementarios basados respectivamente en el análisis de la apariencia y la geometría de la escena. Para ello resulta especialmente interesante el uso de un dominio transformado en el que se elimina la perspectiva de la imagen original, puesto que este dominio permite una búsqueda rápida dentro de la imagen y por tanto una generación eficiente de hipótesis de localización de los vehículos. Los candidatos finales se obtienen por medio de un marco colaborativo entre el dominio original y el dominio transformado. Para la verificación de hipótesis se adopta un método de aprendizaje supervisado. Así, se evalúan algunos de los métodos de extracción de características más populares y se proponen nuevos descriptores con arreglo al conocimiento de la apariencia de los vehículos. Para evaluar la efectividad en la tarea de clasificación de estos descriptores, y dado que no existen bases de datos públicas que se adapten al problema descrito, se ha generado una nueva base de datos sobre la que se han realizado pruebas masivas. Finalmente, se presenta una metodología para la fusión de los diferentes clasificadores y se plantea una discusión sobre las combinaciones que ofrecen los mejores resultados. El núcleo del marco propuesto está constituido por un método Bayesiano de seguimiento basado en filtros de partículas. Se plantean contribuciones en los tres elementos fundamentales de estos filtros: el algoritmo de inferencia, el modelo dinámico y el modelo de observación. En concreto, se propone el uso de un método de muestreo basado en MCMC que evita el elevado coste computacional de los filtros de partículas tradicionales y por consiguiente permite que el modelado conjunto de múltiples vehículos sea computacionalmente viable. Por otra parte, el dominio transformado mencionado anteriormente permite la definición de un modelo dinámico de velocidad constante ya que se preserva el movimiento suave de los vehículos en autopistas. Por último, se propone un modelo de observación que integra diferentes características. En particular, además de la apariencia de los vehículos, el modelo tiene en cuenta también toda la información recibida de los bloques de procesamiento previos. El método propuesto se ejecuta en tiempo real en un ordenador de propósito general y da unos resultados sobresalientes en comparación con los métodos tradicionales. ABSTRACT This thesis addresses on-road vehicle detection and tracking with a monocular vision system. This problem has attracted the attention of the automotive industry and the research community as it is the first step for driver assistance and collision avoidance systems and for eventual autonomous driving. Although many effort has been devoted to address it in recent years, no satisfactory solution has yet been devised and thus it is an active research issue. The main challenges for vision-based vehicle detection and tracking are the high variability among vehicles, the dynamically changing background due to camera motion and the real-time processing requirement. In this thesis, a unified approach using statistical methods is presented for vehicle detection and tracking that tackles these issues. The approach is divided into three primary tasks, i.e., vehicle hypothesis generation, hypothesis verification, and vehicle tracking, which are performed sequentially. Nevertheless, the exchange of information between processing blocks is fostered so that the maximum degree of adaptation to changes in the environment can be achieved and the computational cost is alleviated. Two complementary strategies are proposed to address the first task, i.e., hypothesis generation, based respectively on appearance and geometry analysis. To this end, the use of a rectified domain in which the perspective is removed from the original image is especially interesting, as it allows for fast image scanning and coarse hypothesis generation. The final vehicle candidates are produced using a collaborative framework between the original and the rectified domains. A supervised classification strategy is adopted for the verification of the hypothesized vehicle locations. In particular, state-of-the-art methods for feature extraction are evaluated and new descriptors are proposed by exploiting the knowledge on vehicle appearance. Due to the lack of appropriate public databases, a new database is generated and the classification performance of the descriptors is extensively tested on it. Finally, a methodology for the fusion of the different classifiers is presented and the best combinations are discussed. The core of the proposed approach is a Bayesian tracking framework using particle filters. Contributions are made on its three key elements: the inference algorithm, the dynamic model and the observation model. In particular, the use of a Markov chain Monte Carlo method is proposed for sampling, which circumvents the exponential complexity increase of traditional particle filters thus making joint multiple vehicle tracking affordable. On the other hand, the aforementioned rectified domain allows for the definition of a constant-velocity dynamic model since it preserves the smooth motion of vehicles in highways. Finally, a multiple-cue observation model is proposed that not only accounts for vehicle appearance but also integrates the available information from the analysis in the previous blocks. The proposed approach is proven to run near real-time in a general purpose PC and to deliver outstanding results compared to traditional methods.
Resumo:
ntelligent systems designed to reduce highway fatalities have been widely applied in the automotive sector in the last decade. Of all users of transport systems, pedestrians are the most vulnerable in crashes as they are unprotected. This paper deals with an autonomous intelligent emergency system designed to avoid collisions with pedestrians. The system consists of a fuzzy controller based on the time-to-collision estimate – obtained via a vision-based system – and the wheel-locking probability – obtained via the vehicle’s CAN bus – that generates a safe braking action. The system has been tested in a real car – a convertible Citroën C3 Pluriel – equipped with an automated electro-hydraulic braking system capable of working in parallel with the vehicle’s original braking circuit. The system is used as a last resort in the case that an unexpected pedestrian is in the lane and all the warnings have failed to produce a response from the driver.
Resumo:
Autonomous aerial refueling is a key enabling technology for both manned and unmanned aircraft where extended flight duration or range are required. The results presented within this paper offer one potential vision-based sensing solution, together with a unique test environment. A hierarchical visual tracking algorithm based on direct methods is proposed and developed for the purposes of tracking a drogue during the capture stage of autonomous aerial refueling, and of estimating its 3D position. Intended to be applied in real time to a video stream from a single monocular camera mounted on the receiver aircraft, the algorithm is shown to be highly robust, and capable of tracking large, rapid drogue motions within the frame of reference. The proposed strategy has been tested using a complex robotic testbed and with actual flight hardware consisting of a full size probe and drogue. Results show that the vision tracking algorithm can detect and track the drogue at real-time frame rates of more than thirty frames per second, obtaining a robust position estimation even with strong motions and multiple occlusions of the drogue.
Resumo:
In this paper, we apply a hierarchical tracking strategy of planar objects (or that can be assumed to be planar) that is based on direct methods for vision-based applications on-board UAVs. The use of this tracking strategy allows to achieve the tasks at real-time frame rates and to overcome problems posed by the challenging conditions of the tasks: e.g. constant vibrations, fast 3D changes, or limited capacity on-board. The vast majority of approaches make use of feature-based methods to track objects. Nonetheless, in this paper we show that although some of these feature-based solutions are faster, direct methods can be more robust under fast 3D motions (fast changes in position), some changes in appearance, constant vibrations (without requiring any specific hardware or software for video stabilization), and situations in which part of the object to track is outside of the field of view of the camera. The performance of the proposed tracking strategy on-board UAVs is evaluated with images from realflight tests using manually-generated ground truth information, accurate position estimation using a Vicon system, and also with simulated data from a simulation environment. Results show that the hierarchical tracking strategy performs better than wellknown feature-based algorithms and well-known configurations of direct methods, and that its performance is robust enough for vision-in-the-loop tasks, e.g. for vision-based landing tasks.
Resumo:
The IARC competitions aim at making the state of the art in UAV progress. The 2014 challenge deals mainly with GPS/Laser denied navigation, Robot-Robot interaction and Obstacle avoidance in the setting of a ground robot herding problem. We present in this paper a drone which will take part in this competition. The platform and hardware it is composed of and the software we designed are introduced. This software has three main components: the visual information acquisition, the mapping algorithm and the Aritificial Intelligence mission planner. A statement of the safety measures integrated in the drone and of our efforts to ensure field testing in conditions as close as possible to the challenge?s is also included.
Resumo:
Aircraft tracking plays a key and important role in the Sense-and-Avoid system of Unmanned Aerial Vehicles (UAVs). This paper presents a novel robust visual tracking algorithm for UAVs in the midair to track an arbitrary aircraft at real-time frame rates, together with a unique evaluation system. This visual algorithm mainly consists of adaptive discriminative visual tracking method, Multiple-Instance (MI) learning approach, Multiple-Classifier (MC) voting mechanism and Multiple-Resolution (MR) representation strategy, that is called Adaptive M3 tracker, i.e. AM3. In this tracker, the importance of test sample has been integrated to improve the tracking stability, accuracy and real-time performances. The experimental results show that this algorithm is more robust, efficient and accurate against the existing state-of-art trackers, overcoming the problems generated by the challenging situations such as obvious appearance change, variant surrounding illumination, partial aircraft occlusion, blur motion, rapid pose variation and onboard mechanical vibration, low computation capacity and delayed information communication between UAVs and Ground Station (GS). To our best knowledge, this is the first work to present this tracker for solving online learning and tracking freewill aircraft/intruder in the UAVs.
Resumo:
The importance of vision-based systems for Sense-and-Avoid is increasing nowadays as remotely piloted and autonomous UAVs become part of the non-segregated airspace. The development and evaluation of these systems demand flight scenario images which are expensive and risky to obtain. Currently Augmented Reality techniques allow the compositing of real flight scenario images with 3D aircraft models to produce useful realistic images for system development and benchmarking purposes at a much lower cost and risk. With the techniques presented in this paper, 3D aircraft models are positioned firstly in a simulated 3D scene with controlled illumination and rendering parameters. Realistic simulated images are then obtained using an image processing algorithm which fuses the images obtained from the 3D scene with images from real UAV flights taking into account on board camera vibrations. Since the intruder and camera poses are user-defined, ground truth data is available. These ground truth annotations allow to develop and quantitatively evaluate aircraft detection and tracking algorithms. This paper presents the software developed to create a public dataset of 24 videos together with their annotations and some tracking application results.
Resumo:
El principal objetivo de este trabajo es proporcionar una solución en tiempo real basada en visión estéreo o monocular precisa y robusta para que un vehículo aéreo no tripulado (UAV) sea autónomo en varios tipos de aplicaciones UAV, especialmente en entornos abarrotados sin señal GPS. Este trabajo principalmente consiste en tres temas de investigación de UAV basados en técnicas de visión por computador: (I) visual tracking, proporciona soluciones efectivas para localizar visualmente objetos de interés estáticos o en movimiento durante el tiempo que dura el vuelo del UAV mediante una aproximación adaptativa online y una estrategia de múltiple resolución, de este modo superamos los problemas generados por las diferentes situaciones desafiantes, tales como cambios significativos de aspecto, iluminación del entorno variante, fondo del tracking embarullado, oclusión parcial o total de objetos, variaciones rápidas de posición y vibraciones mecánicas a bordo. La solución ha sido utilizada en aterrizajes autónomos, inspección de plataformas mar adentro o tracking de aviones en pleno vuelo para su detección y evasión; (II) odometría visual: proporciona una solución eficiente al UAV para estimar la posición con 6 grados de libertad (6D) usando únicamente la entrada de una cámara estéreo a bordo del UAV. Un método Semi-Global Blocking Matching (SGBM) eficiente basado en una estrategia grueso-a-fino ha sido implementada para una rápida y profunda estimación del plano. Además, la solución toma provecho eficazmente de la información 2D y 3D para estimar la posición 6D, resolviendo de esta manera la limitación de un punto de referencia fijo en la cámara estéreo. Una robusta aproximación volumétrica de mapping basada en el framework Octomap ha sido utilizada para reconstruir entornos cerrados y al aire libre bastante abarrotados en 3D con memoria y errores correlacionados espacialmente o temporalmente; (III) visual control, ofrece soluciones de control prácticas para la navegación de un UAV usando Fuzzy Logic Controller (FLC) con la estimación visual. Y el framework de Cross-Entropy Optimization (CEO) ha sido usado para optimizar el factor de escala y la función de pertenencia en FLC. Todas las soluciones basadas en visión en este trabajo han sido probadas en test reales. Y los conjuntos de datos de imágenes reales grabados en estos test o disponibles para la comunidad pública han sido utilizados para evaluar el rendimiento de estas soluciones basadas en visión con ground truth. Además, las soluciones de visión presentadas han sido comparadas con algoritmos de visión del estado del arte. Los test reales y los resultados de evaluación muestran que las soluciones basadas en visión proporcionadas han obtenido rendimientos en tiempo real precisos y robustos, o han alcanzado un mejor rendimiento que aquellos algoritmos del estado del arte. La estimación basada en visión ha ganado un rol muy importante en controlar un UAV típico para alcanzar autonomía en aplicaciones UAV. ABSTRACT The main objective of this dissertation is providing real-time accurate robust monocular or stereo vision-based solution for Unmanned Aerial Vehicle (UAV) to achieve the autonomy in various types of UAV applications, especially in GPS-denied dynamic cluttered environments. This dissertation mainly consists of three UAV research topics based on computer vision technique: (I) visual tracking, it supplys effective solutions to visually locate interesting static or moving object over time during UAV flight with on-line adaptivity approach and multiple-resolution strategy, thereby overcoming the problems generated by the different challenging situations, such as significant appearance change, variant surrounding illumination, cluttered tracking background, partial or full object occlusion, rapid pose variation and onboard mechanical vibration. The solutions have been utilized in autonomous landing, offshore floating platform inspection and midair aircraft tracking for sense-and-avoid; (II) visual odometry: it provides the efficient solution for UAV to estimate the 6 Degree-of-freedom (6D) pose using only the input of stereo camera onboard UAV. An efficient Semi-Global Blocking Matching (SGBM) method based on a coarse-to-fine strategy has been implemented for fast depth map estimation. In addition, the solution effectively takes advantage of both 2D and 3D information to estimate the 6D pose, thereby solving the limitation of a fixed small baseline in the stereo camera. A robust volumetric occupancy mapping approach based on the Octomap framework has been utilized to reconstruct indoor and outdoor large-scale cluttered environments in 3D with less temporally or spatially correlated measurement errors and memory; (III) visual control, it offers practical control solutions to navigate UAV using Fuzzy Logic Controller (FLC) with the visual estimation. And the Cross-Entropy Optimization (CEO) framework has been used to optimize the scaling factor and the membership function in FLC. All the vision-based solutions in this dissertation have been tested in real tests. And the real image datasets recorded from these tests or available from public community have been utilized to evaluate the performance of these vision-based solutions with ground truth. Additionally, the presented vision solutions have compared with the state-of-art visual algorithms. Real tests and evaluation results show that the provided vision-based solutions have obtained real-time accurate robust performances, or gained better performance than those state-of-art visual algorithms. The vision-based estimation has played a critically important role for controlling a typical UAV to achieve autonomy in the UAV application.
Resumo:
This paper presents a completely autonomous solution to participate in the Indoor Challenge of the 2013 International Micro Air Vehicle Competition (IMAV 2013). Our proposal is a multi-robot system with no centralized coordination whose robotic agents share their position estimates. The capability of each agent to navigate avoiding collisions is a consequence of the resulting emergent behavior. Each agent consists of a ground station running an instance of the proposed architecture that communicates over WiFi with an AR Drone 2.0 quadrotor. Visual markers are employed to sense and map obstacles and to improve the pose estimation based on Inertial Measurement Unit (IMU) and ground optical flow data. Based on our architecture, each robotic agent can navigate avoiding obstacles and other members of the multi-robot system. The solution is demonstrated and the achieved navigation performance is evaluated by means of experimental flights. This work also analyzes the capabilities of the presented solution in simulated flights of the IMAV 2013 Indoor Challenge. The performance of the CVG UPM team was awarded with the First Prize in the Indoor Autonomy Challenge of the IMAV 2013 competition.