15 resultados para Monocular Vision.
em Universidad Politécnica de Madrid
Resumo:
En esta tesis se aborda la detección y el seguimiento automático de vehículos mediante técnicas de visión artificial con una cámara monocular embarcada. Este problema ha suscitado un gran interés por parte de la industria automovilística y de la comunidad científica ya que supone el primer paso en aras de la ayuda a la conducción, la prevención de accidentes y, en última instancia, la conducción automática. A pesar de que se le ha dedicado mucho esfuerzo en los últimos años, de momento no se ha encontrado ninguna solución completamente satisfactoria y por lo tanto continúa siendo un tema de investigación abierto. Los principales problemas que plantean la detección y seguimiento mediante visión artificial son la gran variabilidad entre vehículos, un fondo que cambia dinámicamente debido al movimiento de la cámara, y la necesidad de operar en tiempo real. En este contexto, esta tesis propone un marco unificado para la detección y seguimiento de vehículos que afronta los problemas descritos mediante un enfoque estadístico. El marco se compone de tres grandes bloques, i.e., generación de hipótesis, verificación de hipótesis, y seguimiento de vehículos, que se llevan a cabo de manera secuencial. No obstante, se potencia el intercambio de información entre los diferentes bloques con objeto de obtener el máximo grado posible de adaptación a cambios en el entorno y de reducir el coste computacional. Para abordar la primera tarea de generación de hipótesis, se proponen dos métodos complementarios basados respectivamente en el análisis de la apariencia y la geometría de la escena. Para ello resulta especialmente interesante el uso de un dominio transformado en el que se elimina la perspectiva de la imagen original, puesto que este dominio permite una búsqueda rápida dentro de la imagen y por tanto una generación eficiente de hipótesis de localización de los vehículos. Los candidatos finales se obtienen por medio de un marco colaborativo entre el dominio original y el dominio transformado. Para la verificación de hipótesis se adopta un método de aprendizaje supervisado. Así, se evalúan algunos de los métodos de extracción de características más populares y se proponen nuevos descriptores con arreglo al conocimiento de la apariencia de los vehículos. Para evaluar la efectividad en la tarea de clasificación de estos descriptores, y dado que no existen bases de datos públicas que se adapten al problema descrito, se ha generado una nueva base de datos sobre la que se han realizado pruebas masivas. Finalmente, se presenta una metodología para la fusión de los diferentes clasificadores y se plantea una discusión sobre las combinaciones que ofrecen los mejores resultados. El núcleo del marco propuesto está constituido por un método Bayesiano de seguimiento basado en filtros de partículas. Se plantean contribuciones en los tres elementos fundamentales de estos filtros: el algoritmo de inferencia, el modelo dinámico y el modelo de observación. En concreto, se propone el uso de un método de muestreo basado en MCMC que evita el elevado coste computacional de los filtros de partículas tradicionales y por consiguiente permite que el modelado conjunto de múltiples vehículos sea computacionalmente viable. Por otra parte, el dominio transformado mencionado anteriormente permite la definición de un modelo dinámico de velocidad constante ya que se preserva el movimiento suave de los vehículos en autopistas. Por último, se propone un modelo de observación que integra diferentes características. En particular, además de la apariencia de los vehículos, el modelo tiene en cuenta también toda la información recibida de los bloques de procesamiento previos. El método propuesto se ejecuta en tiempo real en un ordenador de propósito general y da unos resultados sobresalientes en comparación con los métodos tradicionales. ABSTRACT This thesis addresses on-road vehicle detection and tracking with a monocular vision system. This problem has attracted the attention of the automotive industry and the research community as it is the first step for driver assistance and collision avoidance systems and for eventual autonomous driving. Although many effort has been devoted to address it in recent years, no satisfactory solution has yet been devised and thus it is an active research issue. The main challenges for vision-based vehicle detection and tracking are the high variability among vehicles, the dynamically changing background due to camera motion and the real-time processing requirement. In this thesis, a unified approach using statistical methods is presented for vehicle detection and tracking that tackles these issues. The approach is divided into three primary tasks, i.e., vehicle hypothesis generation, hypothesis verification, and vehicle tracking, which are performed sequentially. Nevertheless, the exchange of information between processing blocks is fostered so that the maximum degree of adaptation to changes in the environment can be achieved and the computational cost is alleviated. Two complementary strategies are proposed to address the first task, i.e., hypothesis generation, based respectively on appearance and geometry analysis. To this end, the use of a rectified domain in which the perspective is removed from the original image is especially interesting, as it allows for fast image scanning and coarse hypothesis generation. The final vehicle candidates are produced using a collaborative framework between the original and the rectified domains. A supervised classification strategy is adopted for the verification of the hypothesized vehicle locations. In particular, state-of-the-art methods for feature extraction are evaluated and new descriptors are proposed by exploiting the knowledge on vehicle appearance. Due to the lack of appropriate public databases, a new database is generated and the classification performance of the descriptors is extensively tested on it. Finally, a methodology for the fusion of the different classifiers is presented and the best combinations are discussed. The core of the proposed approach is a Bayesian tracking framework using particle filters. Contributions are made on its three key elements: the inference algorithm, the dynamic model and the observation model. In particular, the use of a Markov chain Monte Carlo method is proposed for sampling, which circumvents the exponential complexity increase of traditional particle filters thus making joint multiple vehicle tracking affordable. On the other hand, the aforementioned rectified domain allows for the definition of a constant-velocity dynamic model since it preserves the smooth motion of vehicles in highways. Finally, a multiple-cue observation model is proposed that not only accounts for vehicle appearance but also integrates the available information from the analysis in the previous blocks. The proposed approach is proven to run near real-time in a general purpose PC and to deliver outstanding results compared to traditional methods.
Resumo:
This work aims to develop a novel Cross-Entropy (CE) optimization-based fuzzy controller for Unmanned Aerial Monocular Vision-IMU System (UAMVIS) to solve the seeand- avoid problem using its accurate autonomous localization information. The function of this fuzzy controller is regulating the heading of this system to avoid the obstacle, e.g. wall. In the Matlab Simulink-based training stages, the Scaling Factor (SF) is adjusted according to the specified task firstly, and then the Membership Function (MF) is tuned based on the optimized Scaling Factor to further improve the collison avoidance performance. After obtained the optimal SF and MF, 64% of rules has been reduced (from 125 rules to 45 rules), and a large number of real flight tests with a quadcopter have been done. The experimental results show that this approach precisely navigates the system to avoid the obstacle. To our best knowledge, this is the first work to present the optimized fuzzy controller for UAMVIS using Cross-Entropy method in Scaling Factors and Membership Functions optimization.
Resumo:
This work aims to develop a novel Cross-Entropy (CE) optimization-based fuzzy controller for Unmanned Aerial Monocular Vision-IMU System (UAMVIS) to solve the seeand-avoid problem using its accurate autonomous localization information. The function of this fuzzy controller is regulating the heading of this system to avoid the obstacle, e.g. wall. In the Matlab Simulink-based training stages, the Scaling Factor (SF) is adjusted according to the specified task firstly, and then the Membership Function (MF) is tuned based on the optimized Scaling Factor to further improve the collison avoidance performance. After obtained the optimal SF and MF, 64% of rules has been reduced (from 125 rules to 45 rules), and a large number of real flight tests with a quadcopter have been done. The experimental results show that this approach precisely navigates the system to avoid the obstacle. To our best knowledge, this is the first work to present the optimized fuzzy controller for UAMVIS using Cross-Entropy method in Scaling Factors and Membership Functions optimization.
Resumo:
Autonomous aerial refueling is a key enabling technology for both manned and unmanned aircraft where extended flight duration or range are required. The results presented within this paper offer one potential vision-based sensing solution, together with a unique test environment. A hierarchical visual tracking algorithm based on direct methods is proposed and developed for the purposes of tracking a drogue during the capture stage of autonomous aerial refueling, and of estimating its 3D position. Intended to be applied in real time to a video stream from a single monocular camera mounted on the receiver aircraft, the algorithm is shown to be highly robust, and capable of tracking large, rapid drogue motions within the frame of reference. The proposed strategy has been tested using a complex robotic testbed and with actual flight hardware consisting of a full size probe and drogue. Results show that the vision tracking algorithm can detect and track the drogue at real-time frame rates of more than thirty frames per second, obtaining a robust position estimation even with strong motions and multiple occlusions of the drogue.
Resumo:
In this paper we propose an innovative method for the automatic detection and tracking of road traffic signs using an onboard stereo camera. It involves a combination of monocular and stereo analysis strategies to increase the reliability of the detections such that it can boost the performance of any traffic sign recognition scheme. Firstly, an adaptive color and appearance based detection is applied at single camera level to generate a set of traffic sign hypotheses. In turn, stereo information allows for sparse 3D reconstruction of potential traffic signs through a SURF-based matching strategy. Namely, the plane that best fits the cloud of 3D points traced back from feature matches is estimated using a RANSAC based approach to improve robustness to outliers. Temporal consistency of the 3D information is ensured through a Kalman-based tracking stage. This also allows for the generation of a predicted 3D traffic sign model, which is in turn used to enhance the previously mentioned color-based detector through a feedback loop, thus improving detection accuracy. The proposed solution has been tested with real sequences under several illumination conditions and in both urban areas and highways, achieving very high detection rates in challenging environments, including rapid motion and significant perspective distortion
Resumo:
El principal objetivo de este trabajo es proporcionar una solución en tiempo real basada en visión estéreo o monocular precisa y robusta para que un vehículo aéreo no tripulado (UAV) sea autónomo en varios tipos de aplicaciones UAV, especialmente en entornos abarrotados sin señal GPS. Este trabajo principalmente consiste en tres temas de investigación de UAV basados en técnicas de visión por computador: (I) visual tracking, proporciona soluciones efectivas para localizar visualmente objetos de interés estáticos o en movimiento durante el tiempo que dura el vuelo del UAV mediante una aproximación adaptativa online y una estrategia de múltiple resolución, de este modo superamos los problemas generados por las diferentes situaciones desafiantes, tales como cambios significativos de aspecto, iluminación del entorno variante, fondo del tracking embarullado, oclusión parcial o total de objetos, variaciones rápidas de posición y vibraciones mecánicas a bordo. La solución ha sido utilizada en aterrizajes autónomos, inspección de plataformas mar adentro o tracking de aviones en pleno vuelo para su detección y evasión; (II) odometría visual: proporciona una solución eficiente al UAV para estimar la posición con 6 grados de libertad (6D) usando únicamente la entrada de una cámara estéreo a bordo del UAV. Un método Semi-Global Blocking Matching (SGBM) eficiente basado en una estrategia grueso-a-fino ha sido implementada para una rápida y profunda estimación del plano. Además, la solución toma provecho eficazmente de la información 2D y 3D para estimar la posición 6D, resolviendo de esta manera la limitación de un punto de referencia fijo en la cámara estéreo. Una robusta aproximación volumétrica de mapping basada en el framework Octomap ha sido utilizada para reconstruir entornos cerrados y al aire libre bastante abarrotados en 3D con memoria y errores correlacionados espacialmente o temporalmente; (III) visual control, ofrece soluciones de control prácticas para la navegación de un UAV usando Fuzzy Logic Controller (FLC) con la estimación visual. Y el framework de Cross-Entropy Optimization (CEO) ha sido usado para optimizar el factor de escala y la función de pertenencia en FLC. Todas las soluciones basadas en visión en este trabajo han sido probadas en test reales. Y los conjuntos de datos de imágenes reales grabados en estos test o disponibles para la comunidad pública han sido utilizados para evaluar el rendimiento de estas soluciones basadas en visión con ground truth. Además, las soluciones de visión presentadas han sido comparadas con algoritmos de visión del estado del arte. Los test reales y los resultados de evaluación muestran que las soluciones basadas en visión proporcionadas han obtenido rendimientos en tiempo real precisos y robustos, o han alcanzado un mejor rendimiento que aquellos algoritmos del estado del arte. La estimación basada en visión ha ganado un rol muy importante en controlar un UAV típico para alcanzar autonomía en aplicaciones UAV. ABSTRACT The main objective of this dissertation is providing real-time accurate robust monocular or stereo vision-based solution for Unmanned Aerial Vehicle (UAV) to achieve the autonomy in various types of UAV applications, especially in GPS-denied dynamic cluttered environments. This dissertation mainly consists of three UAV research topics based on computer vision technique: (I) visual tracking, it supplys effective solutions to visually locate interesting static or moving object over time during UAV flight with on-line adaptivity approach and multiple-resolution strategy, thereby overcoming the problems generated by the different challenging situations, such as significant appearance change, variant surrounding illumination, cluttered tracking background, partial or full object occlusion, rapid pose variation and onboard mechanical vibration. The solutions have been utilized in autonomous landing, offshore floating platform inspection and midair aircraft tracking for sense-and-avoid; (II) visual odometry: it provides the efficient solution for UAV to estimate the 6 Degree-of-freedom (6D) pose using only the input of stereo camera onboard UAV. An efficient Semi-Global Blocking Matching (SGBM) method based on a coarse-to-fine strategy has been implemented for fast depth map estimation. In addition, the solution effectively takes advantage of both 2D and 3D information to estimate the 6D pose, thereby solving the limitation of a fixed small baseline in the stereo camera. A robust volumetric occupancy mapping approach based on the Octomap framework has been utilized to reconstruct indoor and outdoor large-scale cluttered environments in 3D with less temporally or spatially correlated measurement errors and memory; (III) visual control, it offers practical control solutions to navigate UAV using Fuzzy Logic Controller (FLC) with the visual estimation. And the Cross-Entropy Optimization (CEO) framework has been used to optimize the scaling factor and the membership function in FLC. All the vision-based solutions in this dissertation have been tested in real tests. And the real image datasets recorded from these tests or available from public community have been utilized to evaluate the performance of these vision-based solutions with ground truth. Additionally, the presented vision solutions have compared with the state-of-art visual algorithms. Real tests and evaluation results show that the provided vision-based solutions have obtained real-time accurate robust performances, or gained better performance than those state-of-art visual algorithms. The vision-based estimation has played a critically important role for controlling a typical UAV to achieve autonomy in the UAV application.
Resumo:
The emerging use of real-time 3D-based multimedia applications imposes strict quality of service (QoS) requirements on both access and core networks. These requirements and their impact to provide end-to-end 3D videoconferencing services have been studied within the Spanish-funded VISION project, where different scenarios were implemented showing an agile stereoscopic video call that might be offered to the general public in the near future. In view of the requirements, we designed an integrated access and core converged network architecture which provides the requested QoS to end-to-end IP sessions. Novel functional blocks are proposed to control core optical networks, the functionality of the standard ones is redefined, and the signaling improved to better meet the requirements of future multimedia services. An experimental test-bed to assess the feasibility of the solution was also deployed. In such test-bed, set-up and release of end-to-end sessions meeting specific QoS requirements are shown and the impact of QoS degradation in terms of the user perceived quality degradation is quantified. In addition, scalability results show that the proposed signaling architecture is able to cope with large number of requests introducing almost negligible delay.
Resumo:
This paper outlines an automatic computervision system for the identification of avena sterilis which is a special weed seed growing in cereal crops. The final goal is to reduce the quantity of herbicide to be sprayed as an important and necessary step for precision agriculture. So, only areas where the presence of weeds is important should be sprayed. The main problems for the identification of this kind of weed are its similar spectral signature with respect the crops and also its irregular distribution in the field. It has been designed a new strategy involving two processes: image segmentation and decision making. The image segmentation combines basic suitable image processing techniques in order to extract cells from the image as the low level units. Each cell is described by two area-based attributes measuring the relations among the crops and weeds. The decision making is based on the SupportVectorMachines and determines if a cell must be sprayed. The main findings of this paper are reflected in the combination of the segmentation and the SupportVectorMachines decision processes. Another important contribution of this approach is the minimum requirements of the system in terms of memory and computation power if compared with other previous works. The performance of the method is illustrated by comparative analysis against some existing strategies.
Resumo:
In this paper, we propose a system for authenticating local bee pollen against fraudulent samples using image processing and classification techniques. Our system is based on the colour properties of bee pollen loads and the use of one-class classifiers to reject unknown pollen samples. The latter classification techniques allow us to tackle the major difficulty of the problem, the existence of many possible fraudulent pollen types. Also presented is a multi-classifier model with an ambiguity discovery process to fuse the output of the one-class classifiers. The method is validated by authenticating Spanish bee pollen types, the overall accuracy of the final system of being 94%. Therefore, the system is able to rapidly reject the non-local pollen samples with inexpensive hardware and without the need to send the product to the laboratory.
Resumo:
There is clear evidence that investment in intelligent transportation system technologies brings major social and economic benefits. Technological advances in the area of automatic systems in particular are becoming vital for the reduction of road deaths. We here describe our approach to automation of one the riskiest autonomous manœuvres involving vehicles – overtaking. The approach is based on a stereo vision system responsible for detecting any preceding vehicle and triggering the autonomous overtaking manœuvre. To this end, a fuzzy-logic based controller was developed to emulate how humans overtake. Its input is information from the vision system and from a positioning-based system consisting of a differential global positioning system (DGPS) and an inertial measurement unit (IMU). Its output is the generation of action on the vehicle’s actuators, i.e., the steering wheel and throttle and brake pedals. The system has been incorporated into a commercial Citroën car and tested on the private driving circuit at the facilities of our research center, CAR, with different preceding vehicles – a motorbike, car, and truck – with encouraging results.
Resumo:
ntelligent systems designed to reduce highway fatalities have been widely applied in the automotive sector in the last decade. Of all users of transport systems, pedestrians are the most vulnerable in crashes as they are unprotected. This paper deals with an autonomous intelligent emergency system designed to avoid collisions with pedestrians. The system consists of a fuzzy controller based on the time-to-collision estimate – obtained via a vision-based system – and the wheel-locking probability – obtained via the vehicle’s CAN bus – that generates a safe braking action. The system has been tested in a real car – a convertible Citroën C3 Pluriel – equipped with an automated electro-hydraulic braking system capable of working in parallel with the vehicle’s original braking circuit. The system is used as a last resort in the case that an unexpected pedestrian is in the lane and all the warnings have failed to produce a response from the driver.
Resumo:
A model of the mammalian retina and the behavior of the first layers in the visual cortex is reported. The building blocks are optically programmable logic cells. A model of the retina, similar to the one reported by Dowling (1987) is presented. From the model of the visual cortex obtained, some types of symmetries and asymmetries are possible to be detected
Resumo:
In recent years, remote sensing imaging systems for the measurement of oceanic sea states have attracted renovated attention. Imaging technology is economical, non-invasive and enables a better understanding of the space-time dynamics of ocean waves over an area rather than at selected point locations of previous monitoring methods (buoys, wave gauges, etc.). We present recent progress in space-time measurement of ocean waves using stereo vision systems on offshore platforms, which focus on sea states with wavelengths in the range of 0.01 m to 10 m. Classical epipolar techniques and modern variational methods are reviewed to reconstruct the sea surface from the stereo pairs sequentially in time. The statistical and spectral properties of the resulting observed waves are analyzed. Current improvements of the variational methods are discussed as future lines of research.
Resumo:
One of the most challenging problems that must be solved by any theoretical model purporting to explain the competence of the human brain for relational tasks is the one related with the analysis and representation of the internal structure in an extended spatial layout of múltiple objects. In this way, some of the problems are related with specific aims as how can we extract and represent spatial relationships among objects, how can we represent the movement of a selected object and so on. The main objective of this paper is the study of some plausible brain structures that can provide answers in these problems. Moreover, in order to achieve a more concrete knowledge, our study will be focused on the response of the retinal layers for optical information processing and how this information can be processed in the first cortex layers. The model to be reported is just a first trial and some major additions are needed to complete the whole vision process.
Resumo:
In recent years, remote sensing imaging systems for the measurement of oceanic sea states have attracted renovated attention. Imaging technology is economical, non-invasive and enables a better understanding of the space-time dynamics of ocean waves over an area rather than at selected point locations of previous monitoring methods (buoys, wave gauges, etc.). We present recent progress in space-time measurement of ocean waves using stereo vision systems on offshore platforms, which focus on sea states with wavelengths in the range of 0.01 m to 1 m. Both traditional disparity-based systems and modern elevation-based ones are presented in a variational optimization framework: the main idea is to pose the stereoscopic reconstruction problem of the surface of the ocean in a variational setting and design an energy functional whose minimizer is the desired temporal sequence of wave heights. The functional combines photometric observations as well as spatial and temporal smoothness priors. Disparity methods estimate the disparity between images as an intermediate step toward retrieving the depth of the waves with respect to the cameras, whereas elevation methods estimate the ocean surface displacements directly in 3-D space. Both techniques are used to measure ocean waves from real data collected at offshore platforms in the Black Sea (Crimean Peninsula, Ukraine) and the Northern Adriatic Sea (Venice coast, Italy). Then, the statistical and spectral properties of the resulting observed waves are analyzed. We show the advantages and disadvantages of the presented stereo vision systems and discuss future lines of research to improve their performance in critical issues such as the robustness of the camera calibration in spite of undesired variations of the camera parameters or the processing time that it takes to retrieve ocean wave measurements from the stereo videos, which are very large datasets that need to be processed efficiently to be of practical usage. Multiresolution and short-time approaches would improve efficiency and scalability of the techniques so that wave displacements are obtained in feasible times.