926 resultados para Vision-Based Forced Landing


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a completely autonomous solution to participate in the Indoor Challenge of the 2013 International Micro Air Vehicle Competition (IMAV 2013). Our proposal is a multi-robot system with no centralized coordination whose robotic agents share their position estimates. The capability of each agent to navigate avoiding collisions is a consequence of the resulting emergent behavior. Each agent consists of a ground station running an instance of the proposed architecture that communicates over WiFi with an AR Drone 2.0 quadrotor. Visual markers are employed to sense and map obstacles and to improve the pose estimation based on Inertial Measurement Unit (IMU) and ground optical flow data. Based on our architecture, each robotic agent can navigate avoiding obstacles and other members of the multi-robot system. The solution is demonstrated and the achieved navigation performance is evaluated by means of experimental flights. This work also analyzes the capabilities of the presented solution in simulated flights of the IMAV 2013 Indoor Challenge. The performance of the CVG UPM team was awarded with the First Prize in the Indoor Autonomy Challenge of the IMAV 2013 competition.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Atualmente os sistemas de pilotagem autónoma de quadricópteros estão a ser desenvolvidos de forma a efetuarem navegação em espaços exteriores, onde o sinal de GPS pode ser utilizado para definir waypoints de navegação, modos de position e altitude hold, returning home, entre outros. Contudo, o problema de navegação autónoma em espaços fechados sem que se utilize um sistema de posicionamento global dentro de uma sala, subsiste como um problema desafiante e sem solução fechada. Grande parte das soluções são baseadas em sensores dispendiosos, como o LIDAR ou como sistemas de posicionamento externos (p.ex. Vicon, Optitrack). Algumas destas soluções reservam a capacidade de processamento de dados dos sensores e dos algoritmos mais exigentes para sistemas de computação exteriores ao veículo, o que também retira a componente de autonomia total que se pretende num veículo com estas características. O objetivo desta tese pretende, assim, a preparação de um sistema aéreo não-tripulado de pequeno porte, nomeadamente um quadricóptero, que integre diferentes módulos que lhe permitam simultânea localização e mapeamento em espaços interiores onde o sinal GPS ´e negado, utilizando, para tal, uma câmara RGB-D, em conjunto com outros sensores internos e externos do quadricóptero, integrados num sistema que processa o posicionamento baseado em visão e com o qual se pretende que efectue, num futuro próximo, planeamento de movimento para navegação. O resultado deste trabalho foi uma arquitetura integrada para análise de módulos de localização, mapeamento e navegação, baseada em hardware aberto e barato e frameworks state-of-the-art disponíveis em código aberto. Foi também possível testar parcialmente alguns módulos de localização, sob certas condições de ensaio e certos parâmetros dos algoritmos. A capacidade de mapeamento da framework também foi testada e aprovada. A framework obtida encontra-se pronta para navegação, necessitando apenas de alguns ajustes e testes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[EN]Vision-based applications designed for humanmachine interaction require fast and accurate hand detection. However, previous works on this field assume different constraints, like a limitation in the number of detected gestures, because hands are highly complex objects to locate. This paper presents an approach which changes the detection target without limiting the number of detected gestures. Using a cascade classifier we detect hands based on their wrists. With this approach, we introduce two main contributions: (1) a reliable segmentation, independently of the gesture being made and (2) a training phase faster than previous cascade classifier based methods. The paper includes experimental evaluations with different video streams that illustrate the efficiency and suitability for perceptual interfaces.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the world of professional sports shifting towards employing better sport analytics, the demand for vision-based performance analysis is growing increasingly in recent years. In addition, the nature of many sports does not allow the use of any kind of sensors or other wearable markers attached to players for monitoring their performances during competitions. This provides a potential application of systematic observations such as tracking information of the players to help coaches to develop their visual skills and perceptual awareness needed to make decisions about team strategy or training plans. My PhD project is part of a bigger ongoing project between sport scientists and computer scientists involving also industry partners and sports organisations. The overall idea is to investigate the contribution technology can make to the analysis of sports performance on the example of team sports such as rugby, football or hockey. A particular focus is on vision-based tracking, so that information about the location and dynamics of the players can be gained without any additional sensors on the players. To start with, prior approaches on visual tracking are extensively reviewed and analysed. In this thesis, methods to deal with the difficulties in visual tracking to handle the target appearance changes caused by intrinsic (e.g. pose variation) and extrinsic factors, such as occlusion, are proposed. This analysis highlights the importance of the proposed visual tracking algorithms, which reflect these challenges and suggest robust and accurate frameworks to estimate the target state in a complex tracking scenario such as a sports scene, thereby facilitating the tracking process. Next, a framework for continuously tracking multiple targets is proposed. Compared to single target tracking, multi-target tracking such as tracking the players on a sports field, poses additional difficulties, namely data association, which needs to be addressed. Here, the aim is to locate all targets of interest, inferring their trajectories and deciding which observation corresponds to which target trajectory is. In this thesis, an efficient framework is proposed to handle this particular problem, especially in sport scenes, where the players of the same team tend to look similar and exhibit complex interactions and unpredictable movements resulting in matching ambiguity between the players. The presented approach is also evaluated on different sports datasets and shows promising results. Finally, information from the proposed tracking system is utilised as the basic input for further higher level performance analysis such as tactics and team formations, which can help coaches to design a better training plan. Due to the continuous nature of many team sports (e.g. soccer, hockey), it is not straightforward to infer the high-level team behaviours, such as players’ interaction. The proposed framework relies on two distinct levels of performance analysis: low-level performance analysis, such as identifying players positions on the play field, as well as a high-level analysis, where the aim is to estimate the density of player locations or detecting their possible interaction group. The related experiments show the proposed approach can effectively explore this high-level information, which has many potential applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The structure of an animal’s eye is determined by the tasks it must perform. While vertebrates rely on their two eyes for all visual functions, insects have evolved a wide range of specialized visual organs to support behaviors such as prey capture, predator evasion, mate pursuit, flight stabilization, and navigation. Compound eyes and ocelli constitute the vision forming and sensing mechanisms of some flying insects. They provide signals useful for flight stabilization and navigation. In contrast to the well-studied compound eye, the ocelli, seen as the second visual system, sense fast luminance changes and allows for fast visual processing. Using a luminance-based sensor that mimics the insect ocelli and a camera-based motion detection system, a frequency-domain characterization of an ocellar sensor and optic flow (due to rotational motion) are analyzed. Inspired by the insect neurons that make use of signals from both vision sensing mechanisms, advantages, disadvantages and complementary properties of ocellar and optic flow estimates are discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Industrial robots are both versatile and high performant, enabling the flexible automation typical of the modern Smart Factories. For safety reasons, however, they must be relegated inside closed fences and/or virtual safety barriers, to keep them strictly separated from human operators. This can be a limitation in some scenarios in which it is useful to combine the human cognitive skill with the accuracy and repeatability of a robot, or simply to allow a safe coexistence in a shared workspace. Collaborative robots (cobots), on the other hand, are intrinsically limited in speed and power in order to share workspace and tasks with human operators, and feature the very intuitive hand guiding programming method. Cobots, however, cannot compete with industrial robots in terms of performance, and are thus useful only in a limited niche, where they can actually bring an improvement in productivity and/or in the quality of the work thanks to their synergy with human operators. The limitations of both the pure industrial and the collaborative paradigms can be overcome by combining industrial robots with artificial vision. In particular, vision can be exploited for a real-time adjustment of the pre-programmed task-based robot trajectory, by means of the visual tracking of dynamic obstacles (e.g. human operators). This strategy allows the robot to modify its motion only when necessary, thus maintain a high level of productivity but at the same time increasing its versatility. Other than that, vision offers the possibility of more intuitive programming paradigms for the industrial robots as well, such as the programming by demonstration paradigm. These possibilities offered by artificial vision enable, as a matter of fact, an efficacious and promising way of achieving human-robot collaboration, which has the advantage of overcoming the limitations of both the previous paradigms yet keeping their strengths.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Autonomous landing is a challenging and important technology for both military and civilian applications of Unmanned Aerial Vehicles (UAVs). In this paper, we present a novel online adaptive visual tracking algorithm for UAVs to land on an arbitrary field (that can be used as the helipad) autonomously at real-time frame rates of more than twenty frames per second. The integration of low-dimensional subspace representation method, online incremental learning approach and hierarchical tracking strategy allows the autolanding task to overcome the problems generated by the challenging situations such as significant appearance change, variant surrounding illumination, partial helipad occlusion, rapid pose variation, onboard mechanical vibration (no video stabilization), low computational capacity and delayed information communication between UAV and Ground Control Station (GCS). The tracking performance of this presented algorithm is evaluated with aerial images from real autolanding flights using manually- labelled ground truth database. The evaluation results show that this new algorithm is highly robust to track the helipad and accurate enough for closing the vision-based control loop.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Autonomous landing is a challenging and important technology for both military and civilian applications of Unmanned Aerial Vehicles (UAVs). In this paper, we present a novel online adaptive visual tracking algorithm for UAVs to land on an arbitrary field (that can be used as the helipad) autonomously at real-time frame rates of more than twenty frames per second. The integration of low-dimensional subspace representation method, online incremental learning approach and hierarchical tracking strategy allows the autolanding task to overcome the problems generated by the challenging situations such as significant appearance change, variant surrounding illumination, partial helipad occlusion, rapid pose variation, onboard mechanical vibration (no video stabilization), low computational capacity and delayed information communication between UAV and Ground Control Station (GCS). The tracking performance of this presented algorithm is evaluated with aerial images from real autolanding flights using manually- labelled ground truth database. The evaluation results show that this new algorithm is highly robust to track the helipad and accurate enough for closing the vision-based control loop.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sendo uma forma natural de interação homem-máquina, o reconhecimento de gestos implica uma forte componente de investigação em áreas como a visão por computador e a aprendizagem computacional. O reconhecimento gestual é uma área com aplicações muito diversas, fornecendo aos utilizadores uma forma mais natural e mais simples de comunicar com sistemas baseados em computador, sem a necessidade de utilização de dispositivos extras. Assim, o objectivo principal da investigação na área de reconhecimento de gestos aplicada à interacção homemmáquina é o da criação de sistemas, que possam identificar gestos específicos e usálos para transmitir informações ou para controlar dispositivos. Para isso as interfaces baseados em visão para o reconhecimento de gestos, necessitam de detectar a mão de forma rápida e robusta e de serem capazes de efetuar o reconhecimento de gestos em tempo real. Hoje em dia, os sistemas de reconhecimento de gestos baseados em visão são capazes de trabalhar com soluções específicas, construídos para resolver um determinado problema e configurados para trabalhar de uma forma particular. Este projeto de investigação estudou e implementou soluções, suficientemente genéricas, com o recurso a algoritmos de aprendizagem computacional, permitindo a sua aplicação num conjunto alargado de sistemas de interface homem-máquina, para reconhecimento de gestos em tempo real. A solução proposta, Gesture Learning Module Architecture (GeLMA), permite de forma simples definir um conjunto de comandos que pode ser baseado em gestos estáticos e dinâmicos e que pode ser facilmente integrado e configurado para ser utilizado numa série de aplicações. É um sistema de baixo custo e fácil de treinar e usar, e uma vez que é construído unicamente com bibliotecas de código. As experiências realizadas permitiram mostrar que o sistema atingiu uma precisão de 99,2% em termos de reconhecimento de gestos estáticos e uma precisão média de 93,7% em termos de reconhecimento de gestos dinâmicos. Para validar a solução proposta, foram implementados dois sistemas completos. O primeiro é um sistema em tempo real capaz de ajudar um árbitro a arbitrar um jogo de futebol robótico. A solução proposta combina um sistema de reconhecimento de gestos baseada em visão com a definição de uma linguagem formal, o CommLang Referee, à qual demos a designação de Referee Command Language Interface System (ReCLIS). O sistema identifica os comandos baseados num conjunto de gestos estáticos e dinâmicos executados pelo árbitro, sendo este posteriormente enviado para um interface de computador que transmite a respectiva informação para os robôs. O segundo é um sistema em tempo real capaz de interpretar um subconjunto da Linguagem Gestual Portuguesa. As experiências demonstraram que o sistema foi capaz de reconhecer as vogais em tempo real de forma fiável. Embora a solução implementada apenas tenha sido treinada para reconhecer as cinco vogais, o sistema é facilmente extensível para reconhecer o resto do alfabeto. As experiências também permitiram mostrar que a base dos sistemas de interação baseados em visão pode ser a mesma para todas as aplicações e, deste modo facilitar a sua implementação. A solução proposta tem ainda a vantagem de ser suficientemente genérica e uma base sólida para o desenvolvimento de sistemas baseados em reconhecimento gestual que podem ser facilmente integrados com qualquer aplicação de interface homem-máquina. A linguagem formal de definição da interface pode ser redefinida e o sistema pode ser facilmente configurado e treinado com um conjunto de gestos diferentes de forma a serem integrados na solução final.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

"Lecture notes in computational vision and biomechanics series, ISSN 2212-9391, vol. 19"

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Tese de Doutoramento em Engenharia de Eletrónica e de Computadores

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we seek to expand the use of direct methods in real-time applications by proposing a vision-based strategy for pose estimation of aerial vehicles. The vast majority of approaches make use of features to estimate motion. Conversely, the strategy we propose is based on a MR (Multi-Resolution) implementation of an image registration technique (Inverse Compositional Image Alignment ICIA) using direct methods. An on-board camera in a downwards-looking configuration, and the assumption of planar scenes, are the bases of the algorithm. The motion between frames (rotation and translation) is recovered by decomposing the frame-to-frame homography obtained by the ICIA algorithm applied to a patch that covers around the 80% of the image. When the visual estimation is required (e.g. GPS drop-out), this motion is integrated with the previous known estimation of the vehicles' state, obtained from the on-board sensors (GPS/IMU), and the subsequent estimations are based only on the vision-based motion estimations. The proposed strategy is tested with real flight data in representative stages of a flight: cruise, landing, and take-off, being two of those stages considered critical: take-off and landing. The performance of the pose estimation strategy is analyzed by comparing it with the GPS/IMU estimations. Results show correlation between the visual estimation obtained with the MR-ICIA and the GPS/IMU data, that demonstrate that the visual estimation can be used to provide a good approximation of the vehicle's state when it is required (e.g. GPS drop-outs). In terms of performance, the proposed strategy is able to maintain an estimation of the vehicle's state for more than one minute, at real-time frame rates based, only on visual information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dissertação para obtenção do grau de Mestre em Engenharia Electrotécnica Ramo de Automação e Electrónica Industrial

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sensor-based robot control allows manipulation in dynamic environments with uncertainties. Vision is a versatile low-cost sensory modality, but low sample rate, high sensor delay and uncertain measurements limit its usability, especially in strongly dynamic environments. Force is a complementary sensory modality allowing accurate measurements of local object shape when a tooltip is in contact with the object. In multimodal sensor fusion, several sensors measuring different modalities are combined to give a more accurate estimate of the environment. As force and vision are fundamentally different sensory modalities not sharing a common representation, combining the information from these sensors is not straightforward. In this thesis, methods for fusing proprioception, force and vision together are proposed. Making assumptions of object shape and modeling the uncertainties of the sensors, the measurements can be fused together in an extended Kalman filter. The fusion of force and visual measurements makes it possible to estimate the pose of a moving target with an end-effector mounted moving camera at high rate and accuracy. The proposed approach takes the latency of the vision system into account explicitly, to provide high sample rate estimates. The estimates also allow a smooth transition from vision-based motion control to force control. The velocity of the end-effector can be controlled by estimating the distance to the target by vision and determining the velocity profile giving rapid approach and minimal force overshoot. Experiments with a 5-degree-of-freedom parallel hydraulic manipulator and a 6-degree-of-freedom serial manipulator show that integration of several sensor modalities can increase the accuracy of the measurements significantly.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The papermaking industry has been continuously developing intelligent solutions to characterize the raw materials it uses, to control the manufacturing process in a robust way, and to guarantee the desired quality of the end product. Based on the much improved imaging techniques and image-based analysis methods, it has become possible to look inside the manufacturing pipeline and propose more effective alternatives to human expertise. This study is focused on the development of image analyses methods for the pulping process of papermaking. Pulping starts with wood disintegration and forming the fiber suspension that is subsequently bleached, mixed with additives and chemicals, and finally dried and shipped to the papermaking mills. At each stage of the process it is important to analyze the properties of the raw material to guarantee the product quality. In order to evaluate properties of fibers, the main component of the pulp suspension, a framework for fiber characterization based on microscopic images is proposed in this thesis as the first contribution. The framework allows computation of fiber length and curl index correlating well with the ground truth values. The bubble detection method, the second contribution, was developed in order to estimate the gas volume at the delignification stage of the pulping process based on high-resolution in-line imaging. The gas volume was estimated accurately and the solution enabled just-in-time process termination whereas the accurate estimation of bubble size categories still remained challenging. As the third contribution of the study, optical flow computation was studied and the methods were successfully applied to pulp flow velocity estimation based on double-exposed images. Finally, a framework for classifying dirt particles in dried pulp sheets, including the semisynthetic ground truth generation, feature selection, and performance comparison of the state-of-the-art classification techniques, was proposed as the fourth contribution. The framework was successfully tested on the semisynthetic and real-world pulp sheet images. These four contributions assist in developing an integrated factory-level vision-based process control.