936 results for 100Hz vision-based state estimator
Abstract:
Autonomous aerial refueling is a key enabling technology for both manned and unmanned aircraft where extended flight duration or range is required. The results presented in this paper offer one potential vision-based sensing solution, together with a unique test environment. A hierarchical visual tracking algorithm based on direct methods is proposed and developed for tracking a drogue during the capture stage of autonomous aerial refueling and for estimating its 3D position. Intended to be applied in real time to a video stream from a single monocular camera mounted on the receiver aircraft, the algorithm is shown to be highly robust and capable of tracking large, rapid drogue motions within the image frame. The proposed strategy has been tested using a complex robotic testbed and actual flight hardware consisting of a full-size probe and drogue. Results show that the vision tracking algorithm can detect and track the drogue at real-time frame rates of more than thirty frames per second, obtaining a robust position estimate even under strong motions and multiple occlusions of the drogue.
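As a rough illustration of the direct-method idea behind such a tracker, the following sketch aligns a drogue template to a camera frame by maximizing a photometric criterion with OpenCV's ECC algorithm; the template, frame, and motion model are assumptions for illustration, not the paper's implementation.

```python
import cv2
import numpy as np

def track_template_direct(template_gray, frame_gray, warp_init=None):
    """Align a grayscale drogue template to the current frame photometrically."""
    warp = warp_init if warp_init is not None else np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 1e-6)
    # ECC works directly on pixel intensities -- no feature extraction step.
    # The trailing (None, 5) are the optional input mask and Gaussian filter size.
    _, warp = cv2.findTransformECC(template_gray, frame_gray, warp,
                                   cv2.MOTION_AFFINE, criteria, None, 5)
    return warp  # 2x3 affine warp taking template coordinates to frame coordinates
```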
Abstract:
In this paper, we apply a hierarchical strategy for tracking planar objects (or objects that can be assumed planar), based on direct methods, to vision-based applications on board UAVs. This tracking strategy allows the tasks to be achieved at real-time frame rates and overcomes problems posed by their challenging conditions: e.g. constant vibrations, fast 3D changes, or limited on-board computational capacity. The vast majority of approaches use feature-based methods to track objects. Nonetheless, in this paper we show that although some of these feature-based solutions are faster, direct methods can be more robust under fast 3D motions (fast changes in position), some changes in appearance, constant vibrations (without requiring any specific hardware or software for video stabilization), and situations in which part of the tracked object is outside the camera's field of view. The performance of the proposed tracking strategy on board UAVs is evaluated with images from real flight tests using manually generated ground truth information and accurate position estimation from a Vicon system, as well as with simulated data from a simulation environment. Results show that the hierarchical tracking strategy performs better than well-known feature-based algorithms and well-known configurations of direct methods, and that its performance is robust enough for vision-in-the-loop tasks, e.g. vision-based landing.
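A minimal coarse-to-fine sketch of what "hierarchical" can mean here: run a direct (intensity-based) alignment at the top of an image pyramid and refine downwards, which is what gives direct trackers tolerance to large inter-frame motion. The pyramid depth and motion model below are illustrative assumptions.

```python
import cv2
import numpy as np

def hierarchical_align(template_gray, frame_gray, levels=3):
    """Coarse-to-fine direct alignment over a Gaussian image pyramid."""
    pyr_t, pyr_f = [template_gray], [frame_gray]
    for _ in range(levels - 1):
        pyr_t.append(cv2.pyrDown(pyr_t[-1]))
        pyr_f.append(cv2.pyrDown(pyr_f[-1]))
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 1e-5)
    for t, f in zip(reversed(pyr_t), reversed(pyr_f)):  # coarsest -> finest
        _, warp = cv2.findTransformECC(t, f, warp, cv2.MOTION_AFFINE,
                                       criteria, None, 5)
        warp[:, 2] *= 2.0  # rescale translation for the next, finer level
    warp[:, 2] /= 2.0      # undo the extra rescaling after the last level
    return warp
```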
Abstract:
The importance of vision-based systems for Sense-and-Avoid is increasing as remotely piloted and autonomous UAVs become part of the non-segregated airspace. The development and evaluation of these systems demand flight scenario images, which are expensive and risky to obtain. Augmented Reality techniques now allow real flight scenario images to be composited with 3D aircraft models to produce realistic images useful for system development and benchmarking at much lower cost and risk. With the techniques presented in this paper, 3D aircraft models are first positioned in a simulated 3D scene with controlled illumination and rendering parameters. Realistic simulated images are then obtained using an image processing algorithm that fuses the images rendered from the 3D scene with images from real UAV flights, taking on-board camera vibrations into account. Since the intruder and camera poses are user-defined, ground truth data is available. These ground truth annotations make it possible to develop and quantitatively evaluate aircraft detection and tracking algorithms. This paper presents the software developed to create a public dataset of 24 videos together with their annotations, as well as some tracking application results.
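The fusion step can be pictured as alpha compositing with a per-frame jitter. The sketch below is a simplification, not the paper's algorithm; it assumes the rendered aircraft layer is in BGR+alpha order and the same size as the real flight frame.

```python
import numpy as np

def composite(frame_bgr, render_bgra, vib_dx=0, vib_dy=0):
    """Blend a rendered BGRA layer onto a real frame with a vibration shift."""
    # Shift the rendered layer to emulate per-frame on-board camera vibration.
    layer = np.roll(render_bgra, (vib_dy, vib_dx), axis=(0, 1))
    alpha = layer[..., 3:4].astype(np.float32) / 255.0
    out = (1.0 - alpha) * frame_bgr.astype(np.float32) + alpha * layer[..., :3]
    return out.astype(np.uint8)
```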
Abstract:
This paper presents a completely autonomous solution for participating in the Indoor Challenge of the 2013 International Micro Air Vehicle Competition (IMAV 2013). Our proposal is a multi-robot system with no centralized coordination, whose robotic agents share their position estimates. The capability of each agent to navigate while avoiding collisions is a consequence of the resulting emergent behavior. Each agent consists of a ground station running an instance of the proposed architecture that communicates over WiFi with an AR Drone 2.0 quadrotor. Visual markers are employed to sense and map obstacles and to improve the pose estimation based on Inertial Measurement Unit (IMU) and ground optical flow data. Based on our architecture, each robotic agent can navigate while avoiding obstacles and other members of the multi-robot system. The solution is demonstrated and the achieved navigation performance is evaluated by means of experimental flights. This work also analyzes the capabilities of the presented solution in simulated flights of the IMAV 2013 Indoor Challenge. The CVG UPM team was awarded First Prize in the Indoor Autonomy Challenge of the IMAV 2013 competition.
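One simple way to picture the marker-aided pose improvement (an assumption for illustration, not the paper's exact filter) is a complementary update: integrate IMU and ground optical flow into a position prediction, then pull it toward the absolute fix whenever a visual marker is detected.

```python
import numpy as np

def fuse_position(pred_xy, marker_xy=None, gain=0.2):
    """pred_xy: position integrated from IMU + ground optical flow.
    marker_xy: absolute fix from a detected visual marker, if any."""
    if marker_xy is None:
        return np.asarray(pred_xy, dtype=float)  # dead reckoning only
    # Blend toward the absolute fix; the gain trades smoothness for correction speed.
    return ((1 - gain) * np.asarray(pred_xy, dtype=float)
            + gain * np.asarray(marker_xy, dtype=float))
```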
Abstract:
The Environmental Sciences Division within the Queensland Environmental Protection Agency works to monitor, assess, and model the condition of the environment. The Division has a legislative responsibility to produce, every four years, a whole-of-government report dealing with environmental conditions and trends: the "State of the Environment" (SoE) report [1][2][3]. The State of Environment Web Service Reporting System is a supplementary, web-service-based SoE reporting tool that aims to deliver accurate, timely, and accessible information on the condition of the environment through web services over the Internet [4][5]. This prototype provides a scientific assessment of environmental conditions for a set of environmental indicators. It contains text descriptions as well as tables, charts, and maps with spatiotemporal dimensions to show the impact of certain environmental indicators on our environment. The prototype is a template-based indicator system, to which an administrator may add new SQL queries for new indicator services without changing the template's architecture or code. The benefits come from a service-oriented architecture that provides an online query service with seamless integration. In addition, since it uses a web service architecture, each individual component within the application can be implemented in a different programming language and on a different operating system. Although the services shown in this demo are built upon two datasets, the regional ecosystems and protected areas of Queensland, it will also be possible to report on the condition of water, air, land, coastal zones, energy resources, biodiversity, human settlements, and natural and cultural heritage on the fly. Figure 1 shows the architecture of the prototype. The next section discusses the research tasks in the prototype.
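A minimal sketch of the template idea, with invented table and indicator names: each indicator service is just an SQL entry in a registry, so an administrator can add a new one without touching the architecture or code.

```python
import sqlite3

# Registry of indicator services: adding an entry adds a service.
INDICATOR_QUERIES = {
    "protected_area_km2": "SELECT region, SUM(area_km2) FROM protected_areas GROUP BY region",
    "ecosystem_count": "SELECT region, COUNT(*) FROM regional_ecosystems GROUP BY region",
}

def run_indicator(db_path, name):
    """Execute the registered query for one indicator and return its rows."""
    with sqlite3.connect(db_path) as conn:
        return conn.execute(INDICATOR_QUERIES[name]).fetchall()
```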
Abstract:
Vision-based applications designed for human-machine interaction require fast and accurate hand detection. However, previous work in this field assumes various constraints, such as limiting the number of detected gestures, because hands are highly complex objects to locate. This paper presents an approach that changes the detection target without limiting the number of detected gestures. Using a cascade classifier, we detect hands based on their wrists. With this approach, we introduce two main contributions: (1) reliable segmentation, independent of the gesture being made, and (2) a training phase faster than that of previous cascade-classifier-based methods. The paper includes experimental evaluations with different video streams that illustrate the approach's efficiency and suitability for perceptual interfaces.
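A hedged usage sketch of the detection stage: a standard OpenCV cascade classifier applied per frame, where 'wrist_cascade.xml' stands in for a model trained as described in the paper, and the detection parameters are typical defaults rather than the authors' settings.

```python
import cv2

cascade = cv2.CascadeClassifier("wrist_cascade.xml")  # hypothetical trained model

def detect_wrists(frame_bgr):
    """Return (x, y, w, h) boxes for wrist candidates in one frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)  # normalize contrast before detection
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                    minSize=(24, 24))
```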
Abstract:
The structure of an animal's eye is determined by the tasks it must perform. While vertebrates rely on their two eyes for all visual functions, insects have evolved a wide range of specialized visual organs to support behaviors such as prey capture, predator evasion, mate pursuit, flight stabilization, and navigation. Compound eyes and ocelli constitute the vision-forming and sensing mechanisms of some flying insects, providing signals useful for flight stabilization and navigation. In contrast to the well-studied compound eye, the ocelli, regarded as a second visual system, sense fast luminance changes and allow fast visual processing. Using a luminance-based sensor that mimics the insect ocelli and a camera-based motion detection system, a frequency-domain characterization of the ocellar sensor and of optic flow (due to rotational motion) is presented. Inspired by the insect neurons that make use of signals from both vision sensing mechanisms, the advantages, disadvantages, and complementary properties of ocellar and optic-flow estimates are discussed.
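The frequency-domain characterization can be illustrated with a standard empirical transfer-function estimate: given a recorded stimulus (e.g. rotation rate) and a sensor response, the magnitude response follows from the cross- and auto-spectra. This is a generic sketch, not the paper's exact procedure.

```python
import numpy as np

def empirical_response(stimulus, response, fs):
    """Estimate |H(f)| as the ratio of cross-spectrum to input auto-spectrum.

    stimulus, response: equal-length 1D arrays sampled at fs (Hz)."""
    S = np.fft.rfft(stimulus)
    R = np.fft.rfft(response)
    H = (np.conj(S) * R) / (np.conj(S) * S + 1e-12)  # avoid divide-by-zero
    f = np.fft.rfftfreq(len(stimulus), d=1.0 / fs)
    return f, np.abs(H)
```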
Abstract:
Industrial robots are both versatile and high-performing, enabling the flexible automation typical of modern Smart Factories. For safety reasons, however, they must be confined inside closed fences and/or virtual safety barriers, keeping them strictly separated from human operators. This can be a limitation in scenarios where it is useful to combine human cognitive skills with the accuracy and repeatability of a robot, or simply to allow safe coexistence in a shared workspace. Collaborative robots (cobots), on the other hand, are intrinsically limited in speed and power in order to share workspace and tasks with human operators, and feature the very intuitive hand-guiding programming method. Cobots, however, cannot compete with industrial robots in terms of performance, and are thus useful only in a limited niche, where they can actually improve productivity and/or the quality of the work thanks to their synergy with human operators. The limitations of both the purely industrial and the collaborative paradigms can be overcome by combining industrial robots with artificial vision. In particular, vision can be exploited for real-time adjustment of the pre-programmed task-based robot trajectory, by means of visual tracking of dynamic obstacles (e.g. human operators). This strategy allows the robot to modify its motion only when necessary, thus maintaining a high level of productivity while increasing its versatility. Beyond that, vision also offers the possibility of more intuitive programming paradigms for industrial robots, such as programming by demonstration. These possibilities offered by artificial vision enable an effective and promising way of achieving human-robot collaboration, one that overcomes the limitations of both previous paradigms while keeping their strengths.
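The real-time trajectory adjustment can be reduced, in its simplest form, to scaling the commanded speed by the distance to the nearest tracked obstacle. The thresholds below are invented for illustration.

```python
def speed_scale(obstacle_dist_m, stop_dist=0.5, full_speed_dist=2.0):
    """Fraction of nominal speed given the nearest tracked obstacle distance."""
    if obstacle_dist_m <= stop_dist:
        return 0.0                      # protective stop
    if obstacle_dist_m >= full_speed_dist:
        return 1.0                      # nominal pre-programmed speed
    # Linear ramp between the protective-stop and full-speed thresholds.
    return (obstacle_dist_m - stop_dist) / (full_speed_dist - stop_dist)
```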
Abstract:
In this paper we tackle the problem of landing a helicopter autonomously on a ship deck, using an on-board colour camera as the main sensor. To create a test-bed, we first simulate the movement of a ship landing platform at sea, for different sea states and different ships, randomly and realistically. We use a commercial parallel robot to reproduce this movement. We then developed an accurate and robust computer vision system to measure the pose of the helipad with respect to the on-board camera. To deal with noise and possible failures of the computer vision system, a state estimator was created. With all of this in place, we are able to develop and test a controller that closes the loop and completes the autonomous landing task.
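A hedged sketch of the two vision stages described above, using standard OpenCV building blocks rather than the authors' code: recover the helipad pose from known marker geometry with solvePnP, then smooth the position with a constant-velocity Kalman filter to ride out detection failures.

```python
import cv2
import numpy as np

def helipad_pose(object_pts, image_pts, K, dist_coeffs):
    """object_pts: 3D helipad corner coordinates (Nx3); image_pts: their pixels (Nx2)."""
    ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist_coeffs)
    return (rvec, tvec) if ok else None

# Constant-velocity Kalman filter on position: state = [p, v], measurement = p.
dt = 1.0 / 30.0  # assumed camera frame rate
kf = cv2.KalmanFilter(6, 3)
kf.transitionMatrix = np.block([[np.eye(3), dt * np.eye(3)],
                                [np.zeros((3, 3)), np.eye(3)]]).astype(np.float32)
kf.measurementMatrix = np.hstack([np.eye(3), np.zeros((3, 3))]).astype(np.float32)
kf.processNoiseCov = np.eye(6, dtype=np.float32) * 1e-3
kf.measurementNoiseCov = np.eye(3, dtype=np.float32) * 1e-2
# Per frame: kf.predict(); if a pose was found, kf.correct(np.float32(tvec))
```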
Abstract:
As a natural form of human-machine interaction, gesture recognition entails a strong research component in areas such as computer vision and machine learning. Gesture recognition is an area with very diverse applications, offering users a more natural and simpler way of communicating with computer-based systems without the need for extra devices. The main goal of research on gesture recognition applied to human-machine interaction is therefore the creation of systems that can identify specific gestures and use them to convey information or to control devices. To this end, vision-based gesture recognition interfaces need to detect the hand quickly and robustly and must be able to recognize gestures in real time. Today, vision-based gesture recognition systems work as specific solutions, built to solve a particular problem and configured to work in a particular way. This research project studied and implemented sufficiently generic solutions, using machine learning algorithms, that can be applied to a wide range of human-machine interface systems for real-time gesture recognition. The proposed solution, the Gesture Learning Module Architecture (GeLMA), makes it simple to define a set of commands that can be based on static and dynamic gestures and that can be easily integrated and configured for use in a range of applications. It is a low-cost system that is easy to train and use, since it is built entirely from code libraries. The experiments carried out showed that the system achieved an accuracy of 99.2% for static gesture recognition and an average accuracy of 93.7% for dynamic gesture recognition. To validate the proposed solution, two complete systems were implemented. The first is a real-time system capable of helping a referee officiate a robotic soccer game. The proposed solution combines a vision-based gesture recognition system with the definition of a formal language, CommLang Referee, in what we named the Referee Command Language Interface System (ReCLIS). The system identifies commands based on a set of static and dynamic gestures performed by the referee, which are then sent to a computer interface that transmits the corresponding information to the robots. The second is a real-time system capable of interpreting a subset of Portuguese Sign Language (Linguagem Gestual Portuguesa). The experiments showed that the system was able to recognize the vowels reliably and in real time. Although the implemented solution was only trained to recognize the five vowels, the system is easily extensible to the rest of the alphabet. The experiments also showed that the core of vision-based interaction systems can be the same across applications, thereby simplifying their implementation. The proposed solution has the further advantage of being sufficiently generic to serve as a solid basis for the development of gesture-recognition-based systems that can be easily integrated with any human-machine interface application.
The formal interface definition language can be redefined, and the system can easily be configured and trained with a different set of gestures for integration into the final solution.
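As a generic illustration of the static-gesture stage (the feature extraction, labels, and confidence threshold are placeholders, not GeLMA internals), a hand feature vector can be classified with an off-the-shelf learner and rejected when the confidence is low:

```python
from sklearn.svm import SVC

def train_static_gestures(X_train, y_train):
    """X_train: hand feature vectors; y_train: gesture command labels."""
    clf = SVC(kernel="rbf", probability=True)
    clf.fit(X_train, y_train)
    return clf

def recognize(clf, features, min_conf=0.8):
    """Return the predicted command, or None if the classifier is unsure."""
    proba = clf.predict_proba([features])[0]
    best = proba.argmax()
    return clf.classes_[best] if proba[best] >= min_conf else None
```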
Abstract:
"Lecture notes in computational vision and biomechanics series, ISSN 2212-9391, vol. 19"
Abstract:
Doctoral Thesis in Electronics and Computer Engineering
Abstract:
The papermaking industry has been continuously developing intelligent solutions to characterize the raw materials it uses, to control the manufacturing process in a robust way, and to guarantee the desired quality of the end product. With much-improved imaging techniques and image-based analysis methods, it has become possible to look inside the manufacturing pipeline and propose more effective alternatives to human expertise. This study focuses on the development of image analysis methods for the pulping process of papermaking. Pulping starts with disintegrating the wood and forming the fiber suspension, which is subsequently bleached, mixed with additives and chemicals, and finally dried and shipped to the papermaking mills. At each stage of the process it is important to analyze the properties of the raw material to guarantee product quality. In order to evaluate the properties of fibers, the main component of the pulp suspension, a framework for fiber characterization based on microscopic images is proposed in this thesis as the first contribution. The framework allows the computation of fiber length and curl index, which correlate well with ground truth values. The bubble detection method, the second contribution, was developed to estimate the gas volume at the delignification stage of the pulping process from high-resolution in-line imaging. The gas volume was estimated accurately and the solution enabled just-in-time process termination, although accurate estimation of bubble size categories remained challenging. As the third contribution, optical flow computation was studied and the methods were successfully applied to pulp flow velocity estimation based on double-exposed images. Finally, a framework for classifying dirt particles in dried pulp sheets, including semisynthetic ground truth generation, feature selection, and a performance comparison of state-of-the-art classification techniques, was proposed as the fourth contribution. The framework was successfully tested on semisynthetic and real-world pulp sheet images. Together these four contributions assist in developing integrated, factory-level, vision-based process control.
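The curl index measurement can be sketched as follows, assuming the common definition of contour length over end-to-end distance minus one, computed on an ordered fiber centerline; this is an illustration, not the thesis' implementation.

```python
import numpy as np

def curl_index(skeleton_pts):
    """skeleton_pts: ordered (x, y) points along a skeletonized fiber centerline."""
    pts = np.asarray(skeleton_pts, dtype=float)
    # Contour length: sum of distances between consecutive centerline points.
    contour_len = np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))
    # End-to-end distance between the fiber's two endpoints.
    end_to_end = np.linalg.norm(pts[-1] - pts[0])
    return contour_len / max(end_to_end, 1e-9) - 1.0  # 0 for a straight fiber
```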
Abstract:
Gait analysis has recently emerged as one of the most important medical fields. Marker-based systems are the methods most favored for human movement assessment and gait analysis; however, these systems require specific equipment and expertise, and are cumbersome, costly, and difficult to use. Many recent computer-vision-based approaches have been developed to reduce the cost of motion capture systems while ensuring highly accurate results. In this thesis, we present our new low-cost gait analysis system, composed of two monocular video cameras placed on the left and right sides of a treadmill. A 2D model of each half of the human skeleton is reconstructed from each view based on dynamic color segmentation; gait analysis is then performed on these two models. Validation against a state-of-the-art vision-based motion capture system (using the Microsoft Kinect) and ground truth (with markers) was carried out to demonstrate the robustness and effectiveness of our system. The mean error of the human skeleton model estimation with respect to ground truth for our method vs. the Kinect is very promising: the joint angles of the thighs (6.29° vs. 9.68°), lower legs (7.68° vs. 11.47°), and feet (6.14° vs. 13.63°), and the stride length (6.14 cm vs. 13.63 cm), are better and more stable than those of the Kinect, while the system maintains accuracy close to the Kinect for the arms (7.29° vs. 6.12°), forearms (8.33° vs. 8.04°), and torso (8.69° vs. 6.47°). Based on the skeleton model obtained by each method, we conducted a symmetry study on different joints (elbow, knee, and ankle), using each method on three different subjects, to see which method more effectively distinguishes the symmetry/asymmetry characteristics of gait. In our test, our system measured a maximum knee angle of 8.97° and 13.86° for normal and asymmetric walks respectively, while the Kinect gave 10.58° and 11.94°. Compared with the ground truth, 7.64° and 14.34°, our system showed greater accuracy and discriminative power between the two cases.
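The per-joint measurements compared above reduce, for a 2D skeleton, to the angle at a joint given three points. A minimal sketch:

```python
import numpy as np

def joint_angle_deg(a, b, c):
    """Angle (degrees) at joint b formed by 2D skeleton points a-b-c."""
    v1 = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    v2 = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12)
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

# Example: knee angle from hip, knee, and ankle pixel coordinates.
# joint_angle_deg((320, 180), (330, 260), (325, 340))
```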
Abstract:
A new state estimator algorithm based on a neurofuzzy network and the Kalman filter is presented. The major contribution of the paper is the recognition of a bias problem in the parameter estimation of the state-space model and the introduction of a simple, effective prefiltering method that achieves unbiased parameter estimates; the resulting state-space model is then used for state estimation with the Kalman filtering algorithm. Fundamental to this method is a simple prefiltering procedure using a nonlinear principal component analysis method based on the neurofuzzy basis set. This prefiltering can be performed without prior knowledge of the system structure. Numerical examples demonstrate the effectiveness of the new approach.
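For reference, the downstream state-estimation step is the standard linear Kalman recursion applied to the identified state-space model; in this sketch the matrices A, C, Q, and R stand in for the identified model and noise covariances, not the paper's values.

```python
import numpy as np

def kalman_step(x, P, y, A, C, Q, R):
    """One predict/correct cycle of a linear Kalman filter.

    x, P: prior state estimate and covariance; y: new measurement."""
    # Predict with the identified state-space model.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Correct with the new measurement.
    S = C @ P_pred @ C.T + R                  # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new
```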