916 results for Computer-vision
Abstract:
This work presents a collision avoidance approach based on omnidirectional cameras that does not require the estimation of range between two platforms to resolve a collision encounter. Our method achieves minimum separation between the two vehicles involved by maximising the view-angle given by the omnidirectional sensor. Only visual information is used to achieve avoidance under a bearing-only visual servoing approach. We provide a theoretical problem formulation, as well as results from real flights using small quadrotors.
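The abstract does not give the control law; purely as an illustration of what a bearing-only, view-angle-maximising rule could look like, the sketch below steers so that the measured bearing to the other vehicle is pushed towards ±90°. The gain, set-point and function names are assumptions for the example, not the authors' formulation.

```python
import numpy as np

def bearing_only_avoidance(bearing_rad, k_gain=1.0):
    """Illustrative reactive yaw-rate command from a single bearing measurement.

    Assumed reading of "maximising the view-angle": steer so the bearing to
    the other vehicle moves towards +/-90 degrees, taking the intruder out of
    the direction of travel without ever estimating range.
    """
    target = np.sign(bearing_rad) * (np.pi / 2.0)  # push the bearing to +/-90 deg
    error = target - bearing_rad
    return -k_gain * error                          # yaw-rate command (rad/s)
```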
Abstract:
The problem of estimating pseudobearing rate information of an airborne target based on measurements from a vision sensor is considered. Novel image speed and heading angle estimators are presented that exploit image morphology, hidden Markov model (HMM) filtering, and relative entropy rate (RER) concepts to allow pseudobearing rate information to be determined before (or whilst) the target track is being estimated from vision information.
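The estimator design is in the paper itself; as a generic illustration of the HMM filtering ingredient mentioned above, one recursion of a discrete (grid) forward filter over quantised image-speed states might look as follows. The state grid, transition model and likelihood are toy assumptions.

```python
import numpy as np

def hmm_forward_step(prior, transition, likelihood):
    """One recursion of a discrete HMM (grid) filter.

    prior      : (N,) probability over N quantised image-speed states
    transition : (N, N) state transition matrix, rows sum to 1
    likelihood : (N,) p(measurement | state) for the current frame
    Returns the normalised posterior over the states.
    """
    predicted = transition.T @ prior          # Chapman-Kolmogorov prediction
    posterior = predicted * likelihood        # measurement update
    return posterior / posterior.sum()

# Toy usage: 5 image-speed bins, slowly varying speed, noisy observation.
speeds = np.linspace(0.0, 4.0, 5)
A = 0.8 * np.eye(5) + 0.1 * np.eye(5, k=1) + 0.1 * np.eye(5, k=-1)
A /= A.sum(axis=1, keepdims=True)
belief = np.full(5, 0.2)
measurement = 2.3                                     # pixels/frame
lik = np.exp(-0.5 * ((speeds - measurement) / 0.5) ** 2)
belief = hmm_forward_step(belief, A, lik)
```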
Abstract:
Next-generation autonomous underwater vehicles (AUVs) will be required to robustly identify underwater targets for tasks such as inspection, localization, and docking. Given their often unstructured operating environments, vision offers enormous potential in underwater navigation over more traditional methods; however, reliable target segmentation remains a persistent challenge for these systems. This paper addresses robust vision-based target recognition by presenting a novel scale- and rotation-invariant target design and recognition routine based on self-similar landmarks that enables robust target pose estimation with respect to a single camera. These algorithms are applied to an AUV with controllers developed for vision-based docking with the target. Experimental results show that the system performs exceptionally well on limited processing power and demonstrate how the combined vision and controller system enables robust target identification and docking in a variety of operating conditions.
Abstract:
This paper describes a novel vision-based texture-tracking method to guide autonomous vehicles in agricultural fields where the crop rows are challenging to detect. Existing methods require sufficient visual difference between the crop and soil for segmentation, or explicit knowledge of the structure of the crop rows. This method works by extracting and tracking the direction and lateral offset of the dominant parallel texture in a simulated overhead view of the scene, and hence abstracts away crop-specific details such as colour, spacing and periodicity. The results demonstrate that the method is able to track crop rows across fields with extremely varied appearance during day and night. We also demonstrate that the method can autonomously guide a robot along the crop rows.
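The paper's full pipeline (overhead-view simulation, tracking, lateral-offset estimation) is not reproduced here, but as a rough illustration of estimating a dominant parallel-texture direction, the structure-tensor sketch below is one standard way to obtain such an orientation from a grayscale overhead image.

```python
import numpy as np

def dominant_texture_direction(gray):
    """Estimate the dominant texture orientation (radians) of a grayscale
    overhead image using the averaged 2x2 structure tensor.

    Parallel crop rows produce strong gradients perpendicular to the rows,
    so the row direction is orthogonal to the dominant gradient direction.
    """
    gy, gx = np.gradient(gray.astype(float))
    Jxx, Jyy, Jxy = (gx * gx).mean(), (gy * gy).mean(), (gx * gy).mean()
    # Orientation of the dominant gradient; rows run perpendicular to it.
    grad_angle = 0.5 * np.arctan2(2.0 * Jxy, Jxx - Jyy)
    return grad_angle + np.pi / 2.0
```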
Abstract:
The mining industry is highly suitable for the application of robotics and automation technology since the work is both arduous and dangerous. Visual servoing is a means of integrating noncontact visual sensing with machine control to augment or replace operator-based control. This article describes two of our current mining automation projects in order to demonstrate some, perhaps unusual, applications of visual servoing, and also to illustrate some very real problems with robust computer vision.
Abstract:
The International Journal of Robotics Research (IJRR) has a long history of publishing the state of the art in the field of robotic vision. This is the fourth special issue devoted to the topic. Previous special issues were published in 2012 (Volume 31, No. 4), 2010 (Volume 29, Nos 2–3) and 2007 (Volume 26, No. 7, jointly with the International Journal of Computer Vision). In a closely related field, a special issue on Visual Servoing was published in IJRR in 2003 (Volume 22, Nos 10–11). These issues nicely summarize the highlights and progress of the past 12 years of research devoted to the use of visual perception for robotics.
Abstract:
Although robotics research has seen advances over the last decades, robots are still not in widespread use outside industrial applications. Yet a range of proposed scenarios have robots working together with, helping and coexisting with humans in daily life. All of these raise a clear need to deal with a more unstructured, changing environment. I herein present a system that aims to overcome the limitations of highly complex robotic systems in terms of autonomy and adaptation. The main focus of this research is to investigate the use of visual feedback for improving the reaching and grasping capabilities of complex robots. To facilitate this, an integration of computer vision and machine learning techniques is employed. From a robot vision point of view, combining domain knowledge from both image processing and machine learning can expand the capabilities of robots. I present a novel framework called Cartesian Genetic Programming for Image Processing (CGP-IP). CGP-IP can be trained to detect objects in incoming camera streams and has been successfully demonstrated on many different problem domains. The approach is fast, scalable and robust, and requires only small training sets (it was tested with 5 to 10 images per experiment). Additionally, it can generate human-readable programs that can be further customized and tuned. While CGP-IP is a supervised-learning technique, I show an integration on the iCub that allows for the autonomous learning of object detection and identification. Finally, this dissertation includes two proofs of concept that integrate the motion and action sides. First, reactive reaching and grasping is shown: it allows the robot to avoid obstacles detected in the visual stream while reaching for the intended target object. Furthermore, the integration enables use of the robot in non-static environments, i.e. the reaching is adapted on-the-fly from the visual feedback received, e.g. when an obstacle is moved into the trajectory. The second integration highlights the capabilities of these frameworks by improving visual detection through object manipulation actions.
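CGP-IP itself is defined in the dissertation; as a loose illustration of how a Cartesian Genetic Programming genotype encodes an image-processing pipeline, the toy evaluator below flattens the usual CGP grid into a single row of nodes and uses a handful of made-up primitives (the real system draws its node functions from a much larger image-processing library).

```python
import numpy as np

# Toy node functions; CGP-IP uses a far richer set of image operations.
PRIMITIVES = {
    0: lambda a, b: np.clip(a + b, 0, 255),          # add
    1: lambda a, b: np.abs(a - b),                    # absolute difference
    2: lambda a, b: np.minimum(a, b),                 # pixel-wise min
    3: lambda a, b: np.where(a > a.mean(), 255, 0),   # crude threshold
}

def evaluate_cgp(genotype, inputs):
    """Evaluate a flattened CGP genotype on a list of input images.

    genotype: list of (function_id, input_a, input_b) triples; indices refer
              to earlier nodes or to the programme inputs. The last node is
              taken as the output (a detection mask in this toy example).
    """
    values = list(inputs)
    for func_id, a, b in genotype:
        values.append(PRIMITIVES[func_id](values[a], values[b]))
    return values[-1]

# Hypothetical genotype: difference of two channels, then threshold.
img_r = np.random.randint(0, 256, (64, 64))
img_g = np.random.randint(0, 256, (64, 64))
mask = evaluate_cgp([(1, 0, 1), (3, 2, 2)], [img_r, img_g])
```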
Abstract:
This paper provides a comprehensive review of the vision-based See and Avoid problem for unmanned aircraft. The unique problem environment and associated constraints are detailed, followed by an in-depth analysis of visual sensing limitations. In light of such detection and estimation constraints, relevant human, aircraft and robot collision avoidance concepts are then compared from a decision and control perspective. Remarks on system evaluation and certification are also included to provide a holistic review approach. The intention of this work is to clarify common misconceptions, realistically bound feasible design expectations and offer new research directions. It is hoped that this paper will help us to unify design efforts across the aerospace and robotics communities.
Abstract:
The vision sense of standalone robots is limited by line of sight and onboard camera capabilities, but processing video from remote cameras puts a high computational burden on robots. This paper describes the Distributed Robotic Vision Service, DRVS, which implements an on-demand distributed visual object detection service. Robots specify visual information requirements in terms of regions of interest and object detection algorithms. DRVS dynamically distributes the object detection computation to remote vision systems with processing capabilities, and the robots receive high-level object detection information. DRVS relieves robots of managing sensor discovery and reduces data transmission compared to image-sharing models of distributed vision. Navigating a sensorless robot using only remote vision systems is demonstrated in simulation as a proof of concept.
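The service interface is not specified in the abstract; the sketch below shows what a DRVS-style request and reply could plausibly carry, with entirely hypothetical field names, to make concrete how region-of-interest plus object-type requests replace raw image sharing.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class DetectionRequest:
    """Hypothetical DRVS-style request: the robot asks for detections of a
    given object type inside a region of interest, rather than for raw images."""
    region_of_interest: Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max (m)
    object_type: str                                        # e.g. "person", "chair"
    detector: str = "default"                               # detection algorithm to run remotely

@dataclass
class DetectionReply:
    """Hypothetical reply: only high-level object information is returned,
    which is what keeps robot compute load and network traffic low."""
    object_type: str
    positions: List[Tuple[float, float]] = field(default_factory=list)  # world coords (m)
    confidences: List[float] = field(default_factory=list)
```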
Abstract:
Robotic vision is limited by line of sight and onboard camera capabilities. Robots can acquire video or images from remote cameras, but processing the additional data imposes a computational burden. This paper applies the Distributed Robotic Vision Service, DRVS, to robot path planning using data from outside the robot's line of sight. DRVS implements a distributed visual object detection service that distributes the computation to remote camera nodes with processing capabilities. Robots request task-specific object detection from DRVS by specifying a geographic region of interest and an object type. The remote camera nodes perform the visual processing and send the high-level object information to the robot. Additionally, DRVS relieves robots of sensor discovery by dynamically distributing object detection requests to remote camera nodes. Tested on two different indoor path planning tasks, DRVS showed a dramatic reduction in mobile robot compute load and wireless network utilization.
Abstract:
Detect and Avoid (DAA) technology is widely acknowledged as a critical enabler for unsegregated Remotely Piloted Aircraft (RPA) operations, particularly Beyond Visual Line of Sight (BVLOS). Image-based DAA, in the visible spectrum, is a promising technological option for addressing the challenges DAA presents. Two impediments to progress for this approach are the scarcity of available video footage to train and test algorithms, and the lack of testing regimes and specifications that facilitate repeatable, statistically valid performance assessment. This paper includes three key contributions undertaken to address these impediments. In the first instance, we detail our progress towards the creation of a large hybrid collision and near-collision encounter database. Second, we explore the suitability of techniques employed by the biometric research community (Speaker Verification and Language Identification) for DAA performance optimisation and assessment. These techniques include Detection Error Trade-off (DET) curves, Equal Error Rates (EER), and the Detection Cost Function (DCF). Finally, the hybrid database and the speech-based techniques are combined and employed in the assessment of a contemporary, image-based DAA system. This system includes stabilisation, morphological filtering and a Hidden Markov Model (HMM) temporal filter.
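As an illustration of the speech-community metrics listed above, the sketch below computes miss/false-alarm trade-off points, the Equal Error Rate and a NIST-style Detection Cost Function from two score lists; the scores, costs and prior are example values, not those used in the paper.

```python
import numpy as np

def det_points(target_scores, nontarget_scores):
    """Miss and false-alarm rates swept over all candidate thresholds."""
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    p_miss = np.array([(target_scores < t).mean() for t in thresholds])
    p_fa = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    return thresholds, p_miss, p_fa

def equal_error_rate(p_miss, p_fa):
    """EER: the operating point where miss and false-alarm rates are equal."""
    idx = np.argmin(np.abs(p_miss - p_fa))
    return 0.5 * (p_miss[idx] + p_fa[idx])

def detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=1.0, p_target=0.5):
    """NIST-style DCF, minimised over the swept thresholds."""
    dcf = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    return dcf.min()

# Toy usage with scores from a hypothetical aircraft detector.
tgt = np.array([0.9, 0.8, 0.7, 0.4])   # scores on frames containing an aircraft
non = np.array([0.6, 0.3, 0.2, 0.1])   # scores on aircraft-free frames
_, pm, pf = det_points(tgt, non)
print(equal_error_rate(pm, pf), detection_cost(pm, pf))
```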
Abstract:
This thesis addresses a series of topics related to the question of how people find foreground objects in complex scenes. Using both computer vision modeling and psychophysical analyses, we explore the computational principles of low- and mid-level vision.
We first explore the computational methods of generating saliency maps from images and image sequences. We propose an extremely fast algorithm called Image Signature that detects the locations in the image that attract human eye gazes. With a series of experimental validations based on human behavioral data collected from various psychophysical experiments, we conclude that the Image Signature and its spatial-temporal extension, the Phase Discrepancy, are among the most accurate algorithms for saliency detection under various conditions.
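The Image Signature is compact enough to sketch: keep only the sign of the image's DCT coefficients, invert the transform, square, and smooth. The single-channel version below follows that reading using SciPy's DCT routines; the smoothing width is an example value.

```python
import numpy as np
from scipy.fft import dctn, idctn
from scipy.ndimage import gaussian_filter

def image_signature_saliency(gray, sigma=4.0):
    """Single-channel sketch of the Image Signature saliency map.

    The 'signature' is the sign of the 2-D DCT of the image; reconstructing
    from the sign alone concentrates energy on the sparse (salient)
    foreground, and squaring plus Gaussian smoothing gives the map.
    """
    signature = np.sign(dctn(gray.astype(float), norm="ortho"))
    reconstruction = idctn(signature, norm="ortho")
    saliency = gaussian_filter(reconstruction ** 2, sigma=sigma)
    return saliency / saliency.max()
```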
In the second part, we bridge the gap between fixation prediction and salient object segmentation with two efforts. First, we propose a new dataset that contains both fixation and object segmentation information. By simultaneously presenting the two types of human data in the same dataset, we are able to analyze their intrinsic connection, as well as understand the drawbacks of today’s “standard” but inappropriately labeled salient object segmentation dataset. Second, we propose an algorithm for salient object segmentation. Based on our novel discoveries about the connections between fixation data and salient object segmentation data, our model significantly outperforms all existing models on all three datasets by large margins.
In the third part of the thesis, we discuss topics around the human factors of boundary analysis. Closely related to salient object segmentation, boundary analysis focuses on delimiting the local contours of an object. We identify potential pitfalls of algorithm evaluation for the problem of boundary detection. Our analysis indicates that today’s popular boundary detection datasets contain a significant level of noise, which may severely influence benchmarking results. To give further insight into the labeling process, we propose a model that characterizes the human factors at work during labeling.
The analyses reported in this thesis offer new perspectives on a series of interrelated issues in low- and mid-level vision. They raise warning signs about some of today’s “standard” procedures, while proposing new directions to encourage future research.
Abstract:
The visual system is a remarkable platform that evolved to solve difficult computational problems such as detection, recognition, and classification of objects. Of great interest is the face-processing network, a sub-system buried deep in the temporal lobe and dedicated to analyzing a specific type of object (faces). In this thesis, I focus on the problem of face detection by the face-processing network. Insights obtained from years of developing computer-vision algorithms for this task suggest that it may be efficiently and effectively solved by detection and integration of local contrast features. Does the brain use a similar strategy? To answer this question, I embark on a journey that takes me through the development and optimization of dedicated tools for targeting and perturbing deep brain structures. Data collected using MR-guided electrophysiology in early face-processing regions showed strong selectivity for contrast features, similar to those used by artificial systems. While individual cells were tuned to only a small subset of features, the population as a whole encoded the full spectrum of features that are predictive of the presence of a face in an image. Together with additional evidence, my results suggest a possible computational mechanism for face detection in early face-processing regions. To move from correlation to causation, I focus on adopting an emerging technology for perturbing brain activity using light: optogenetics. While this technique has the potential to overcome problems associated with the de facto standard for brain stimulation (electrical microstimulation), many open questions remain about its applicability and effectiveness for perturbing the non-human primate (NHP) brain. In a set of experiments, I use viral vectors to deliver genetically encoded optogenetic constructs to the frontal eye field and face-selective regions in the NHP and examine their effects side-by-side with electrical microstimulation to assess their effectiveness in perturbing neural activity as well as behavior. Results suggest that cells are robustly and strongly modulated upon light delivery and that such perturbation can modulate and even initiate motor behavior, thus paving the way for future explorations that may apply these tools to study connectivity and information flow in the face-processing network.