885 results for SIFT, Computer Vision, Python, Object Recognition, Feature Detection, Descriptor Computation
Abstract:
Safety concerns in the operation of autonomous aerial systems require that safe-landing protocols be followed in situations where the mission should be aborted due to mechanical or other failure. On-board cameras provide information that can be used in the determination of potential landing sites, which are continually updated and ranked to prevent injury and minimize damage. Pulse Coupled Neural Networks (PCNNs) have been used for the detection of features in images that assist in the classification of vegetation and can be used to minimize damage to the aerial vehicle. However, a significant drawback in the use of PCNNs is that they are computationally expensive and have been more suited to off-line applications on conventional computing architectures. As heterogeneous computing architectures are becoming more common, an OpenCL implementation of a PCNN feature generator is presented and its performance is compared across OpenCL kernels designed for CPU, GPU and FPGA platforms. This comparison examines the compute times required for network convergence under a variety of images obtained during unmanned aerial vehicle trials to determine the plausibility for real-time feature detection.
Abstract:
The ability to automate forced landings in an emergency such as engine failure is essential to improving the safety of Unmanned Aerial Vehicles operating in General Aviation airspace. By using active vision to detect safe landing zones below the aircraft, the reliability and safety of such systems are vastly improved by gathering up-to-the-minute information about the ground environment. This paper presents the Site Detection System, a methodology utilising a downward facing camera to analyse the ground environment in both 2D and 3D, detect safe landing sites and characterise them according to size, shape, slope and nearby obstacles. A methodology is presented showing the fusion of landing site detection from 2D imagery with a coarse Digital Elevation Map and dense 3D reconstructions using INS-aided Structure-from-Motion to improve accuracy. Results are presented from an experimental flight showing the precision/recall of landing sites in comparison to a hand-classified ground truth, and improved performance with the integration of 3D analysis from visual Structure-from-Motion.
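The precision/recall comparison against a hand-classified ground truth can be illustrated with a minimal sketch. It assumes landing sites are reduced to discrete identifiers (for example, grid cells); the abstract does not say how detections are matched to ground truth, and the labels below are hypothetical rather than data from the paper.

```python
def precision_recall(detected, ground_truth):
    """Return (precision, recall) for two collections of site identifiers."""
    detected, ground_truth = set(detected), set(ground_truth)
    true_positives = len(detected & ground_truth)
    # Precision: fraction of detected sites that are truly safe.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: fraction of truly safe sites that were detected.
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical grid-cell labels, for illustration only.
detected_sites = {"A1", "A2", "B3", "C4"}
hand_labelled_sites = {"A1", "B3", "C4", "D5", "E6"}
precision, recall = precision_recall(detected_sites, hand_labelled_sites)
print(precision, recall)
```

For a forced-landing system, precision is arguably the more safety-critical figure, since a false positive is an unsafe site reported as safe.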
Abstract:
Safety concerns in the operation of autonomous aerial systems require safe-landing protocols be followed during situations where the mission should be aborted due to mechanical or other failure. This article presents a pulse-coupled neural network (PCNN) to assist in the vegetation classification in a vision-based landing site detection system for an unmanned aircraft. We propose a heterogeneous computing architecture and an OpenCL implementation of a PCNN feature generator. Its performance is compared across OpenCL kernels designed for CPU, GPU, and FPGA platforms. This comparison examines the compute times required for network convergence under a variety of images to determine the plausibility for real-time feature detection.
Abstract:
Traditional nearest points methods use all the samples in an image set to construct a single convex or affine hull model for classification. However, strong artificial features and noisy data may be generated from combinations of training samples when significant intra-class variations and/or noise occur in the image set. Existing multi-model approaches extract local models by clustering each image set individually only once, with fixed clusters used for matching with various image sets. This may not be optimal for discrimination, as undesirable environmental conditions (e.g. illumination and pose variations) may result in the two closest clusters representing different characteristics of an object (e.g. a frontal face being compared to a non-frontal face). To address the above problem, we propose a novel approach to enhance nearest points based methods by integrating affine/convex hull classification with an adapted multi-model approach. We first extract multiple local convex hulls from a query image set via maximum margin clustering to diminish the artificial variations and constrain the noise in local convex hulls. We then propose adaptive reference clustering (ARC) to constrain the clustering of each gallery image set by forcing the clusters to have resemblance to the clusters in the query image set. By applying ARC, noisy clusters in the query set can be discarded. Experiments on Honda, MoBo and ETH-80 datasets show that the proposed method outperforms single model approaches and other recent techniques, such as Sparse Approximated Nearest Points, Mutual Subspace Method and Manifold Discriminant Analysis.
Abstract:
Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may result in the two closest clusters representing different characteristics of an object, due to different undesirable environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to have resemblance to the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).
Abstract:
This paper investigates how neuronal activation for naming photographs of objects is influenced by the addition of appropriate colour or sound. Behaviourally, both colour and sound are known to facilitate object recognition from visual form. However, previous functional imaging studies have shown inconsistent effects. For example, the addition of appropriate colour has been shown to reduce antero-medial temporal activation whereas the addition of sound has been shown to increase posterior superior temporal activation. Here we compared the effect of adding colour or sound cues in the same experiment. We found that the addition of either the appropriate colour or sound increased activation for naming photographs of objects in bilateral occipital regions and the right anterior fusiform. Moreover, the addition of colour reduced left antero-medial temporal activation but this effect was not observed for the addition of object sound. We propose that activation in bilateral occipital and right fusiform areas precedes the integration of visual form with either its colour or associated sound. In contrast, left antero-medial temporal activation is reduced because object recognition is facilitated after colour and form have been integrated.
Abstract:
Recent modelling of socio-economic costs by the Australian railway industry in 2010 has estimated the cost of level crossing accidents to exceed AU$116 million annually. To better understand the causal factors of these accidents, a video analytics application is being developed to automatically detect near-miss incidents using forward facing videos from trains. As near-miss events occur more frequently than collisions, by detecting these occurrences there will be more safety data available for analysis. The application that is being developed will improve the objectivity of near-miss reporting by providing quantitative data about the position of vehicles at level crossings through the automatic analysis of video footage. In this paper we present a novel method for detecting near-miss occurrences at railway level crossings from video data of trains. Our system detects and localizes vehicles at railway level crossings. It also detects the position of railways to calculate the distance of the detected vehicles to the railway centerline. The system logs the information about the position of the vehicles and railway centerline into a database for further analysis by the safety data recording and analysis system, to determine whether or not the event is a near-miss. We present preliminary results of our system on a dataset of videos taken from a train that passed through 14 railway level crossings. We demonstrate the robustness of our system by showing the results of our system on day and night videos.
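The distance from a detected vehicle to the railway centerline reduces, in the simplest case, to a point-to-line distance. The sketch below assumes the vehicle position and two points on the centerline are already expressed in a common ground-plane coordinate frame and that the track is locally straight; neither assumption comes from the abstract, and the function name is hypothetical.

```python
import math


def distance_to_centerline(point, line_start, line_end):
    """Perpendicular distance from a 2D point to the line through two points."""
    px, py = point
    x1, y1 = line_start
    x2, y2 = line_end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("centerline points must be distinct")
    # Magnitude of the 2D cross product divided by the segment length.
    return abs(dx * (py - y1) - dy * (px - x1)) / length


# Hypothetical coordinates in metres: track along the y-axis, vehicle 3 m away.
vehicle_distance = distance_to_centerline((3.0, 4.0), (0.0, 0.0), (0.0, 10.0))
print(vehicle_distance)
```

A logged distance like this, together with a timestamp, is the kind of quantitative record a downstream analysis could threshold to flag a near-miss.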
Abstract:
Abnormal event detection has attracted a lot of attention in the computer vision research community during recent years due to the increased focus on automated surveillance systems to improve security in public places. Due to the scarcity of training data and the definition of an abnormality being dependent on context, abnormal event detection is generally formulated as a data-driven approach where activities are modeled in an unsupervised fashion during the training phase. In this work, we use a Gaussian mixture model (GMM) to cluster the activities during the training phase, and propose a Gaussian mixture model based Markov random field (GMM-MRF) to estimate the likelihood scores of new videos in the testing phase. Furthermore, we propose two new features, optical acceleration and the histogram of optical flow gradients, to detect the presence of any abnormal objects and speed violations in the scene. We show that our proposed method outperforms other state-of-the-art abnormal event detection algorithms on the publicly available UCSD dataset.
Abstract:
This paper presents an online, unsupervised training algorithm enabling vision-based place recognition across a wide range of changing environmental conditions such as those caused by weather, seasons, and day-night cycles. The technique applies principal component analysis to distinguish between aspects of a location’s appearance that are condition-dependent and those that are condition-invariant. Removing the dimensions associated with environmental conditions produces condition-invariant images that can be used by appearance-based place recognition methods. This approach has a unique benefit – it requires training images from only one type of environmental condition, unlike existing data-driven methods that require training images with labelled frame correspondences from two or more environmental conditions. The method is applied to two benchmark variable condition datasets. Performance is equivalent or superior to the current state of the art despite the lesser training requirements, and is demonstrated to generalise to previously unseen locations.
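The idea of removing condition-dependent dimensions with principal component analysis can be sketched as follows. This is a rough illustration only: it assumes the leading principal component carries the condition-dependent variation (the abstract does not say which dimensions are removed), uses plain power iteration in place of a full PCA, and works on tiny hypothetical three-pixel "images".

```python
import math


def top_component(samples, iterations=200):
    """Return (mean, unit vector) for the leading principal component."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        # Apply the covariance matrix implicitly: C v = X^T (X v) / n.
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) / n for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return mean, v


def remove_component(sample, mean, v):
    """Centre a sample and project out the direction v."""
    centered = [s - m for s, m in zip(sample, mean)]
    score = sum(c * x for c, x in zip(centered, v))
    return [c - score * x for c, x in zip(centered, v)]


# Hypothetical flattened images dominated by a global brightness change.
images = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0], [4.0, 5.0, 6.5]]
mean, component = top_component(images)
cleaned = [remove_component(img, mean, component) for img in images]
```

The `cleaned` vectors have no remaining variation along `component`, so an appearance-based matcher comparing them would be insensitive to whatever that direction encoded.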
Abstract:
In studies of germ cell transplantation, measuring tubule diameters and counting cells from different populations using antibodies as markers are very important. Manual measurement of tubule sizes and cell counts is tedious, sanity-grinding work. In this paper, we propose a new boundary weighting based tubule detection method. We first enhance the linear features of the input image and detect the approximate centers of tubules. Next, a boundary weighting transform is applied to the polar transformed image of each tubule region and a circular shortest path is used for the boundary detection. Then, ellipse fitting is carried out for tubule selection and measurement. The algorithm has been tested on a dataset consisting of 20 images, each having about 20 tubules. Experiments show that the detection results of our algorithm are very close to the results obtained manually. © 2013 IEEE.
Abstract:
We propose the use of optical flow information as a method for detecting and describing changes in the environment, from the perspective of a mobile camera. We analyze the characteristics of the optical flow signal and demonstrate how robust flow vectors can be generated and used for the detection of depth discontinuities and appearance changes at key locations. To successfully achieve this task, a full discussion on camera positioning, distortion compensation, noise filtering, and parameter estimation is presented. We then extract statistical attributes from the flow signal to describe the location of the scene changes. We also employ clustering and dominant shape of vectors to increase the descriptiveness. Once a database of nodes (where a node is a detected scene change) and their corresponding flow features is created, matching can be performed whenever nodes are encountered, such that topological localization can be achieved. We retrieve the most likely node according to the Mahalanobis and Chi-square distances between the current frame and the database. The results illustrate the applicability of the technique for detecting and describing scene changes in diverse lighting conditions, considering indoor and outdoor environments and different robot platforms.
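The two distances named for node retrieval can be sketched in a few lines. The version below is a simplification under stated assumptions: the Mahalanobis distance uses a diagonal covariance (per-feature variances) to avoid a matrix inverse, the Chi-square distance is the common symmetric histogram form, and retrieval uses the Chi-square distance alone. How the paper combines the two measures is not stated in the abstract, and the node names and histograms are hypothetical.

```python
import math


def mahalanobis_diagonal(x, mean, variances):
    """Mahalanobis distance assuming uncorrelated features."""
    return math.sqrt(
        sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, variances))
    )


def chi_square(h1, h2):
    """Symmetric Chi-square distance between two histograms."""
    return 0.5 * sum(
        (a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0
    )


def best_node(query_hist, database):
    """Return the name of the stored node closest to the query histogram."""
    return min(database, key=lambda name: chi_square(query_hist, database[name]))


# Hypothetical flow-direction histograms for two stored nodes.
node_database = {"door": [0.6, 0.4], "corridor": [0.1, 0.9]}
match = best_node([0.5, 0.5], node_database)
print(match)
```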
Reactive reaching and grasping on a humanoid: Towards closing the action-perception loop on the iCub
Abstract:
We propose a system incorporating a tight integration between computer vision and robot control modules on a complex, high-DOF humanoid robot. Its functionality is showcased by having our iCub humanoid robot pick up objects from a table in front of it. An important feature is that the system can avoid obstacles (other objects detected in the visual stream) while reaching for the intended target object. Our integration also allows for non-static environments, i.e. the reaching is adapted on-the-fly from the visual feedback received, e.g. when an obstacle is moved into the trajectory. Furthermore, we show that this system can be used in both autonomous and tele-operation scenarios.
Abstract:
Robustness to variations in environmental conditions and camera viewpoint is essential for long-term place recognition, navigation and SLAM. Existing systems typically address one of these problems, but invariance to both remains a challenge. This paper presents a training-free approach to lateral viewpoint- and condition-invariant, vision-based place recognition. Our successive frame patch-tracking technique infers average scene depth along traverses and automatically rescales views of the same place at different depths to increase their similarity. We combine our system with the condition-invariant SMART algorithm and demonstrate place recognition between day and night, across entire 4-lane-plus-median-strip roads, where current algorithms fail.
Abstract:
This paper presents a novel vision-based underwater robotic system for the identification and control of Crown-Of-Thorns starfish (COTS) in coral reef environments. COTS have been identified as one of the most significant threats to Australia's Great Barrier Reef. These starfish literally eat coral, impacting large areas of reef and the marine ecosystem that depends on it. Evidence has suggested that land-based nutrient runoff has accelerated recent outbreaks of COTS requiring extensive use of divers to manually inject biological agents into the starfish in an attempt to control population numbers. Facilitating this control program using robotics is the goal of our research. In this paper we introduce a vision-based COTS detection and tracking system based on a Random Forest Classifier (RFC) trained on images from underwater footage. To track COTS with a moving camera, we embed the RFC in a particle filter detector and tracker where the predicted class probability of the RFC is used as an observation probability to weight the particles, and we use a sparse optical flow estimation for the prediction step of the filter. The system is experimentally evaluated in a realistic laboratory setup using a robotic arm that moves a camera at different speeds and heights over a range of real-size images of COTS in a reef environment.
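The tracking loop described above, with optical flow driving the prediction step and the classifier's class probability weighting the particles, can be sketched as below. This is a minimal illustration, not the authors' implementation: the flow vector and the per-particle probabilities are hypothetical stand-ins for the sparse optical flow estimate and the Random Forest output, and noise injection and resampling are omitted.

```python
def predict(particles, flow):
    """Shift every particle by the estimated image motion (dx, dy)."""
    dx, dy = flow
    return [(x + dx, y + dy) for x, y in particles]


def update(weights, observation_probs):
    """Reweight particles by the observation probability and normalise."""
    unnormalised = [w * p for w, p in zip(weights, observation_probs)]
    total = sum(unnormalised)
    if total == 0:
        # No particle is supported by the observation: fall back to uniform.
        return [1.0 / len(weights)] * len(weights)
    return [u / total for u in unnormalised]


def estimate(particles, weights):
    """Weighted mean position of the particle set."""
    x = sum(w * p[0] for p, w in zip(particles, weights))
    y = sum(w * p[1] for p, w in zip(particles, weights))
    return x, y


# Hypothetical values: three particles, uniform prior, 2-pixel rightward flow.
particles = predict([(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)], flow=(2.0, 0.0))
class_probs = [0.1, 0.8, 0.1]  # stand-in for classifier output at each particle
weights = update([1 / 3, 1 / 3, 1 / 3], class_probs)
position = estimate(particles, weights)
print(position)
```

A practical tracker would also add process noise in `predict` and resample when the weights degenerate.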
Abstract:
This paper presents a system to analyze long field recordings with low signal-to-noise ratio (SNR) for bio-acoustic monitoring. A method based on spectral peak track, Shannon entropy, harmonic structure and oscillation structure is proposed to automatically detect anuran (frog) calling activity. A Gaussian mixture model (GMM) is introduced to model those features. Four anuran species widespread in Queensland, Australia, are selected to evaluate the proposed system. A visualization method based on extracted indices is employed for detection of anuran calling activity which achieves high accuracy.
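Of the features listed, Shannon entropy is the simplest to illustrate. The sketch below treats one frame's power spectrum as a probability distribution and computes its entropy in bits; the exact definition used in the paper is not given in the abstract, so the normalisation and the toy spectra here are assumptions.

```python
import math


def shannon_entropy(power_spectrum):
    """Entropy in bits of a non-negative spectrum normalised to sum to 1."""
    total = sum(power_spectrum)
    if total <= 0:
        return 0.0
    probabilities = [p / total for p in power_spectrum if p > 0]
    return -sum(p * math.log2(p) for p in probabilities)


# Hypothetical four-bin spectra: broadband noise versus a single tonal peak.
noise_entropy = shannon_entropy([1.0, 1.0, 1.0, 1.0])
tonal_entropy = shannon_entropy([1.0, 0.0, 0.0, 0.0])
print(noise_entropy, tonal_entropy)
```

Energy concentrated in a few frequency bins, as in a tonal call, gives low entropy, while broadband noise gives high entropy, which is why the measure can help in low-SNR recordings.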