372 resultados para Image processing, computer-assisted


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Deep convolutional network models have dominated recent work in human action recognition as well as image classification. However, these methods are often unduly influenced by the image background, learning and exploiting the presence of cues in typical computer vision datasets. For unbiased robotics applications, the degree of variation and novelty in action backgrounds is far greater than in computer vision datasets. To address this challenge, we propose an “action region proposal” method that, informed by optical flow, extracts image regions likely to contain actions for input into the network both during training and testing. In a range of experiments, we demonstrate that manually segmenting the background is not enough; but through active action region proposals during training and testing, state-of-the-art or better performance can be achieved on individual spatial and temporal video components. Finally, we show by focusing attention through action region proposals, we can further improve upon the existing state-of-the-art in spatio-temporally fused action recognition performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Clustering identities in a video is a useful task to aid in video search, annotation and retrieval, and cast identification. However, reliably clustering faces across multiple videos is challenging task due to variations in the appearance of the faces, as videos are captured in an uncontrolled environment. A person's appearance may vary due to session variations including: lighting and background changes, occlusions, changes in expression and make up. In this paper we propose the novel Local Total Variability Modelling (Local TVM) approach to cluster faces across a news video corpus; and incorporate this into a novel two stage video clustering system. We first cluster faces within a single video using colour, spatial and temporal cues; after which we use face track modelling and hierarchical agglomerative clustering to cluster faces across the entire corpus. We compare different face recognition approaches within this framework. Experiments on a news video database show that the Local TVM technique is able effectively model the session variation observed in the data, resulting in improved clustering performance, with much greater computational efficiency than other methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we investigate the effectiveness of class specific sparse codes in the context of discriminative action classification. The bag-of-words representation is widely used in activity recognition to encode features, and although it yields state-of-the art performance with several feature descriptors it still suffers from large quantization errors and reduces the overall performance. Recently proposed sparse representation methods have been shown to effectively represent features as a linear combination of an over complete dictionary by minimizing the reconstruction error. In contrast to most of the sparse representation methods which focus on Sparse-Reconstruction based Classification (SRC), this paper focuses on a discriminative classification using a SVM by constructing class-specific sparse codes for motion and appearance separately. Experimental results demonstrates that separate motion and appearance specific sparse coefficients provide the most effective and discriminative representation for each class compared to a single class-specific sparse coefficients.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In competitive combat sporting environments like boxing, the statistics on a boxer's performance, including the amount and type of punches thrown, provide a valuable source of data and feedback which is routinely used for coaching and performance improvement purposes. This paper presents a robust framework for the automatic classification of a boxer's punches. Overhead depth imagery is employed to alleviate challenges associated with occlusions, and robust body-part tracking is developed for the noisy time-of-flight sensors. Punch recognition is addressed through both a multi-class SVM and Random Forest classifiers. A coarse-to-fine hierarchical SVM classifier is presented based on prior knowledge of boxing punches. This framework has been applied to shadow boxing image sequences taken at the Australian Institute of Sport with 8 elite boxers. Results demonstrate the effectiveness of the proposed approach, with the hierarchical SVM classifier yielding a 96% accuracy, signifying its suitability for analysing athletes punches in boxing bouts.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a novel crop detection system applied to the challenging task of field sweet pepper (capsicum) detection. The field-grown sweet pepper crop presents several challenges for robotic systems such as the high degree of occlusion and the fact that the crop can have a similar colour to the background (green on green). To overcome these issues, we propose a two-stage system that performs per-pixel segmentation followed by region detection. The output of the segmentation is used to search for highly probable regions and declares these to be sweet pepper. We propose the novel use of the local binary pattern (LBP) to perform crop segmentation. This feature improves the accuracy of crop segmentation from an AUC of 0.10, for previously proposed features, to 0.56. Using the LBP feature as the basis for our two-stage algorithm, we are able to detect 69.2% of field grown sweet peppers in three sites. This is an impressive result given that the average detection accuracy of people viewing the same colour imagery is 66.8%.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In many parts of the world, uncontrolled fires in sparsely populated areas are a major concern as they can quickly grow into large and destructive conflagrations in short time spans. Detecting these fires has traditionally been a job for trained humans on the ground, or in the air. In many cases, these manned solutions are simply not able to survey the amount of area necessary to maintain sufficient vigilance and coverage. This paper investigates the use of unmanned aerial systems (UAS) for automated wildfire detection. The proposed system uses low-cost, consumer-grade electronics and sensors combined with various airframes to create a system suitable for automatic detection of wildfires. The system employs automatic image processing techniques to analyze captured images and autonomously detect fire-related features such as fire lines, burnt regions, and flammable material. This image recognition algorithm is designed to cope with environmental occlusions such as shadows, smoke and obstructions. Once the fire is identified and classified, it is used to initialize a spatial/temporal fire simulation. This simulation is based on occupancy maps whose fidelity can be varied to include stochastic elements, various types of vegetation, weather conditions, and unique terrain. The simulations can be used to predict the effects of optimized firefighting methods to prevent the future propagation of the fires and greatly reduce time to detection of wildfires, thereby greatly minimizing the ensuing damage. This paper also documents experimental flight tests using a SenseFly Swinglet UAS conducted in Brisbane, Australia as well as modifications for custom UAS.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Detect and Avoid (DAA) technology is widely acknowledged as a critical enabler for unsegregated Remote Piloted Aircraft (RPA) operations, particularly Beyond Visual Line of Sight (BVLOS). Image-based DAA, in the visible spectrum, is a promising technological option for addressing the challenges DAA presents. Two impediments to progress for this approach are the scarcity of available video footage to train and test algorithms, in conjunction with testing regimes and specifications which facilitate repeatable, statistically valid, performance assessment. This paper includes three key contributions undertaken to address these impediments. In the first instance, we detail our progress towards the creation of a large hybrid collision and near-collision encounter database. Second, we explore the suitability of techniques employed by the biometric research community (Speaker Verification and Language Identification), for DAA performance optimisation and assessment. These techniques include Detection Error Trade-off (DET) curves, Equal Error Rates (EER), and the Detection Cost Function (DCF). Finally, the hybrid database and the speech-based techniques are combined and employed in the assessment of a contemporary, image based DAA system. This system includes stabilisation, morphological filtering and a Hidden Markov Model (HMM) temporal filter.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Agricultural pests are responsible for millions of dollars in crop losses and management costs every year. In order to implement optimal site-specific treatments and reduce control costs, new methods to accurately monitor and assess pest damage need to be investigated. In this paper we explore the combination of unmanned aerial vehicles (UAV), remote sensing and machine learning techniques as a promising technology to address this challenge. The deployment of UAVs as a sensor platform is a rapidly growing field of study for biosecurity and precision agriculture applications. In this experiment, a data collection campaign is performed over a sorghum crop severely damaged by white grubs (Coleoptera: Scarabaeidae). The larvae of these scarab beetles feed on the roots of plants, which in turn impairs root exploration of the soil profile. In the field, crop health status could be classified according to three levels: bare soil where plants were decimated, transition zones of reduced plant density and healthy canopy areas. In this study, we describe the UAV platform deployed to collect high-resolution RGB imagery as well as the image processing pipeline implemented to create an orthoimage. An unsupervised machine learning approach is formulated in order to create a meaningful partition of the image into each of the crop levels. The aim of the approach is to simplify the image analysis step by minimizing user input requirements and avoiding the manual data labeling necessary in supervised learning approaches. The implemented algorithm is based on the K-means clustering algorithm. In order to control high-frequency components present in the feature space, a neighbourhood-oriented parameter is introduced by applying Gaussian convolution kernels prior to K-means. The outcome of this approach is a soft K-means algorithm similar to the EM algorithm for Gaussian mixture models. The results show the algorithm delivers decision boundaries that consistently classify the field into three clusters, one for each crop health level. The methodology presented in this paper represents a venue for further research towards automated crop damage assessments and biosecurity surveillance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Scene understanding has been investigated from a mainly visual information point of view. Recently depth has been provided an extra wealth of information, allowing more geometric knowledge to fuse into scene understanding. Yet to form a holistic view, especially in robotic applications, one can create even more data by interacting with the world. In fact humans, when growing up, seem to heavily investigate the world around them by haptic exploration. We show an application of haptic exploration on a humanoid robot in cooperation with a learning method for object segmentation. The actions performed consecutively improve the segmentation of objects in the scene.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Deep convolutional neural networks (DCNNs) have been employed in many computer vision tasks with great success due to their robustness in feature learning. One of the advantages of DCNNs is their representation robustness to object locations, which is useful for object recognition tasks. However, this also discards spatial information, which is useful when dealing with topological information of the image (e.g. scene labeling, face recognition). In this paper, we propose a deeper and wider network architecture to tackle the scene labeling task. The depth is achieved by incorporating predictions from multiple early layers of the DCNN. The width is achieved by combining multiple outputs of the network. We then further refine the parsing task by adopting graphical models (GMs) as a post-processing step to incorporate spatial and contextual information into the network. The new strategy for a deeper, wider convolutional network coupled with graphical models has shown promising results on the PASCAL-Context dataset.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper addresses the challenges of flood mapping using multispectral images. Quantitative flood mapping is critical for flood damage assessment and management. Remote sensing images obtained from various satellite or airborne sensors provide valuable data for this application, from which the information on the extent of flood can be extracted. However the great challenge involved in the data interpretation is to achieve more reliable flood extent mapping including both the fully inundated areas and the 'wet' areas where trees and houses are partly covered by water. This is a typical combined pure pixel and mixed pixel problem. In this paper, an extended Support Vector Machines method for spectral unmixing developed recently has been applied to generate an integrated map showing both pure pixels (fully inundated areas) and mixed pixels (trees and houses partly covered by water). The outputs were compared with the conventional mean based linear spectral mixture model, and better performance was demonstrated with a subset of Landsat ETM+ data recorded at the Daly River Basin, NT, Australia, on 3rd March, 2008, after a flood event.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The most difficult operation in the flood inundation mapping using optical flood images is to separate fully inundated areas from the ‘wet’ areas where trees and houses are partly covered by water. This can be referred as a typical problem the presence of mixed pixels in the images. A number of automatic information extraction image classification algorithms have been developed over the years for flood mapping using optical remote sensing images. Most classification algorithms generally, help in selecting a pixel in a particular class label with the greatest likelihood. However, these hard classification methods often fail to generate a reliable flood inundation mapping because the presence of mixed pixels in the images. To solve the mixed pixel problem advanced image processing techniques are adopted and Linear Spectral unmixing method is one of the most popular soft classification technique used for mixed pixel analysis. The good performance of linear spectral unmixing depends on two important issues, those are, the method of selecting endmembers and the method to model the endmembers for unmixing. This paper presents an improvement in the adaptive selection of endmember subset for each pixel in spectral unmixing method for reliable flood mapping. Using a fixed set of endmembers for spectral unmixing all pixels in an entire image might cause over estimation of the endmember spectra residing in a mixed pixel and hence cause reducing the performance level of spectral unmixing. Compared to this, application of estimated adaptive subset of endmembers for each pixel can decrease the residual error in unmixing results and provide a reliable output. In this current paper, it has also been proved that this proposed method can improve the accuracy of conventional linear unmixing methods and also easy to apply. Three different linear spectral unmixing methods were applied to test the improvement in unmixing results. Experiments were conducted in three different sets of Landsat-5 TM images of three different flood events in Australia to examine the method on different flooding conditions and achieved satisfactory outcomes in flood mapping.