315 resultados para descriptor
Resumo:
Robust descriptor matching across varying lighting conditions is important for vision-based robotics. We present a novel strategy for quantifying the lighting variance of descriptors. The strategy works by utilising recovered low dimensional mappings from Isomap and our measure of the lighting variance of each of these mappings. The resultant metric allows different descriptors to be compared given a dataset and a set of keypoints. We demonstrate that the SIFT descriptor typically has lower lighting variance than other descriptors, although the result depends on semantic class and lighting conditions.
Resumo:
Whole-image descriptors such as GIST have been used successfully for persistent place recognition when combined with temporal filtering or sequential filtering techniques. However, whole-image descriptor localization systems often apply a heuristic rather than a probabilistic approach to place recognition, requiring substantial environmental-specific tuning prior to deployment. In this paper we present a novel online solution that uses statistical approaches to calculate place recognition likelihoods for whole-image descriptors, without requiring either environmental tuning or pre-training. Using a real world benchmark dataset, we show that this method creates distributions appropriate to a specific environment in an online manner. Our method performs comparably to FAB-MAP in raw place recognition performance, and integrates into a state of the art probabilistic mapping system to provide superior performance to whole-image methods that are not based on true probability distributions. The method provides a principled means for combining the powerful change-invariant properties of whole-image descriptors with probabilistic back-end mapping systems without the need for prior training or system tuning.
Resumo:
Whole image descriptors have recently been shown to be remarkably robust to perceptual change especially compared to local features. However, whole-image-based localization systems typically rely on heuristic methods for determining appropriate matching thresholds in a particular environment. These environment-specific tuning requirements and the lack of a meaningful interpretation of these arbitrary thresholds limits the general applicability of these systems. In this paper we present a Bayesian model of probability for whole-image descriptors that can be seamlessly integrated into localization systems designed for probabilistic visual input. We demonstrate this method using CAT-Graph, an appearance-based visual localization system originally designed for a FAB-MAP-style probabilistic input. We show that using whole-image descriptors as visual input extends CAT-Graph’s functionality to environments that experience a greater amount of perceptual change. We also present a method of estimating whole-image probability models in an online manner, removing the need for a prior training phase. We show that this online, automated training method can perform comparably to pre-trained, manually tuned local descriptor methods.
Resumo:
This paper describes a novel obstacle detection system for autonomous robots in agricultural field environments that uses a novelty detector to inform stereo matching. Stereo vision alone erroneously detects obstacles in environments with ambiguous appearance and ground plane such as in broad-acre crop fields with harvested crop residue. The novelty detector estimates the probability density in image descriptor space and incorporates image-space positional understanding to identify potential regions for obstacle detection using dense stereo matching. The results demonstrate that the system is able to detect obstacles typical to a farm at day and night. This system was successfully used as the sole means of obstacle detection for an autonomous robot performing a long term two hour coverage task travelling 8.5 km.
Resumo:
Public Transport Travel Time Variability (PTTV) is essential for understanding the deteriorations in the reliability of travel time, optimizing transit schedules and route choices. This paper establishes the key definitions of PTTV in which firstly include all buses, and secondly include only a single service from a bus route. The paper then analyzes the day-to-day distribution of public transport travel time by using Transit Signal Priority data. A comprehensive approach, using both parametric bootstrapping Kolmogorov-Smirnov test and Bayesian Information Creation technique is developed, recommends Lognormal distribution as the best descriptor of bus travel time on urban corridors. The probability density function of Lognormal distribution is finally used for calculating probability indicators of PTTV. The findings of this study are useful for both traffic managers and statisticians for planning and analyzing the transit systems.
Resumo:
Protocols for bioassessment often relate changes in summary metrics that describe aspects of biotic assemblage structure and function to environmental stress. Biotic assessment using multimetric indices now forms the basis for setting regulatory standards for stream quality and a range of other goals related to water resource management in the USA and elsewhere. Biotic metrics are typically interpreted with reference to the expected natural state to evaluate whether a site is degraded. It is critical that natural variation in biotic metrics along environmental gradients is adequately accounted for, in order to quantify human disturbance-induced change. A common approach used in the IBI is to examine scatter plots of variation in a given metric along a single stream size surrogate and a fit a line (drawn by eye) to form the upper bound, and hence define the maximum likely value of a given metric in a site of a given environmental characteristic (termed the 'maximum species richness line' - MSRL). In this paper we examine whether the use of a single environmental descriptor and the MSRL is appropriate for defining the reference condition for a biotic metric (fish species richness) and for detecting human disturbance gradients in rivers of south-eastern Queensland, Australia. We compare the accuracy and precision of the MSRL approach based on single environmental predictors, with three regression-based prediction methods (Simple Linear Regression, Generalised Linear Modelling and Regression Tree modelling) that use (either singly or in combination) a set of landscape and local scale environmental variables as predictors of species richness. We compared the frequency of classification errors from each method against set biocriteria and contrast the ability of each method to accurately reflect human disturbance gradients at a large set of test sites. The results of this study suggest that the MSRL based upon variation in a single environmental descriptor could not accurately predict species richness at minimally disturbed sites when compared with SLR's based on equivalent environmental variables. Regression-based modelling incorporating multiple environmental variables as predictors more accurately explained natural variation in species richness than did simple models using single environmental predictors. Prediction error arising from the MSRL was substantially higher than for the regression methods and led to an increased frequency of Type I errors (incorrectly classing a site as disturbed). We suggest that problems with the MSRL arise from the inherent scoring procedure used and that it is limited to predicting variation in the dependent variable along a single environmental gradient.
Resumo:
Due to the popularity of security cameras in public places, it is of interest to design an intelligent system that can efficiently detect events automatically. This paper proposes a novel algorithm for multi-person event detection. To ensure greater than real-time performance, features are extracted directly from compressed MPEG video. A novel histogram-based feature descriptor that captures the angles between extracted particle trajectories is proposed, which allows us to capture motion patterns of multi-person events in the video. To alleviate the need for fine-grained annotation, we propose the use of Labelled Latent Dirichlet Allocation, a “weakly supervised” method that allows the use of coarse temporal annotations which are much simpler to obtain. This novel system is able to run at approximately ten times real-time, while preserving state-of-theart detection performance for multi-person events on a 100-hour real-world surveillance dataset (TRECVid SED).
Resumo:
To the trained-eye, experts can often identify a team based on their unique style of play due to their movement, passing and interactions. In this paper, we present a method which can accurately determine the identity of a team from spatiotemporal player tracking data. We do this by utilizing a formation descriptor which is found by minimizing the entropy of role-specific occupancy maps. We show how our approach is significantly better at identifying different teams compared to standard measures (i.e., shots, passes etc.). We demonstrate the utility of our approach using an entire season of Prozone player tracking data from a top-tier professional soccer league.
Resumo:
This paper outlines the approach taken by the Speech, Audio, Image and Video Technologies laboratory, and the Applied Data Mining Research Group (SAIVT-ADMRG) in the 2014 MediaEval Social Event Detection (SED) task. We participated in the event based clustering subtask (subtask 1), and focused on investigating the incorporation of image features as another source of data to aid clustering. In particular, we developed a descriptor based around the use of super-pixel segmentation, that allows a low dimensional feature that incorporates both colour and texture information to be extracted and used within the popular bag-of-visual-words (BoVW) approach.
Resumo:
The impact of simulation methods for social research in the Information Systems (IS) research field remains low. A concern is our field is inadequately leveraging the unique strengths of simulation methods. Although this low impact is frequently attributed to methodological complexity, we offer an alternative explanation – the poor construction of research value. We argue a more intuitive value construction, better connected to the knowledge base, will facilitate increased value and broader appreciation. Meta-analysis of studies published in IS journals over the last decade evidences the low impact. To facilitate value construction, we synthesize four common types of simulation research contribution: Analyzer, Tester, Descriptor, and Theorizer. To illustrate, we employ the proposed typology to describe how each type of value is structured in simulation research and connect each type to instances from IS literature, thereby making these value types and their construction visible and readily accessible to the general IS community.
Resumo:
In the current regulatory climate, there is increasing expectation that law schools will be able to demonstrate students’ acquisition of learning outcomes regarding collaboration skills. We argue that this is best achieved through a stepped and structured whole-of-curriculum approach to small group learning. ‘Group work’ provides deep learning and opportunities to develop professional skills, but these benefits are not always realised for law students. An issue is that what is meant by ‘group work’ is not always clear, resulting in a learning regime that may not support the attainment of desired outcomes. This paper describes different types of ‘group work', each associated with distinct learning outcomes. It suggests that ‘group work’ as an umbrella term to describe these types is confusing, as it provides little indication to students and teachers of the type of learning that is valued and is expected to take place. ‘Small group learning’ is a preferable general descriptor. Identifying different types of small group learning allows law schools to develop and demonstrate a scaffolded, sequential and incremental approach to fostering law students’ collaboration skills. To support learning and the acquisition of higherorder skills, different types of small group learning are more appropriate at certain stages of the program. This structured approach is consistent with social cognitive theory, which suggests that with the guidance of a supportive teacher, students can develop skills and confidence in one type of activity which then enhances motivation to participate in another.
Resumo:
The mean shift tracker has achieved great success in visual object tracking due to its efficiency being nonparametric. However, it is still difficult for the tracker to handle scale changes of the object. In this paper, we associate a scale adaptive approach with the mean shift tracker. Firstly, the target in the current frame is located by the mean shift tracker. Then, a feature point matching procedure is employed to get the matched pairs of the feature point between target regions in the current frame and the previous frame. We employ FAST-9 corner detector and HOG descriptor for the feature matching. Finally, with the acquired matched pairs of the feature point, the affine transformation between target regions in the two frames is solved to obtain the current scale of the target. Experimental results show that the proposed tracker gives satisfying results when the scale of the target changes, with a good performance of efficiency.
Resumo:
We propose the use of optical flow information as a method for detecting and describing changes in the environment, from the perspective of a mobile camera. We analyze the characteristics of the optical flow signal and demonstrate how robust flow vectors can be generated and used for the detection of depth discontinuities and appearance changes at key locations. To successfully achieve this task, a full discussion on camera positioning, distortion compensation, noise filtering, and parameter estimation is presented. We then extract statistical attributes from the flow signal to describe the location of the scene changes. We also employ clustering and dominant shape of vectors to increase the descriptiveness. Once a database of nodes (where a node is a detected scene change) and their corresponding flow features is created, matching can be performed whenever nodes are encountered, such that topological localization can be achieved. We retrieve the most likely node according to the Mahalanobis and Chi-square distances between the current frame and the database. The results illustrate the applicability of the technique for detecting and describing scene changes in diverse lighting conditions, considering indoor and outdoor environments and different robot platforms.
Resumo:
Bioacoustic monitoring has become a significant research topic for species diversity conservation. Due to the development of sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over a large spatial and temporal scale. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data. This can help ecologists make decisions on how to protect and promote the species diversity. This paper presents generic features to characterize a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using short-time Fourier transform, then a ridge detection method is applied to the spectrogram for detecting points of interest. Based on the detected points, a new region representation are explored for describing various bird vocalisations and a local descriptor including temporal entropy, frequency bin entropy and histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experiment results show that our proposed feature set can achieve 0.71 in term of retrieval success rate which outperforms spectral ridge features alone (0.55) and Mel frequency cepstral coefficients (0.36).
Resumo:
In the field of face recognition, sparse representation (SR) has received considerable attention during the past few years, with a focus on holistic descriptors in closed-set identification applications. The underlying assumption in such SR-based methods is that each class in the gallery has sufficient samples and the query lies on the subspace spanned by the gallery of the same class. Unfortunately, such an assumption is easily violated in the face verification scenario, where the task is to determine if two faces (where one or both have not been seen before) belong to the same person. In this study, the authors propose an alternative approach to SR-based face verification, where SR encoding is performed on local image patches rather than the entire face. The obtained sparse signals are pooled via averaging to form multiple region descriptors, which then form an overall face descriptor. Owing to the deliberate loss of spatial relations within each region (caused by averaging), the resulting descriptor is robust to misalignment and various image deformations. Within the proposed framework, they evaluate several SR encoding techniques: l1-minimisation, Sparse Autoencoder Neural Network (SANN) and an implicit probabilistic technique based on Gaussian mixture models. Thorough experiments on AR, FERET, exYaleB, BANCA and ChokePoint datasets show that the local SR approach obtains considerably better and more robust performance than several previous state-of-the-art holistic SR methods, on both the traditional closed-set identification task and the more applicable face verification task. The experiments also show that l1-minimisation-based encoding has a considerably higher computational cost when compared with SANN-based and probabilistic encoding, but leads to higher recognition rates.