132 results for Foreground object segmentation
Abstract:
This paper presents a novel way to speed up the evaluation of a boosting classifier. We make a shallow (flat) network deep (hierarchical) by growing a tree from the decision regions of a given boosting classifier. The tree provides many short paths for speeding up evaluation while preserving the reasonably smooth decision regions of the boosting classifier for good generalisation. For converting a boosting classifier into a decision tree, we formulate a Boolean optimisation problem, which has been previously studied for circuit design but limited to a small number of binary variables. In this work, a novel optimisation method is proposed for, firstly, several tens of variables, i.e. weak-learners of a boosting classifier, and then any larger number of weak-learners by using a two-stage cascade. Experiments on synthetic and face image data sets show that the obtained tree achieves a significant speed-up over both a standard boosting classifier and Fast-exit, a previously described method for speeding up boosting classification, at the same accuracy. The proposed method, as a general meta-algorithm, is also useful for a boosting cascade, where it speeds up individual stage classifiers by different gains. The proposed method is further demonstrated for fast-moving object tracking and segmentation problems. © 2011 Springer Science+Business Media, LLC.
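To make the idea of a "short path" through a boosting classifier concrete, here is a minimal Python sketch of a generic early-exit rule: weak learners are evaluated in order, and evaluation stops as soon as the remaining weights can no longer change the sign of the score. This is an illustrative assumption, not the tree construction proposed in the abstract and not necessarily the Fast-exit method it cites. The function name, the toy stumps and the weights are all hypothetical.

```python
from typing import Callable, Sequence, Tuple


def early_exit_predict(
    x: float,
    weak_learners: Sequence[Callable[[float], int]],
    alphas: Sequence[float],
) -> Tuple[int, int]:
    """Classify x with sign(sum_i alpha_i * h_i(x)), stopping early.

    Each weak learner h_i is assumed to return -1 or +1.
    Returns (label, number_of_weak_learners_evaluated).
    """
    remaining = sum(abs(a) for a in alphas)  # largest change still possible
    score = 0.0
    evaluated = 0
    for h, a in zip(weak_learners, alphas):
        score += a * h(x)
        remaining -= abs(a)
        evaluated += 1
        # Exit once the unevaluated weak learners cannot flip the sign.
        if abs(score) > remaining:
            break
    return (1 if score > 0 else -1), evaluated


# Hypothetical toy ensemble: three threshold stumps on a scalar input.
stumps = [
    lambda x: 1 if x > 0.0 else -1,
    lambda x: 1 if x > 0.5 else -1,
    lambda x: 1 if x > -0.5 else -1,
]
weights = [2.0, 0.5, 0.5]

print(early_exit_predict(1.0, stumps, weights))
```

With these toy weights the first stump alone outweighs the other two combined, so the decision is reached after a single evaluation. With equal weights, more weak learners have to be evaluated before the sign is settled.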
Abstract:
Visual recognition problems often involve classification of myriads of pixels, across scales, to locate objects of interest in an image or to segment images according to object classes. The requirement for high speed and accuracy makes the problems very challenging and has motivated studies on efficient classification algorithms. A novel multi-classifier boosting algorithm is proposed to tackle the multimodal problems by simultaneously clustering samples and boosting classifiers in Section 2. The method is extended into an online version for object tracking in Section 3. Section 4 presents a tree-structured classifier, called Super tree, to further speed up the classification time of a standard boosting classifier. The proposed methods are demonstrated for object detection, tracking and segmentation tasks. © 2013 Springer-Verlag Berlin Heidelberg.
Abstract:
We present algorithms for tracking and reasoning about local traits at the subsystem level based on the observed emergent behavior of multiple coordinated groups in potentially cluttered environments. Our proposed Bayesian inference schemes, which are primarily based on (Markov chain) Monte Carlo sequential methods, include: 1) an evolving network-based multiple object tracking algorithm that is capable of categorizing objects into groups, 2) a multiple cluster tracking algorithm for dealing with a prohibitively large number of objects, and 3) a causality inference framework for identifying dominant agents based exclusively on their observed trajectories. We use these as building blocks for developing a unified tracking and behavioral reasoning paradigm. Both synthetic and realistic examples are provided for demonstrating the derived concepts. © 2013 Springer-Verlag Berlin Heidelberg.
Abstract:
A common approach to visualise multidimensional data sets is to map every data dimension to a separate visual feature. It is generally assumed that such visual features can be judged independently from each other. However, we have recently shown that interactions between features do exist [Hannus et al. 2004; van den Berg et al. 2005]. In those studies, we first determined the individual colour and size contrast or colour and orientation contrast necessary to achieve a fixed level of discrimination performance in single feature search tasks. These contrasts were then used in a conjunction search task in which the target was defined by a combination of a colour and a size or a colour and an orientation. We found that in conjunction search, despite the matched feature discriminability, subjects significantly more often chose an item with the correct colour than one with the correct size or orientation. This finding may have consequences for visualisation: the saliency of information coded by objects' size or orientation may change when there is a need to simultaneously search for colour that codes another aspect of the information. In the present experiment, we studied whether a colour bias can also be found in a more complex and continuous task. Subjects had to search for a target in a node-link diagram consisting of 50 nodes, while their eye movements were being tracked. Each node was assigned a random colour and size (from a range of 10 possible values with fixed perceptual distances). We found that when we base the distances on the mean threshold contrasts that were determined in our previous experiments, the fixated nodes tend to resemble the target colour more than the target size (Figure 1a). This indicates that despite the perceptual matching, colour is judged with greater precision than size during conjunction search. We also found that when we double the size contrast (i.e. the distances between the 10 possible node sizes), this effect disappears (Figure 1b).
Our findings confirm that the previously found decrease in salience of other features during colour conjunction search is also present in more complex (more 'visualisation-realistic') visual search tasks. The asymmetry in visual search behaviour can be compensated for by manipulating step sizes (perceptual distances) within feature dimensions. Our results therefore also imply that feature hierarchies are not completely fixed and may be adapted to the requirements of a particular visualisation. Copyright © 2005 by the Association for Computing Machinery, Inc.
Abstract:
The visual system must learn to infer the presence of objects and features in the world from the images it encounters, and as such it must, either implicitly or explicitly, model the way these elements interact to create the image. Do the response properties of cells in the mammalian visual system reflect this constraint? To address this question, we constructed a probabilistic model in which the identity and attributes of simple visual elements were represented explicitly and learnt the parameters of this model from unparsed, natural video sequences. After learning, the behaviour and grouping of variables in the probabilistic model corresponded closely to functional and anatomical properties of simple and complex cells in the primary visual cortex (V1). In particular, feature identity variables were activated in a way that resembled the activity of complex cells, while feature attribute variables responded much like simple cells. Furthermore, the grouping of the attributes within the model closely parallelled the reported anatomical grouping of simple cells in cat V1. Thus, this generative model makes explicit an interpretation of complex and simple cells as elements in the segmentation of a visual scene into basic independent features, along with a parametrisation of their moment-by-moment appearances. We speculate that such a segmentation may form the initial stage of a hierarchical system that progressively separates the identity and appearance of more articulated visual elements, culminating in view-invariant object recognition.
Abstract:
Due to its importance, video segmentation has regained interest recently. However, there is no common agreement about the necessary ingredients for best performance. This work contributes a thorough analysis of various within- and between-frame affinities suitable for video segmentation. Our results show that a frame-based superpixel segmentation combined with a few motion and appearance-based affinities is sufficient to obtain good video segmentation performance. A second contribution of the paper is the extension of [1] to include motion cues, which makes the algorithm globally aware of motion, thus improving its performance for video sequences. Finally, we contribute an extension of an established image segmentation benchmark [1] to videos, allowing coarse-to-fine video segmentations and multiple human annotations. Our results are tested on BMDS [2], and compared to existing methods. © 2013 Springer-Verlag.
Abstract:
We present Multi Scale Shape Index (MSSI), a novel feature for 3D object recognition. Inspired by scale space filtering theory and the Shape Index measure proposed by Koenderink & Van Doorn [6], this feature associates different forms of shape, such as umbilics, saddle regions and parabolic regions, with a real-valued index. This association is useful for representing an object based on its constituent shape forms. We derive closed-form scale space equations which compute a characteristic scale at each 3D point in a point cloud without an explicit mesh structure. This characteristic scale is then used to estimate the Shape Index. We quantitatively evaluate the robustness and repeatability of the MSSI feature for varying object scales and changing point cloud density. We also quantify the performance of MSSI for object category recognition on a publicly available dataset. © 2013 Springer-Verlag.
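For readers unfamiliar with the Shape Index, the sketch below computes the classical single-scale index of Koenderink & Van Doorn from two principal curvatures, s = (2/π) · arctan((κ2 + κ1) / (κ2 − κ1)) with κ1 ≥ κ2. It is only an illustration of how shape forms map to a real-valued index in [−1, 1]. It does not reproduce the scale-space estimation of MSSI, and the function name and the treatment of planar points are assumptions. Which sign corresponds to a cup or a cap depends on the chosen orientation of the surface normal.

```python
import math


def shape_index(k_a: float, k_b: float) -> float:
    """Classical shape index from two principal curvatures.

    Returns a value in [-1, 1], or NaN for a planar point, where both
    curvatures are zero and the index is undefined.
    """
    k1, k2 = max(k_a, k_b), min(k_a, k_b)  # enforce k1 >= k2
    if k1 == 0.0 and k2 == 0.0:
        return math.nan
    # Same value as (2/pi) * arctan((k2 + k1) / (k2 - k1)); atan2 avoids
    # the division by zero at umbilic points, where k1 == k2.
    return -(2.0 / math.pi) * math.atan2(k1 + k2, k1 - k2)


# Umbilic points sit at the ends of the range, a symmetric saddle at 0,
# and a parabolic point (one zero curvature) halfway in between.
for curvatures in [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (1.0, 0.0)]:
    print(curvatures, shape_index(*curvatures))
```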
Abstract:
We present a novel mixture of trees (MoT) graphical model for video segmentation. Each component in this mixture represents a tree-structured temporal linkage between super-pixels from the first to the last frame of a video sequence. Our time-series model explicitly captures the uncertainty in temporal linkage between adjacent frames, which improves segmentation accuracy. We provide a variational inference scheme for this model to estimate super-pixel labels and their confidences in nearly real time. The efficacy of our approach is demonstrated via quantitative comparisons on the challenging SegTrack joint segmentation and tracking dataset [23].
Abstract:
This paper addresses the basic problem of recovering the 3D surface of an object that is observed in motion by a single camera and under a static but unknown lighting condition. We propose a method to establish pixelwise correspondence between input images by way of depth search, investigating optimal subsets of intensities rather than employing all the relevant pixel values. The thrust of our algorithm is that it is capable of dealing with specularities which appear on top of the shading variance caused by object motion. This holds for both stages: finding sparse point correspondence and dense depth search. We also propose that a linearised image basis can be directly computed by the procedure of finding the correspondence. We illustrate the performance of the theoretical propositions using images of real objects. © 2009. The copyright of this document resides with its authors.