41 results for Object Segmentation
at Indian Institute of Science - Bangalore - India
Abstract:
In this paper, we propose a technique for video object segmentation using patch seams across frames. Typically, seams, which are connected paths of low energy, are utilised for retargeting, where the primary aim is to reduce the image size while preserving the salient image contents. Here, we adapt the formulation of seams for temporal label propagation. The energy function associated with the proposed video seams provides temporal linking of patches across frames, to accurately segment the object. The proposed energy function takes into account the similarity of patches along the seam, temporal consistency of motion and spatial coherency of seams. Label propagation is achieved with high fidelity in the critical boundary regions, utilising the proposed patch seams. To achieve this without additional overheads, we curtail the error propagation by formulating boundary regions as rough-sets. The proposed approach outperforms state-of-the-art supervised and unsupervised algorithms on benchmark datasets.
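As background for the abstract above, the classic retargeting seam (a connected path of low energy) can be found by dynamic programming. The sketch below illustrates only that generic single-image idea, assuming a hypothetical 2-D `energy` grid given as a list of rows; it is not the paper's patch-seam energy, which additionally links patches across frames.

```python
def min_vertical_seam(energy):
    """Return (total_energy, columns) of the lowest-energy connected
    top-to-bottom seam in a 2-D grid (list of equally long rows).

    `energy` is a hypothetical per-pixel cost map, assumed for illustration.
    """
    rows, cols = len(energy), len(energy[0])
    cost = [list(energy[0])]   # cost[r][c]: cheapest seam ending at (r, c)
    back = []                  # back[r - 1][c]: column chosen in row r - 1
    for r in range(1, rows):
        prev = cost[-1]
        row_cost, row_back = [], []
        for c in range(cols):
            # A seam may move at most one column left or right per row.
            lo, hi = max(0, c - 1), min(cols - 1, c + 1)
            best = min(range(lo, hi + 1), key=lambda k: prev[k])
            row_cost.append(energy[r][c] + prev[best])
            row_back.append(best)
        cost.append(row_cost)
        back.append(row_back)

    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda k: cost[-1][k])
    total = cost[-1][c]
    seam = [c]
    for r in range(rows - 1, 0, -1):
        c = back[r - 1][c]
        seam.append(c)
    seam.reverse()
    return total, seam


if __name__ == "__main__":
    grid = [[1, 4, 3],
            [5, 2, 6],
            [7, 8, 1]]
    print(min_vertical_seam(grid))
```

A temporal variant in the spirit of the abstract would presumably replace image rows with frames and pixels with patches, but the cost terms for that case are not given in the abstract and are not reproduced here.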
Abstract:
Image and video analysis requires rich features that can characterize various aspects of visual information. These rich features are typically extracted from the pixel values of the images and videos, which requires a huge amount of computation and is seldom useful for real-time analysis. In contrast, compressed-domain analysis offers relevant information pertaining to the visual content in the form of transform coefficients, motion vectors, quantization steps and coded block patterns, with minimal computational burden. The quantum of work done in the compressed domain is much smaller than in the pixel domain. This paper surveys various video analysis efforts published during the last decade across the spectrum of video compression standards. In this survey, we have included only the analysis part, excluding the processing aspect of the compressed domain. The analysis spans various computer vision applications such as moving object segmentation, human action recognition, indexing, retrieval, face detection, video classification and object tracking in compressed videos.
Abstract:
We present a motion detection algorithm which detects the direction of motion at a sufficient number of points and thus segregates the edge image into clusters of coherently moving points. Unlike most algorithms for motion analysis, we do not estimate the magnitude of velocity vectors or obtain dense motion maps. The motivation is that motion direction information at a number of points seems to be sufficient to evoke perception of motion and hence should be useful in many image processing tasks requiring motion analysis. The algorithm essentially updates the motion estimate from the previous time step dynamically, using the current image frame as input. One of the novel features of the algorithm is the use of a feedback mechanism for evidence segregation. This kind of motion analysis can identify regions in the image that are moving together coherently, and such information could be sufficient for many applications that utilize motion, such as segmentation, compression, and tracking. We present an algorithm for tracking objects using our motion information to demonstrate the potential of this motion detection algorithm.
Abstract:
Medical image segmentation finds application in computer-aided diagnosis, computer-guided surgery, measuring tissue volumes, locating tumors, and pathologies. One approach to segmentation is to use active contours or snakes. Active contours start from an initialization (often manually specified) and are guided by image-dependent forces to the object boundary. Snakes may also be guided by gradient vector fields associated with an image. The first main result in this direction is that of Xu and Prince, who proposed the notion of gradient vector flow (GVF), which is computed iteratively. We propose a new formalism to compute the vector flow based on the notion of bilateral filtering of the gradient field associated with the edge map - we refer to it as the bilateral vector flow (BVF). The range kernel definition that we employ is different from the one employed in the standard Gaussian bilateral filter. The advantage of the BVF formalism is that smooth gradient vector flow fields with enhanced edge information can be computed noniteratively. The quality of image segmentation turned out to be on par with that obtained using the GVF and in some cases better than the GVF.
Abstract:
In this paper we present a segmentation algorithm to extract foreground object motion in a moving camera scenario without any preprocessing step such as tracking selected features, video alignment, or foreground segmentation. By viewing it as a curve fitting problem on advected particle trajectories, we use RANSAC to find the polynomial that best fits the camera motion and identify all trajectories that correspond to the camera motion. The remaining trajectories are those due to the foreground motion. By using the superposition principle, we subtract the motion due to the camera from the foreground trajectories and obtain the true object-induced trajectories. We show that our method performs on par with state-of-the-art techniques, with an execution time speed-up of 10x-40x. We compare the results on real-world datasets such as UCF-ARG, UCF Sports and Liris-HARL. We further show that it can be used to perform video alignment.
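The RANSAC polynomial fit mentioned in the abstract above can be sketched in a simplified 1-D setting. This is a minimal illustration under assumptions: each data point is a hypothetical `(t, y)` pair, the minimal sample of `degree + 1` points defines a candidate polynomial via Lagrange interpolation, and the function name, threshold and iteration count are all illustrative rather than taken from the paper. Points consistent with the best polynomial play the role of camera motion; the rest would be treated as foreground.

```python
import random


def lagrange_eval(sample, t):
    """Evaluate at t the unique polynomial through the (t_i, y_i) pairs in sample."""
    total = 0.0
    for i, (ti, yi) in enumerate(sample):
        term = yi
        for j, (tj, _) in enumerate(sample):
            if j != i:
                term *= (t - tj) / (ti - tj)
        total += term
    return total


def ransac_polynomial(points, degree=2, threshold=0.5, iterations=200, seed=0):
    """Return indices of points consistent with the best-supported polynomial.

    All parameter names and defaults are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        sample = rng.sample(points, degree + 1)
        if len({t for t, _ in sample}) < degree + 1:
            continue  # repeated abscissa: no unique interpolating polynomial
        inliers = [i for i, (t, y) in enumerate(points)
                   if abs(lagrange_eval(sample, t) - y) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


if __name__ == "__main__":
    # Ten samples of a quadratic "camera" curve plus three unrelated points.
    camera = [(float(t), 0.5 * t * t + t) for t in range(10)]
    foreground = [(2.5, 40.0), (4.5, -30.0), (7.5, 100.0)]
    print(ransac_polynomial(camera + foreground))
```

A fuller implementation would normally refit the polynomial to all inliers by least squares before subtracting it from the remaining trajectories; that step is omitted here to keep the sketch dependency-free.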
Abstract:
An iterative algorithm based on probabilistic estimation is described for obtaining the minimum-norm solution of a very large, consistent, linear system of equations Ax = g, where A is an (m × n) matrix with non-negative elements, and x and g are respectively (n × 1) and (m × 1) vectors with positive components.
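To make the term "minimum-norm solution" concrete, the sketch below uses the Kaczmarz row-projection method, which is a different, well-known iterative scheme and not the probabilistic-estimation algorithm of the abstract. For a consistent system it converges to the minimum-norm solution when started from the zero vector. The function name and the sweep count are assumptions made for illustration.

```python
def kaczmarz_min_norm(A, g, sweeps=200):
    """Approximate the minimum-norm solution of a consistent system A x = g.

    Stand-in technique (Kaczmarz projections), not the abstract's algorithm.
    Starting from x = 0 keeps every iterate in the row space of A, which is
    why the limit is the minimum-norm solution.
    """
    n = len(A[0])
    x = [0.0] * n
    for _ in range(sweeps):
        for row, gi in zip(A, g):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0.0:
                continue  # an all-zero row carries no constraint
            residual = gi - sum(a * xi for a, xi in zip(row, x))
            step = residual / norm_sq
            # Project x onto the hyperplane defined by this row.
            x = [xi + step * a for xi, a in zip(x, row)]
    return x


if __name__ == "__main__":
    A = [[1.0, 1.0, 0.0],
         [0.0, 1.0, 1.0]]
    g = [2.0, 2.0]
    print(kaczmarz_min_norm(A, g))  # close to [2/3, 4/3, 2/3]
```

For this small example the closed form A^T (A A^T)^{-1} g gives (2/3, 4/3, 2/3), which the iteration approaches.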
Abstract:
Visual tracking has been a challenging problem in computer vision for decades. The applications of visual tracking are far-reaching, ranging from surveillance and monitoring to smart rooms. The mean-shift (MS) tracker, which has gained attention recently, is known for tracking objects in cluttered environments and for its low computational complexity. The major problem encountered in histogram-based MS is its inability to track rapidly moving objects. In order to track fast-moving objects, we propose a new robust mean-shift tracker that uses both a spatial similarity measure and a color histogram-based similarity measure. The inability of the MS tracker to handle large displacements is circumvented by the spatial similarity-based tracking module, which on its own lacks robustness to changes in the object's appearance. The proposed tracker tracks fast-moving objects more accurately than either of the individual trackers.
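The core mean-shift step behind such trackers can be sketched as a weighted-centroid iteration. This is a minimal illustration assuming a flat kernel and hypothetical per-pixel `weights` (in a histogram-based tracker these would come from comparing the target and candidate color histograms); it does not include the spatial similarity module proposed in the abstract.

```python
def mean_shift(points, weights, start, bandwidth, max_iter=50, tol=1e-6):
    """Move a window centre to the weighted centroid of the points inside it,
    repeating until the shift is below tol.

    points, weights and bandwidth are hypothetical inputs for this sketch.
    """
    cx, cy = start
    for _ in range(max_iter):
        sw = sx = sy = 0.0
        for (x, y), w in zip(points, weights):
            if (x - cx) ** 2 + (y - cy) ** 2 <= bandwidth ** 2:
                sw += w
                sx += w * x
                sy += w * y
        if sw == 0.0:
            break  # nothing inside the window: the centre cannot move
        nx, ny = sx / sw, sy / sw
        shift = ((nx - cx) ** 2 + (ny - cy) ** 2) ** 0.5
        cx, cy = nx, ny
        if shift < tol:
            break
    return cx, cy


if __name__ == "__main__":
    pts = [(4, 5), (6, 5), (5, 4), (5, 6), (5, 5), (20, 20)]
    wts = [1.0] * len(pts)
    print(mean_shift(pts, wts, start=(4.0, 4.0), bandwidth=3.0))
    # A start far from the cluster finds no support and stays put.
    print(mean_shift(pts, wts, start=(12.0, 12.0), bandwidth=3.0))
```

The second call shows the large-displacement limitation the abstract refers to: when the object has moved beyond the window, no weighted points fall inside it and the estimate cannot recover.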
Abstract:
We describe a novel method for human activity segmentation and interpretation in surveillance applications based on Gabor filter-bank features. A complex human activity is modeled as a sequence of elementary human actions such as walking, running, jogging, boxing and hand-waving. Since the human silhouette can be modeled by a set of rectangles, the elementary human actions can be modeled as a sequence of sets of rectangles with different orientations and scales. The activity segmentation is based on Gabor filter-bank features and normalized spectral clustering. The feature trajectories of an action category are learnt from training example videos using dynamic time warping. The combined segmentation and recognition processes are very efficient, as both algorithms share the same framework and the Gabor features computed for the former can be used for the latter. We have also proposed a simple shadow detection technique to extract good silhouettes, which are necessary for good accuracy in action recognition.
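Dynamic time warping, which the abstract above uses to learn feature trajectories, aligns two sequences that evolve at different speeds. The sketch below is the textbook recurrence on scalar sequences with an absolute-difference local cost; the scalar features and the function name are assumptions for illustration, whereas the paper works with Gabor feature trajectories.

```python
def dtw_distance(a, b):
    """Return the dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j]: best alignment cost of a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance a only, advance b only, advance both.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


if __name__ == "__main__":
    # Same shape performed at a different speed aligns at zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(dtw_distance([0, 0], [1, 1]))
```

Because the warping path may repeat elements, an action performed slowly and the same action performed quickly can still be matched, which is what makes the measure suitable for comparing trajectories of varying duration.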
Abstract:
This paper describes a method for automated segmentation of speech that assumes the signal is continuously time-varying, rather than using the traditional short-time stationary model. It has been shown that this representation gives comparable, if not marginally better, results than other techniques for automated segmentation. A formulation of the 'Bach' (music semitonal) frequency scale filter-bank is proposed. A comparative study has been made of the performance of Mel, Bark and Bach scale filter-banks under this model. The preliminary results show up to 80% matches within 20 ms of the manually segmented data, without any information about the content of the text and without any language dependence. 'Bach' filters are seen to marginally outperform the other filters.
Abstract:
This correspondence describes a method for automated segmentation of speech. The proposed method uses a specially designed filter-bank, called the Bach filter-bank, which makes use of 'music'-related perception criteria. The speech signal is treated as a continuously time-varying signal rather than under a short-time stationary model. A comparative study has been made of the performance of Mel, Bark and Bach scale filter-banks. The preliminary results show up to 80% matches within 20 ms of the manually segmented data, without any information about the content of the text and without any language dependence. The Bach filters are seen to marginally outperform the other filters.
Abstract:
During a lightning strike to a tall grounded object (TGO), reflections of current waves are known to occur at either end of the TGO. These reflections modify the channel current and hence the lightning electromagnetic fields. This study aims to identify the possible contributing factors to reflection at a TGO-channel junction for the current waves ascending on the TGO. The possible sources of reflection identified are the corona sheath and discontinuities of resistance and radius. For analyzing the contribution of the corona sheath and the discontinuity of resistance at the junction, a macroscopic physical model for the return stroke developed in our earlier work is employed. NEC-2D is used for assessing the contribution of the abrupt change in radii at a TGO-channel junction. The wire-cage model adopted for this purpose is validated using laboratory experiments. Detailed investigation revealed the following. The main contributor to reflection at a TGO-channel junction is the difference between the TGO and channel core radii. Also, the discontinuity of resistance at a TGO-channel junction can be of some relevance only in the first-microsecond regime. Further, the corona sheath does not play any significant role in the reflection.
Abstract:
We introduce a novel temporal feature of a signal, namely extrema-based signal track length (ESTL), for the problem of speech segmentation. We show that the ESTL measure is sensitive to both the amplitude and the frequency of the signal. The short-time ESTL (ST_ESTL) offers a promising way to capture the significant segments of a speech signal, where the segments correspond to acoustic units of speech having distinct temporal waveforms. We compare ESTL-based segmentation with ML and STM methods and find that it is as good as spectral-feature-based segmentation, but with lower computational complexity.