57 results for Segmentation

at Indian Institute of Science - Bangalore - India


Relevance: 20.00%

Abstract:

We describe a novel method for human activity segmentation and interpretation in surveillance applications based on Gabor filter-bank features. A complex human activity is modeled as a sequence of elementary human actions such as walking, running, jogging, boxing and hand-waving. Since a human silhouette can be modeled by a set of rectangles, the elementary human actions can be modeled as a sequence of sets of rectangles with different orientations and scales. The activity segmentation is based on Gabor filter-bank features and normalized spectral clustering. The feature trajectories of an action category are learnt from training example videos using dynamic time warping. The combined segmentation and recognition processes are very efficient, as both algorithms share the same framework and the Gabor features computed for the former can be reused for the latter. We also propose a simple shadow detection technique to extract a good silhouette, which is necessary for accurate action recognition.
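As a rough illustration of the dynamic time warping step, the following minimal sketch aligns a one-dimensional feature trajectory against stored templates and picks the closest one. The function names, the scalar features and the absolute-difference cost are assumptions made for brevity; the abstract does not specify how the Gabor feature trajectories are compared.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost (assumed: absolute difference)
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW distance to query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

For example, `classify([0, 1, 2, 1, 0], {"wave": [0, 2, 0], "walk": [5, 5, 5]})` selects `"wave"`, because the query can be warped onto that template at much lower cost.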

Relevance: 20.00%

Abstract:

This paper describes a method for automated segmentation of speech that assumes the signal is continuously time-varying, rather than following the traditional short-time stationary model. It is shown that this representation gives comparable, if not marginally better, results than other techniques for automated segmentation. A formulation of the 'Bach' (music semitonal) frequency scale filter-bank is proposed. A comparative study has been made of the performance of Mel, Bark and Bach scale filter banks under this model. The preliminary results show up to 80% matches within 20 ms of the manually segmented data, without any information about the content of the text and without any language dependence. 'Bach' filters are seen to marginally outperform the other filters.
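The abstract does not give the formula for the 'Bach' scale. One plausible reading of "music semitonal" is an equal-tempered grid in which adjacent center frequencies differ by a factor of 2^(1/12). The sketch below generates such center frequencies; the function name and the 440 Hz reference are assumptions, not details taken from the paper.

```python
import math


def semitone_centers(f_low, f_high, f_ref=440.0):
    """Center frequencies (Hz) on an equal-tempered semitone grid in [f_low, f_high].

    Assumption: f_k = f_ref * 2**(k / 12) for integer k; f_ref is hypothetical.
    """
    k_min = math.ceil(12 * math.log2(f_low / f_ref))
    k_max = math.floor(12 * math.log2(f_high / f_ref))
    return [f_ref * 2 ** (k / 12) for k in range(k_min, k_max + 1)]
```

Under this assumption, two octaves from 220 Hz to 880 Hz contain 25 center frequencies, twelve per octave plus the upper endpoint.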

Relevance: 20.00%

Abstract:

This correspondence describes a method for automated segmentation of speech. The proposed method uses a specially designed filter-bank, called the Bach filter-bank, which makes use of 'music'-related perception criteria. The speech signal is treated as a continuously time-varying signal rather than with a short-time stationary model. A comparative study has been made of the performance of Mel, Bark and Bach scale filter banks. The preliminary results show up to 80% matches within 20 ms of the manually segmented data, without any information about the content of the text and without any language dependence. The Bach filters are seen to marginally outperform the other filters.
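The "matches within 20 ms" figure can be read as the fraction of manually marked boundaries that have an automatically detected boundary within a tolerance window. The sketch below computes that fraction; the function name is hypothetical, and the many-to-one matching rule is an assumption, since the abstract does not say whether each detected boundary may match only one reference boundary.

```python
def match_rate(detected, reference, tol=0.020):
    """Fraction of reference boundaries (in seconds) with a detection within tol."""
    if not reference:
        return 0.0
    hits = sum(1 for r in reference
               if any(abs(r - d) <= tol for d in detected))
    return hits / len(reference)
```

With detections at 0.10, 0.31 and 0.55 s and reference boundaries at 0.11, 0.30, 0.50 and 0.80 s, two of the four reference boundaries are matched, giving a rate of 0.5.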

Relevance: 20.00%

Abstract:

We introduce a novel temporal feature of a signal, namely the extrema-based signal track length (ESTL), for the problem of speech segmentation. We show that the ESTL measure is sensitive to both the amplitude and the frequency of the signal. The short-time ESTL (ST_ESTL) offers a promising way to capture the significant segments of a speech signal, where the segments correspond to acoustic units of speech having distinct temporal waveforms. We compare ESTL-based segmentation with the ML and STM methods and find that it is as good as spectral-feature-based segmentation, but with lower computational complexity.

Relevance: 20.00%

Abstract:

Image segmentation is formulated as a stochastic process whose invariant distribution is concentrated at points of the desired region. By choosing multiple seed points, different regions can be segmented. The algorithm is based on the theory of time-homogeneous Markov chains and has been largely motivated by the technique of simulated annealing. The proposed method has been found to perform well on clean as well as noisy real-world images while being computationally far less expensive than stochastic optimisation techniques.

Relevance: 20.00%

Abstract:

This paper discusses an approach for river mapping and flood evaluation based on multi-temporal time series analysis of satellite images utilizing pixel spectral information for image classification and region-based segmentation for extracting water-covered regions. Analysis of MODIS satellite images is applied in three stages: before flood, during flood and after flood. Water regions are extracted from the MODIS images using image classification (based on spectral information) and image segmentation (based on spatial information). Multi-temporal MODIS images from "normal" (non-flood) and flood time-periods are processed in two steps. In the first step, image classifiers such as Support Vector Machines (SVMs) and Artificial Neural Networks (ANNs) separate the image pixels into water and non-water groups based on their spectral features. The classified image is then segmented using spatial features of the water pixels to remove the misclassified water. From the results obtained, we evaluate the performance of the method and conclude that the use of image classification (SVM and ANN) and region-based image segmentation is an accurate and reliable approach for the extraction of water-covered regions. (c) 2012 COSPAR. Published by Elsevier Ltd. All rights reserved.
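The second step, using spatial information to discard misclassified water pixels, can be illustrated with a simple connected-component filter. This is only a sketch of one possible spatial criterion (minimum region size under 4-connectivity); the abstract does not state which spatial features the authors use, and the function name and `min_size` parameter are hypothetical.

```python
from collections import deque


def remove_small_regions(mask, min_size):
    """Keep only 4-connected regions of 1-pixels with at least min_size pixels.

    mask is a list of rows of 0/1 values (1 = classified as water).
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search collects one connected water region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_size:  # small isolated regions are dropped
                for y, x in region:
                    cleaned[y][x] = 1
    return cleaned
```

Applied to a classifier output with one large water body and a single stray water pixel, `remove_small_regions(mask, 2)` keeps the large body and clears the stray pixel.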

Relevance: 20.00%

Abstract:

This paper discusses an approach for river mapping and flood evaluation based on multi-temporal time-series analysis of satellite images utilizing pixel spectral information for image clustering and region-based segmentation for extracting water-covered regions. MODIS satellite images are analyzed at two stages: before flood and during flood. Multi-temporal MODIS images are processed in two steps. In the first step, clustering algorithms such as Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) are used to distinguish the water regions from the non-water regions based on spectral information. These algorithms are chosen because they are quite efficient in solving multi-modal optimization problems. The clustered images are then segmented using spatial features of the water region to extract the river. From the results obtained, we evaluate the performance of the methods and conclude that incorporating region-based image segmentation along with clustering algorithms provides an accurate and reliable approach for the extraction of water-covered regions.

Relevance: 20.00%

Abstract:

Research on recognizing unlimited-vocabulary, online handwritten Indic words is still in its infancy. Most of the focus so far has been on isolated character recognition. In the context of lexicon-free recognition of words, one of the primary issues to be addressed is that of segmentation. As a preliminary attempt, this paper proposes a novel script-independent, lexicon-free method for segmenting online handwritten words into their constituent symbols. Feedback strategies, inspired by neuroscience studies, are proposed for improving the segmentation. The segmentation strategy has been tested on an exhaustive set of 10000 Tamil words collected from a large number of writers. The results show that better segmentation improves the overall recognition performance of the handwriting system.

Relevance: 20.00%

Abstract:

Medical image segmentation finds application in computer-aided diagnosis, computer-guided surgery, measuring tissue volumes, and locating tumors and pathologies. One approach to segmentation is to use active contours or snakes. Active contours start from an initialization (often manually specified) and are guided by image-dependent forces to the object boundary. Snakes may also be guided by gradient vector fields associated with an image. The first main result in this direction is that of Xu and Prince, who proposed the notion of gradient vector flow (GVF), which is computed iteratively. We propose a new formalism to compute the vector flow based on bilateral filtering of the gradient field associated with the edge map, which we refer to as the bilateral vector flow (BVF). The range kernel definition that we employ is different from the one employed in the standard Gaussian bilateral filter. The advantage of the BVF formalism is that smooth gradient vector flow fields with enhanced edge information can be computed noniteratively. The quality of image segmentation turned out to be on par with that obtained using the GVF, and in some cases better.

Relevance: 20.00%

Abstract:

Scenic word images undergo degradations due to motion blur, uneven illumination, shadows and defocussing, which make segmentation difficult. As a result, the recognition results reported on the scenic word image datasets of ICDAR have been low. We introduce a novel technique in which we choose the middle row of the image as a sub-image and segment it first. Then, the labels from this segmented sub-image are propagated to the other pixels in the image. This approach, which is distinct from the existing methods, results in improved segmentation. Bayesian classification and Max-flow methods have been independently used for label propagation. This midline-based approach limits the impact of the degradations that affect the image. The segmented text image is recognized using the trial version of Omnipage OCR. We have tested our method on the ICDAR 2003 and ICDAR 2011 datasets. Our word recognition results of 64.5% and 71.6% are better than those of methods in the literature and of methods that competed in the Robust Reading competition. Our method makes an implicit assumption that degradation is not present in the middle row.
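The midline idea can be sketched as follows: segment only the middle row, estimate one intensity mean per class from it, then label every other pixel by the nearer class mean. Nearest-mean labeling stands in here for the Bayesian classification mentioned in the abstract (it is what a Gaussian classifier with equal variances and equal priors reduces to); the mean-based threshold for the middle row and the function name are assumptions, not the authors' procedure.

```python
def midline_segment(image):
    """Label pixels 0/1 using class means estimated from the middle row only.

    image is a list of rows of grayscale intensities.
    """
    mid = image[len(image) // 2]
    threshold = sum(mid) / len(mid)  # assumed: split the middle row at its mean
    low = [v for v in mid if v <= threshold]
    high = [v for v in mid if v > threshold]
    if not high:  # flat middle row: nothing to separate
        return [[0] * len(row) for row in image]
    mean_low = sum(low) / len(low)
    mean_high = sum(high) / len(high)
    # Propagate labels: each pixel takes the class whose mean is nearer.
    return [[1 if abs(v - mean_high) < abs(v - mean_low) else 0 for v in row]
            for row in image]
```

As the abstract notes, this kind of scheme relies on the middle row being free of degradation, since both class means come from that row alone.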

Relevance: 20.00%

Abstract:

In this paper we present a segmentation algorithm to extract foreground object motion in a moving-camera scenario without any preprocessing step such as tracking selected features, video alignment, or foreground segmentation. By viewing it as a curve-fitting problem on advected particle trajectories, we use RANSAC to find the polynomial that best fits the camera motion and identify all trajectories that correspond to the camera motion. The remaining trajectories are those due to the foreground motion. Using the superposition principle, we subtract the motion due to the camera from the foreground trajectories and obtain the true object-induced trajectories. We show that our method performs on par with state-of-the-art techniques, with an execution-time speed-up of 10x-40x. We compare the results on real-world datasets such as UCF-ARG, UCF Sports and Liris-HARL. We further show that the method can be used to perform video alignment.
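The RANSAC step can be illustrated with a deliberately simplified sketch: a degree-1 polynomial (a line) is fitted to 2-D points, the inliers play the role of camera-motion trajectories, and the outliers play the role of foreground motion. The function name, the tolerance and the reduction to a line are all assumptions; the abstract does not state the polynomial degree or how trajectories are parameterized.

```python
import random


def ransac_line(points, n_iters=200, tol=0.5, seed=0):
    """Fit y = a*x + b by RANSAC; return (a, b, inlier_indices)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best = (0.0, 0.0, [])
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical pair cannot define y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [i for i, (x, y) in enumerate(points)
                   if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best[2]):  # keep the model with the most support
            best = (a, b, inliers)
    return best
```

Points whose indices are not in the returned inlier list are the candidates for foreground motion; subtracting the fitted model from them corresponds to the superposition step described above.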