46 results for Compressed Objects
Abstract:
In this paper, we propose a simple and effective approach to classify H.264 compressed videos by capturing orientation information from the motion vectors. Our major contribution involves computing the Histogram of Oriented Motion Vectors (HOMV) for partially overlapping hierarchical space-time cubes. HOMV is found to be very effective in describing the motion characteristics of these cubes. We then use a Bag of Features (BoF) approach to represent the video as a histogram of HOMV keywords, obtained using k-means clustering. The video feature thus computed is found to be very effective in classifying videos. We demonstrate our results with experiments on two large publicly available video databases.
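To make the idea of an orientation histogram concrete, here is a minimal sketch of how a HOMV-style descriptor could be computed for the motion vectors falling inside one space-time cube. The abstract does not give the exact definition, so the function name `homv`, the number of bins, the magnitude weighting, and the L1 normalisation are all assumptions made for illustration.

```python
import math

def homv(motion_vectors, n_bins=8, min_magnitude=1e-6):
    """Orientation histogram of (dx, dy) motion vectors for one cube.

    Assumptions (not taken from the paper): each vector votes with its
    magnitude, and the histogram is L1-normalised.
    """
    hist = [0.0] * n_bins
    bin_width = 2.0 * math.pi / n_bins
    for dx, dy in motion_vectors:
        magnitude = math.hypot(dx, dy)
        if magnitude < min_magnitude:
            continue  # orientation is undefined for (near-)zero vectors
        angle = math.atan2(dy, dx) % (2.0 * math.pi)  # map to [0, 2*pi]
        index = min(int(angle / bin_width), n_bins - 1)  # clamp the upper edge
        hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist
```

For example, `homv([(1, 1), (2, 2), (-1, 1)], n_bins=4)` puts three quarters of the weight in the first quadrant bin and one quarter in the second. In a Bag of Features pipeline, such per-cube histograms would then be clustered (for instance with k-means) to form the keyword vocabulary.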
Abstract:
Numerous algorithms have been proposed recently for sparse signal recovery in Compressed Sensing (CS). In practice, the number of measurements can be very limited due to the nature of the problem, and/or the underlying statistical distribution of the non-zero elements of the sparse signal may not be known a priori. It has been observed that the performance of any sparse signal recovery algorithm depends on these factors, which makes the selection of a suitable sparse recovery algorithm difficult. To address such situations, we propose a fusion framework in which multiple sparse signal recovery algorithms are employed and their estimates are fused to obtain a better estimate. Theoretical results justifying the performance improvement are shown. The efficacy of the proposed scheme is demonstrated by Monte Carlo simulations using synthetic sparse signals and ECG signals selected from the MIT-BIH database.
Abstract:
Recently, it has been shown that fusing the estimates of a set of sparse recovery algorithms results in an estimate better than the best estimate in the set, especially when the number of measurements is very limited. Though these schemes provide better sparse signal recovery performance, their higher computational requirement makes them less attractive for low latency applications. To alleviate this drawback, in this paper, we develop a progressive fusion based scheme for low latency applications in compressed sensing. In progressive fusion, the estimates of the participating algorithms are fused progressively as they become available. The availability of estimates depends on the computational complexity of the participating algorithms and, in turn, on their latency. Unlike other fusion algorithms, the proposed progressive fusion algorithm provides quick interim results and successive refinements during the fusion process, which is highly desirable in low latency applications. We analyse the developed scheme by providing sufficient conditions for improvement of CS reconstruction quality and show its practical efficacy through numerical experiments using synthetic and real-world data. (C) 2013 Elsevier B.V. All rights reserved.
Abstract:
Although many sparse recovery algorithms have been proposed recently in compressed sensing (CS), it is well known that the performance of any sparse recovery algorithm depends on many parameters, such as the dimension of the sparse signal, the level of sparsity, and the measurement noise power. It has been observed that satisfactory performance of a sparse recovery algorithm requires a minimum number of measurements, and this minimum number differs from algorithm to algorithm. In many applications, the number of measurements is unlikely to meet this requirement, and any scheme that improves performance with fewer measurements is of significant interest in CS. Empirically, it has also been observed that the performance of sparse recovery algorithms depends on the underlying statistical distribution of the nonzero elements of the signal, which may not be known a priori in practice. Interestingly, the performance degradation of the sparse recovery algorithms in these cases does not always imply a complete failure. In this paper, we study this scenario and show that by fusing the estimates of multiple sparse recovery algorithms, which work on different principles, we can improve the sparse signal recovery. We present a theoretical analysis to derive sufficient conditions for performance improvement of the proposed schemes. We demonstrate the advantage of the proposed methods through numerical simulations for both synthetic and real signals.
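The abstract does not state the fusion rule itself, so the following is only a sketch of one simple way such a fusion could work, under stated assumptions: take the union of the supports reported by the participating algorithms, solve a least-squares problem restricted to that joint support, and keep the K largest-magnitude coefficients. The names `fuse_estimates`, `support`, and `solve_linear` are hypothetical, and the sketch assumes the columns of `A` on the joint support are linearly independent.

```python
def support(x, tol=1e-12):
    """Indices of the (numerically) non-zero entries of an estimate."""
    return {i for i, v in enumerate(x) if abs(v) > tol}

def solve_linear(G, b):
    """Solve G z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(G)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("columns on the joint support are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def fuse_estimates(A, y, estimates, K):
    """Fuse several sparse estimates of x, given measurements y = A x (+ noise).

    Assumed rule (illustrative only): least squares on the union of the
    estimated supports, followed by keeping the K largest coefficients.
    """
    n_cols = len(A[0])
    joint = sorted(set().union(*(support(e) for e in estimates)))
    # Columns of A restricted to the joint support.
    cols = [[row[j] for row in A] for j in joint]
    # Normal equations: (A_J^T A_J) z = A_J^T y.
    G = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(a * v for a, v in zip(ci, y)) for ci in cols]
    z = solve_linear(G, rhs)
    # Keep the K largest-magnitude coefficients, zero elsewhere.
    keep = sorted(range(len(joint)), key=lambda i: abs(z[i]), reverse=True)[:K]
    fused = [0.0] * n_cols
    for i in keep:
        fused[joint[i]] = z[i]
    return fused
```

The design intuition is that even when individual algorithms partly miss the true support, their union is more likely to contain it, and the least-squares step then re-estimates the coefficients using the measurements directly.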
Abstract:
This paper discusses a novel high-speed approach for human action recognition in the H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of our work is to portray a much faster algorithm than pixel domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding and minimizes computational load and memory usage, which can result in reduced hardware utilization and fast recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust in outdoor as well as indoor testing scenarios. We have tested our method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions at a speed (>2000 fps) approximately 100 times faster than existing state-of-the-art pixel-domain algorithms.
Abstract:
Monte Carlo modeling of light transport in multilayered tissue (MCML) is modified to incorporate objects of various shapes (sphere, ellipsoid, cylinder, or cuboid) with a refractive-index mismatched boundary. These geometries would be useful for modeling lymph nodes, tumors, blood vessels, capillaries, bones, the head, and other body parts. Mesh-based Monte Carlo (MMC) has also been used for comparison with the results from MCML with embedded objects (MCML-EO). Our simulation assumes a realistic tissue model and can also handle the transmission/reflection at the object-tissue boundary due to the mismatch of the refractive index. Simulation with MCML-EO takes a few seconds, whereas MMC takes nearly an hour for the same geometry and optical properties. Contour plots of fluence distribution from MCML-EO and MMC correlate well. This study helps one decide which tool to use for modeling light propagation in biological tissue with objects of regular shapes embedded in it. For irregular inhomogeneity in the model (tissue), MMC has to be used. If the embedded objects (inhomogeneity) are of regular geometry (shapes), then MCML-EO is a better option, as simulations like Raman scattering, fluorescent imaging, and optical coherence tomography are currently possible only with MCML. (C) 2014 Society of Photo-Optical Instrumentation Engineers (SPIE)
Abstract:
In this work, we explore the prospect of segmenting crowd flow in H.264 compressed videos using only motion vectors. The motion vectors are extracted by partially decoding the corresponding video sequence in the H.264 compressed domain. The region of interest, i.e., the crowd flow region, is extracted, the motion vectors that span the region of interest are preprocessed, and a collective representation of the motion vectors for the entire video is obtained. The resulting motion vectors are then clustered using the EM algorithm. Finally, the clusters that converge to a single flow are merged based on the Bhattacharyya distance between the histograms of the orientation of the motion vectors at the boundaries of the clusters. We implemented our proposed approach on the complex crowd flow dataset provided by [1] and evaluated our results using the Jaccard measure. Since we perform crowd flow segmentation in the compressed domain using only motion vectors, our proposed approach runs much faster than pixel-domain counterparts while still retaining better accuracy.
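The cluster-merging step relies on the Bhattacharyya distance between orientation histograms. A minimal sketch of that distance is given below; the function names and the merge threshold in `should_merge` are hypothetical and are not taken from the paper.

```python
import math

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two non-negative histograms.

    Both histograms are normalised to sum to 1 before comparison.
    Returns 0 for identical distributions and infinity for disjoint ones.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("histograms must have positive total mass")
    # Bhattacharyya coefficient: overlap between the two distributions.
    coefficient = sum(math.sqrt((a / sum_p) * (b / sum_q)) for a, b in zip(p, q))
    coefficient = min(coefficient, 1.0)  # guard against rounding above 1
    return -math.log(coefficient) if coefficient > 0 else math.inf

def should_merge(hist_a, hist_b, threshold=0.1):
    """Hypothetical merge rule: merge when the distance is below a threshold."""
    return bhattacharyya_distance(hist_a, hist_b) < threshold
```

Two boundary histograms with nearly the same orientation profile give a distance close to zero and would be merged under this rule, while clusters flowing in different directions give a large distance and stay separate.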
Abstract:
Rotations in depth are challenging for object vision because features can appear, disappear, be stretched or compressed. Yet we easily recognize objects across views. Are the underlying representations view invariant or dependent? This question has been intensely debated in human vision, but the neuronal representations remain poorly understood. Here, we show that for naturalistic objects, neurons in the monkey inferotemporal (IT) cortex undergo a dynamic transition in time, whereby they are initially sensitive to viewpoint and later encode view-invariant object identity. This transition depended on two aspects of object structure: it was strongest when objects foreshortened strongly across views and were similar to each other. View invariance in IT neurons was present even when objects were reduced to silhouettes, suggesting that it can arise through similarity between external contours of objects across views. Our results elucidate the viewpoint debate by showing that view invariance arises dynamically in IT neurons out of a representation that is initially view dependent.
Abstract:
Large variations in human actions lead to major challenges in computer vision research. Several algorithms have been designed to address these challenges. Algorithms that stand apart solve the challenges while also running faster and more efficiently. In this paper, we propose a human-cognition-inspired projection based learning approach for person-independent human action recognition in the H.264/AVC compressed domain and demonstrate a PBL-McRBFN based approach to help take machine learning algorithms to the next level. Here, we use a gradient image based feature extraction process in which the motion vectors and quantization parameters are extracted and studied temporally to form several Groups of Pictures (GoP). Each GoP is then considered individually for two different benchmark data sets, and the results are classified using person-independent human action recognition. The functional relationship is studied using the Projection Based Learning algorithm of the Meta-cognitive Radial Basis Function Network (PBL-McRBFN), which has a cognitive and a meta-cognitive component. The cognitive component is a radial basis function network, while the Meta-Cognitive Component (MCC) employs self-regulation. The MCC emulates human-cognition-like learning to achieve better performance. The proposed approach can handle sparse information in the compressed video domain and provides higher accuracy than pixel domain counterparts. The feature extraction process achieved more than 90% accuracy using the PBL-McRBFN, which also speeds up the proposed high-speed action recognition algorithm. We conducted twenty random trials to evaluate the performance across GoPs. The results are also compared with other well-known classifiers in the machine learning literature.
Abstract:
In this paper, we propose an H.264/AVC compressed domain human action recognition system with a projection based metacognitive learning classifier (PBL-McRBFN). The features are extracted from the quantization parameters and the motion vectors of the compressed video stream for a time window and used as input to the classifier. Since compressed domain analysis is done with noisy, sparse compression parameters, it is a huge challenge to achieve performance comparable to pixel domain analysis. On the positive side, the compressed domain allows rapid analysis of videos compared to pixel level analysis. The classification results are analyzed for different values of the Group of Pictures (GOP) parameter and for different time windows, including full videos. The functional relationship between the features and action labels is established using PBL-McRBFN with a cognitive and a meta-cognitive component. The cognitive component is a radial basis function network, while the meta-cognitive component employs self-regulation to achieve better performance in the subject-independent action recognition task. The proposed approach is faster and shows comparable performance with respect to the state-of-the-art pixel domain counterparts. It employs partial decoding, which rules out the complexity of full decoding and minimizes computational load and memory usage. This results in reduced hardware utilization and increased speed of classification. The approach is evaluated on two benchmark datasets and shows more than 90% accuracy using the PBL-McRBFN. The performance for various GOP parameters and groups of frames is obtained with twenty random trials and compared with other well-known classifiers in the machine learning literature. (C) 2015 Elsevier B.V. All rights reserved.
Abstract:
This paper discusses a novel high-speed approach for human action recognition in H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of the proposed work is to portray a much faster algorithm than pixel domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding, and minimizes computational load and memory usage, which can result in reduced hardware utilization and faster recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust to outdoor as well as indoor testing scenarios. We have evaluated the performance of the proposed method on two benchmark action datasets and achieved more than 85 % accuracy. The proposed algorithm classifies actions with speed (> 2,000 fps) approximately 100 times faster than existing state-of-the-art pixel-domain algorithms.
Abstract:
In this paper, we propose an anomaly detection algorithm based on the Histogram of Oriented Motion Vectors (HOMV) [1] in a sparse representation framework. Usual behavior is learned at each location by sparsely representing the HOMVs over learnt normal feature bases obtained using an online dictionary learning algorithm. Finally, an anomaly is detected based on the likelihood of the occurrence of sparse coefficients at that location. The proposed approach is found to be robust compared to existing methods, as demonstrated in experiments on the UCSD Ped1 and UCSD Ped2 datasets.
Abstract:
Real-time anomaly detection is the need of the hour for any security application. In this article, we propose a real-time anomaly detection method for H.264 compressed video streams that utilizes pre-encoded motion vectors (MVs). The proposed work is principally motivated by the observation that MVs exhibit different characteristics during an anomaly than during usual behavior. Our observation shows that H.264 MV magnitude and orientation contain relevant information which can be used to model the usual behavior (UB) effectively. This is subsequently extended to detect abnormality/anomaly based on the probability of occurrence of a behavior. The performance of the proposed algorithm was evaluated and benchmarked on the UMN and Ped anomaly detection video datasets, with a detection rate of 70 frames per second resulting in 90x and 250x speedup, along with on-par detection accuracy compared to the state-of-the-art algorithms.
Abstract:
Image and video analysis requires rich features that can characterize various aspects of visual information. These rich features are typically extracted from the pixel values of the images and videos, which requires a huge amount of computation and is seldom useful for real-time analysis. In contrast, compressed domain analysis offers relevant information pertaining to the visual content in the form of transform coefficients, motion vectors, quantization steps, and coded block patterns with minimal computational burden. The amount of work done in the compressed domain is much smaller than in the pixel domain. This paper aims to survey various video analysis efforts published during the last decade across the spectrum of video compression standards. In this survey, we have included only the analysis part, excluding the processing aspect of the compressed domain. This analysis spans various computer vision applications such as moving object segmentation, human action recognition, indexing, retrieval, face detection, video classification, and object tracking in compressed videos.
Abstract:
In gross motion of flexible one-dimensional (1D) objects such as cables, ropes, chains, ribbons and hair, the assumption of constant length is realistic and reasonable. The motion of the object also appears more natural if the motion or disturbance given at one end attenuates along the length of the object. In an earlier work, variational calculus was used to derive natural and length-preserving transformations of planar and spatial curves, which were implemented for flexible 1D objects discretized with a large number of straight segments. This paper proposes a novel idea to reduce computational effort and enable real-time and realistic simulation of the motion of flexible 1D objects. The key idea is to represent the flexible 1D object as a spline and move the underlying control polygon, which has a much smaller number of segments. To preserve the length of the curve to within a prescribed tolerance as the control polygon is moved, the control polygon is adaptively modified by subdivision and merging. New theoretical results relating the length of the curve and the angle between the adjacent segments of the control polygon are derived for quadratic and cubic splines. Depending on the prescribed tolerance on length error, the theoretical results are used to obtain threshold angles for subdivision and merging. Simulation results for arbitrarily chosen planar and spatial curves whose one end is subjected to generic input motions are provided to illustrate the approach. (C) 2016 Elsevier Ltd. All rights reserved.
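The relation between curve length and control-polygon angle can be explored numerically. The sketch below does not reproduce the paper's theoretical results; it only assumes a single quadratic Bézier segment, approximates its arc length by dense sampling, and reports the angle between the two control-polygon segments. All function names are hypothetical.

```python
import math

def bezier2_point(p0, p1, p2, t):
    """Point on a quadratic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    return tuple(u * u * a + 2.0 * u * t * b + t * t * c
                 for a, b, c in zip(p0, p1, p2))

def polyline_length(points):
    """Total length of the straight segments joining consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def bezier2_length(p0, p1, p2, samples=1000):
    """Approximate arc length by sampling the curve densely."""
    pts = [bezier2_point(p0, p1, p2, i / samples) for i in range(samples + 1)]
    return polyline_length(pts)

def turning_angle(p0, p1, p2):
    """Angle (radians) between the adjacent control-polygon segments."""
    a = [q - p for p, q in zip(p0, p1)]
    b = [q - p for p, q in zip(p1, p2)]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return math.acos(max(-1.0, min(1.0, dot / norm)))
```

With collinear control points the turning angle is zero and the curve length equals the control-polygon length; as the angle grows, the curve becomes shorter than the polygon. A length-error tolerance could therefore be turned into a threshold on this angle, which is the kind of criterion the abstract describes for subdivision and merging.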