945 results for H.264
Abstract:
The scalable video coding (SVC) extension of the H.264/AVC standard enables adaptive and flexible delivery for multiple devices and various network conditions. Only a few works have addressed the influence of different scalability parameters (frame rate, spatial resolution, and SNR) on user-perceived quality (UPQ), and only within a limited scope. In this paper, we conduct a subjective quality assessment experiment on video sequences encoded with H.264/SVC to better understand the correlation between video content and UPQ at all scalable layers, and the impact of the rate-distortion method and the different scalabilities on bitrate and UPQ. Findings from this experiment will contribute to a user-centered design of adaptive delivery of scalable video streams.
Abstract:
This paper presents the architecture and the VHDL design of an integer 2-D DCT used in H.264/AVC. The 2-D DCT computation is performed by exploiting its orthogonality and separability properties. The symmetry of the forward and inverse transforms is used in this implementation. To reduce the computation overhead of the addition, subtraction and multiplication operations, we analyze the suitability of the carry-free, position-independent residue number system (RNS) for the implementation of the 2-D DCT. The implementation has been carried out in VHDL for an Altera FPGA. We use the negative number representation in RNS, bit-width analysis of the transforms, and the dedicated registers present in the logic elements of the FPGA to optimize the area. The complexity and efficiency analysis shows that the proposed architecture could provide higher throughput.
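The carry-free property of RNS mentioned above can be illustrated with a minimal Python sketch. The moduli set, the function names and the signed-range convention are assumptions chosen for illustration; the abstract does not state which moduli or encoding the design uses.

```python
from math import prod

# Hypothetical, pairwise-coprime moduli (not taken from the paper).
# Dynamic range M = 7 * 11 * 13 * 15 = 15015.
MODULI = (7, 11, 13, 15)
M = prod(MODULI)


def to_rns(x):
    """Encode a signed integer as one residue per modulus.

    Python's % already returns a non-negative residue, so a negative x
    is implicitly represented as M + x.
    """
    return tuple(x % m for m in MODULI)


def rns_add(a, b):
    """Add channel by channel; no carry crosses between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def rns_sub(a, b):
    """Subtract channel by channel."""
    return tuple((x - y) % m for x, y, m in zip(a, b, MODULI))


def rns_mul(a, b):
    """Multiply channel by channel."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues):
    """Decode with the Chinese Remainder Theorem, then map to signed range."""
    total = 0
    for r, m in zip(residues, MODULI):
        partial = M // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        total += r * partial * pow(partial, -1, m)
    value = total % M
    return value - M if value > M // 2 else value


a, b = to_rns(37), to_rns(-52)
print(from_rns(rns_add(a, b)))  # 37 + (-52)
print(from_rns(rns_mul(a, b)))  # 37 * (-52)
```

Each residue channel works on a small word, which is why additions and multiplications avoid long carry chains; results are only valid while the true value stays within the signed dynamic range of roughly ±M/2.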
Abstract:
This paper reports the design of an input-triggered polymorphic ASIC for an H.264 baseline decoder. Hardware polymorphism is achieved by selectively reusing hardware resources at the system and module levels. The complete design is done using ESL design tools, following a methodology that maintains consistency in testing and verification throughout the design flow. The proposed design can support frame sizes from QCIF to 1080p.
Abstract:
Run-time interoperability between different applications based on H.264/AVC is an emerging need in networked infotainment, where media delivery must match the desired resolution and quality of the end terminals. In this paper, we describe the architecture and design of a polymorphic ASIC to support this. The H.264 decoding flow is partitioned into modules such that the polymorphic ASIC meets the design goals of low power, low area, high flexibility, high throughput and fast interoperability between different profiles and levels of H.264. We demonstrate the idea with a multi-mode decoder that can decode baseline, main and high profile H.264 streams and can interoperate at run time across these profiles. The decoder is capable of processing frame sizes of up to 1024 × 768 at 30 fps. The design, synthesized with UMC 0.13 μm technology, occupies 250 k gates and runs at 100 MHz.
Abstract:
The H.264 video standard achieves high-quality video along with high data compression when compared to other existing video standards. H.264 uses context-based adaptive variable length coding (CAVLC) to code residual data in the Baseline profile. In this paper, we describe a novel architecture for a CAVLC decoder, including a coeff-token decoder, level decoder, total-zeros decoder and run-before decoder. The UMC library in 0.13 μm CMOS technology is used to synthesize the proposed design. The proposed design reduces chip area and improves the critical path performance of the CAVLC decoder in comparison with [1]. Macroblock-level (including luma and chroma) pipeline processing for CAVLC is implemented with an average of 141 cycles (including pipeline buffering) per macroblock at a 250 MHz clock frequency. To compare our results with [1], the clock frequency is constrained to 125 MHz. The area required for the proposed architecture is 17586 gates, which is a 22.1% improvement in comparison to [1]. We obtain a throughput of 1.73 × 10^6 macroblocks/second, which is 28% higher than that reported in [1]. The proposed design meets the processing requirement of 1080HD [5] video at 30 frames/second.
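As a rough sanity check on the 1080HD requirement, the following Python sketch computes the macroblock rate a decoder must sustain and the resulting cycle budget per macroblock. It assumes that 1080HD means 1920 × 1080 luma samples coded in 16 × 16 macroblocks (so the height is padded to 68 macroblock rows); the function names are hypothetical.

```python
import math


def macroblocks_per_second(width, height, fps, mb_size=16):
    """Macroblock rate needed for a given frame size and frame rate.

    Frame dimensions are rounded up to whole macroblocks, so
    1920 x 1080 is treated as 120 x 68 macroblocks.
    """
    mbs_wide = math.ceil(width / mb_size)
    mbs_high = math.ceil(height / mb_size)
    return mbs_wide * mbs_high * fps


def cycle_budget_per_macroblock(clock_hz, width, height, fps):
    """Average clock cycles available per macroblock at a given clock."""
    return clock_hz / macroblocks_per_second(width, height, fps)


required = macroblocks_per_second(1920, 1080, 30)
budget = cycle_budget_per_macroblock(125e6, 1920, 1080, 30)
print(required)       # 244800 macroblocks/second
print(round(budget))  # about 511 cycles per macroblock at 125 MHz
```

Under these assumptions, an average of 141 cycles per macroblock sits well inside the budget even at the constrained 125 MHz clock.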
Abstract:
High-performance video standards use prediction techniques to achieve high picture quality at low bit rates. The type of prediction determines the bit rate and the image quality. Intra prediction achieves high video quality with a significant reduction in bit rate. This paper presents an area-optimized architecture for intra prediction for H.264 decoding at HDTV resolution, with a target of 60 fps. The architecture was validated on a Virtex-5 FPGA-based platform, where it achieves a frame rate of 64 fps. The architecture is based on a multi-level memory hierarchy to reduce latency and ensure optimum resource utilization. It removes redundancy by reusing the same functional blocks across different modes. The proposed architecture uses only 13% of the total LUTs available on the Xilinx FPGA XC5VLX50T.
Abstract:
In this paper, we present a novel macroblock mode decision algorithm to speed up H.264/SVC intra-frame encoding. We replace the complex mode-decision calculations with a classifier that has been trained specifically to minimize the reduction in RD performance. This results in a significant speedup in encoding. The results show that machine learning has great potential and can reduce the complexity substantially with negligible impact on quality. The results show that the proposed method reduces encoding time to about 70% in the base layer and up to 50% in the enhancement layer of the reference implementation, with a negligible loss in quality.
Abstract:
This paper presents the design of an area-optimized integer two-dimensional discrete cosine transform (2-D DCT) used in H.264/AVC codecs. The 2-D DCT calculation is performed by utilizing the separability property, such that the 2-D DCT is divided into two 1-D DCT calculations that are joined through a common memory. Due to its area-optimized approach, the design will find application in mobile devices. The Verilog hardware description language (HDL) in the Cadence environment has been used for design, compilation, simulation and synthesis of the transform block in 0.18 μm TSMC technology.
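The row-column decomposition described above can be sketched in Python as follows. The sketch uses the standard 4 × 4 H.264 forward core transform matrix and models the common memory as a plain transpose; the function names are illustrative and scaling/quantization is left out, so this is not the paper's hardware design.

```python
# Forward core transform matrix of the H.264 4x4 integer DCT.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def dct_1d(v):
    """4-point 1-D integer transform using only additions and doublings."""
    s0, s1 = v[0] + v[3], v[1] + v[2]
    d0, d1 = v[0] - v[3], v[1] - v[2]
    return [s0 + s1, 2 * d0 + d1, s0 - s1, d0 - 2 * d1]


def transpose(block):
    """Stand-in for the transpose memory between the two 1-D stages."""
    return [list(row) for row in zip(*block)]


def dct_2d_separable(block):
    """Compute CF * X * CF^T as two 1-D passes joined by a transpose."""
    first_pass = [dct_1d(row) for row in block]            # X * CF^T
    second_pass = [dct_1d(row) for row in transpose(first_pass)]
    return transpose(second_pass)                          # CF * X * CF^T


def matmul(a, b):
    """Plain matrix product, used only as a reference for comparison."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


block = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
direct = matmul(matmul(CF, block), transpose(CF))
print(dct_2d_separable(block) == direct)  # True
```

The same 1-D unit is reused for both passes, which is the source of the area saving that a separable design offers over a direct 2-D computation.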
Abstract:
H.264 is a video codec standard that delivers high-resolution video even at low bit rates. To provide high throughput at low bit rates, hardware implementations are essential. In this paper, we propose hardware implementations of speed-optimized and area-optimized DCT and quantizer modules. To meet these criteria, we propose two architectures. The first architecture is speed optimized; it gives a high throughput and can meet the requirements of 4096 × 2304 frames at 30 frames/second. The second architecture is area optimized; it occupies 2009 LUTs in Altera's Stratix II and can meet the requirements of 1080HD at 30 frames/second.
Abstract:
High-performance video standards use prediction techniques to achieve high picture quality at low bit rates. The type of prediction determines the bit rate and the image quality. Intra prediction achieves high video quality with a significant reduction in bit rate. This paper presents a novel area-optimized architecture for intra prediction in H.264 decoding at HDTV resolution. The architecture has been validated on a Xilinx Virtex-5 FPGA-based platform and achieved a frame rate of 64 fps. The architecture is based on a multi-level memory hierarchy to reduce latency and ensure optimum resource utilization. It removes redundancy by reusing the same functional blocks across different modes. The proposed architecture uses only 13% of the total LUTs available on the Xilinx FPGA XC5VLX50T.
Abstract:
In this paper, we propose a simple and effective approach to classify H.264 compressed videos by capturing orientation information from the motion vectors. Our major contribution involves computing the Histogram of Oriented Motion Vectors (HOMV) for partially overlapping hierarchical space-time cubes. HOMV is found to be very effective in defining the motion characteristics of these cubes. We then use a Bag of Features (BoF) approach to define the video as a histogram of HOMV keywords, obtained using k-means clustering. The video feature thus computed is found to be very effective in classifying videos. We demonstrate our results with experiments on two large publicly available video databases.
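A histogram of oriented motion vectors for a single space-time cube might be computed along the following lines. This is a minimal Python sketch under assumptions the abstract does not confirm: eight orientation bins, magnitude-weighted votes, and L1 normalization. The function name and input layout (a list of `(dx, dy)` pairs) are hypothetical.

```python
import math


def homv(motion_vectors, n_bins=8):
    """Orientation histogram of the motion vectors in one cube.

    Each (dx, dy) vector votes into the bin of its direction, weighted
    by its magnitude (an assumption). Zero vectors carry no orientation
    and are skipped. The result is normalized to sum to 1.
    """
    hist = [0.0] * n_bins
    for dx, dy in motion_vectors:
        magnitude = math.hypot(dx, dy)
        if magnitude == 0:
            continue
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        bin_index = int(angle / (2 * math.pi) * n_bins) % n_bins
        hist[bin_index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Four diagonal vectors of equal length fall into four different bins.
cube_vectors = [(1, 1), (-1, 1), (-1, -1), (1, -1), (0, 0)]
print(homv(cube_vectors, n_bins=4))
```

In a Bag of Features pipeline, descriptors like this one would be clustered with k-means, and each video would then be described by the histogram of cluster assignments of its cubes.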
Abstract:
This paper discusses a novel high-speed approach for human action recognition in the H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction, followed by classification using Support Vector Machines (SVM). The ultimate goal of our work is to provide a much faster algorithm than pixel-domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding avoids the complexity of full decoding and minimizes computational load and memory usage, which can result in reduced hardware utilization and fast recognition. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust in outdoor as well as indoor testing scenarios. We have tested our method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions at a speed (>2000 fps) approximately 100 times higher than existing state-of-the-art pixel-domain algorithms.
Abstract:
H.264/Advanced Video Coding (AVC) surveillance video encoders use the Skip mode specified by the standard to reduce bandwidth. They also use multiple frames as references for motion-compensated prediction. In this paper, we propose two techniques to reduce the bandwidth and computational cost of static-camera surveillance video encoders without affecting detection and recognition performance. First, a spatial sampler is proposed to sample pixels that are segmented using a Gaussian mixture model. Modified weight updates are derived for the parameters of the mixture model to reduce floating-point computations. The storage pattern of the parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. The second contribution is a low-computational-cost algorithm to choose the reference frames. The proposed reference frame selection algorithm reduces the cost of coding uncovered background regions. We also study the number of reference frames required to achieve good coding efficiency. Distortion over foreground pixels is measured to quantify the performance of the proposed techniques. Experimental results show bit rate savings of up to 94.5% over methods proposed in the literature on video surveillance data sets. The proposed techniques also provide up to a 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence.
Abstract:
In this work, we explore the prospect of segmenting crowd flow in H.264 compressed videos using only motion vectors. The motion vectors are extracted by partially decoding the corresponding video sequence in the H.264 compressed domain. The region of interest, i.e., the crowd flow region, is extracted, the motion vectors that span the region of interest are preprocessed, and a collective representation of the motion vectors for the entire video is obtained. The motion vectors obtained for the video are then clustered using the EM algorithm. Finally, the clusters that converge to a single flow are merged based on the Bhattacharyya distance between the histograms of the orientation of the motion vectors at the boundaries of the clusters. We implemented our proposed approach on the complex crowd flow dataset provided by [1] and compared our results using the Jaccard measure. Since we perform crowd flow segmentation in the compressed domain using only motion vectors, our proposed approach runs much faster than pixel-domain counterparts while still retaining better accuracy.
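The cluster-merging criterion can be illustrated with a short Python sketch of the Bhattacharyya distance between two orientation histograms. The merge threshold and the function names are hypothetical; the abstract does not state the threshold or the number of bins used.

```python
import math


def normalize(hist):
    """Scale a non-negative histogram so that its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def bhattacharyya_distance(hist_p, hist_q):
    """Return -ln(BC), where BC = sum_i sqrt(p_i * q_i).

    The distance is 0 for identical distributions and infinite when
    the two histograms share no occupied bin.
    """
    p, q = normalize(hist_p), normalize(hist_q)
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    coefficient = min(coefficient, 1.0)  # guard against rounding above 1
    return math.inf if coefficient == 0 else -math.log(coefficient)


def should_merge(hist_a, hist_b, threshold=0.1):
    """Merge two clusters when their boundary histograms are close.

    The threshold value is a placeholder, not taken from the paper.
    """
    return bhattacharyya_distance(hist_a, hist_b) < threshold


boundary_a = [10, 40, 30, 5]
boundary_b = [12, 38, 28, 7]
opposite = [0, 0, 0, 50]
print(should_merge(boundary_a, boundary_b))
print(should_merge(boundary_a, opposite))
```

Two clusters whose boundary motion vectors point in similar directions yield a small distance and are merged into one flow, while clusters with dissimilar boundary orientations stay separate.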
Abstract:
In this paper, we propose an H.264/AVC compressed-domain human action recognition system with a projection-based metacognitive learning classifier (PBL-McRBFN). The features are extracted from the quantization parameters and the motion vectors of the compressed video stream for a time window and used as input to the classifier. Since compressed-domain analysis is done with noisy, sparse compression parameters, it is a huge challenge to achieve performance comparable to pixel-domain analysis. On the positive side, the compressed domain allows rapid analysis of videos compared to pixel-level analysis. The classification results are analyzed for different values of the Group of Pictures (GOP) parameter and of the time window, including full videos. The functional relationship between the features and the action labels is established using PBL-McRBFN, which has a cognitive and a meta-cognitive component. The cognitive component is a radial basis function network, while the meta-cognitive component employs self-regulation to achieve better performance in the subject-independent action recognition task. The proposed approach is faster than, and shows comparable performance to, the state-of-the-art pixel-domain counterparts. It employs partial decoding, which avoids the complexity of full decoding and minimizes computational load and memory usage. This results in reduced hardware utilization and increased classification speed. The approach is evaluated on two benchmark datasets and shows more than 90% accuracy using PBL-McRBFN. The performance for various GOP parameters and groups of frames is obtained with twenty random trials and compared with other well-known classifiers in the machine learning literature. © 2015 Elsevier B.V. All rights reserved.