813 resultados para Video endoscopia
Resumo:
Rate control regulates the instantaneous video bit -rate to maximize a picture quality metric while satisfying channel constraints. Typically, a quality metric such as Peak Signalto-Noise ratio (PSNR) or weighted signal -to-noise ratio(WSNR) is chosen out of convenience. However this metric is not always truly representative of perceptual video quality.Attempts to use perceptual metrics in rate control have been limited by the accuracy of the video quality metrics chosen.Recently, new and improved metrics of subjective quality such as the Video quality experts group's (VQEG) NTIA1 General Video Quality Model (VQM) have been proven to have strong correlation with subjective quality. Here, we apply the key principles of the NTIA -VQM model to rate control in order to maximize perceptual video quality. Our experiments demonstrate that applying NTIA -VQM motivated metrics to standard TMN8 rate control in an H.263 encoder results in perceivable quality improvements over a baseline TMN8 / MSE based implementation.
Resumo:
Non-Identical Duplicate video detection is a challenging research problem. Non-Identical Duplicate video are a pair of videos that are not exactly identical but are almost similar.In this paper, we evaluate two methods - Keyframe -based and Tomography-based methods to determine the Non-Identical Duplicate videos. These two methods make use of the existing scale based shift invariant (SIFT) method to find the match between the key frames in first method, and the cross-sections through the temporal axis of the videos in second method.We provide extensive experimental results and the analysis of accuracy and efficiency of the above two methods on a data set of Non- Identical Duplicate video-pair.
Resumo:
Image and video filtering is a key image-processing task in computer vision especially in noisy environment. In most of the cases the noise source is unknown and hence possess a major difficulty in the filtering operation. In this paper we present an error-correction based learning approach for iterative filtering. A new FIR filter is designed in which the filter coefficients are updated based on Widrow-Hoff rule. Unlike the standard filter the proposed filter has the ability to remove noise without the a priori knowledge of the noise. Experimental result shows that the proposed filter efficiently removes the noise and preserves the edges in the image. We demonstrate the capability of the proposed algorithm by testing it on standard images infected by Gaussian noise and on a real time video containing inherent noise. Experimental result shows that the proposed filter is better than some of the existing standard filters
Resumo:
Video streaming applications have hitherto been supported by single server systems. A major drawback of such a solution is that it increases the server load. The server restricts the number of clients that can be simultaneously supported due to limitation in bandwidth. The constraints of a single server system can be overcome in video streaming if we exploit the endless resources available in a distributed and networked system. We explore a P2P system for streaming video applications. In this paper we build a P2P streaming video (SVP2P) service in which multiple peers co-operate to serve video segments for new requests, thereby reducing server load and bandwidth used. Our simulation shows the playback latency using SVP2P is roughly 1/4th of the latency incurred when the server directly streams the video. Bandwidth consumed for control messages (overhead) is as low as 1.5% of the total data transfered. The most important observation is that the capacity of the SVP2P grows dynamically.
Resumo:
Prediction of variable bit rate compressed video traffic is critical to dynamic allocation of resources in a network. In this paper, we propose a technique for preprocessing the dataset used for training a video traffic predictor. The technique involves identifying the noisy instances in the data using a fuzzy inference system. We focus on three prediction techniques, namely, linear regression, neural network and support vector regression and analyze their performance on H.264 video traces. Our experimental results reveal that data preprocessing greatly improves the performance of linear regression and neural network, but is not effective on support vector regression.
Resumo:
Video decoders used in emerging applications need to be flexible to handle a large variety of video formats and deliver scalable performance to handle wide variations in workloads. In this paper we propose a unified software and hardware architecture for video decoding to achieve scalable performance with flexibility. The light weight processor tiles and the reconfigurable hardware tiles in our architecture enable software and hardware implementations to co-exist, while a programmable interconnect enables dynamic interconnection of the tiles. Our process network oriented compilation flow achieves realization agnostic application partitioning and enables seamless migration across uniprocessor, multi-processor, semi hardware and full hardware implementations of a video decoder. An application quality of service aware scheduler monitors and controls the operation of the entire system. We prove the concept through a prototype of the architecture on an off-the-shelf FPGA. The FPGA prototype shows a scaling in performance from QCIF to 1080p resolutions in four discrete steps. We also demonstrate that the reconfiguration time is short enough to allow migration from one configuration to the other without any frame loss.
Resumo:
Advertising is ubiquitous in the online community and more so in the ever-growing and popular online video delivery websites (e. g., YouTube). Video advertising is becoming increasingly popular on these websites. In addition to the existing pre-roll/post-roll advertising and contextual advertising, this paper proposes an in-stream video advertising strategy-Computational Affective Video-in-Video Advertising (CAVVA). Humans being emotional creatures are driven by emotions as well as rational thought. We believe that emotions play a major role in influencing the buying behavior of users and hence propose a video advertising strategy which takes into account the emotional impact of the videos as well as advertisements. Given a video and a set of advertisements, we identify candidate advertisement insertion points (step 1) and also identify the suitable advertisements (step 2) according to theories from marketing and consumer psychology. We formulate this two part problem as a single optimization function in a non-linear 0-1 integer programming framework and provide a genetic algorithm based solution. We evaluate CAVVA using a subjective user-study and eye-tracking experiment. Through these experiments, we demonstrate that CAVVA achieves a good balance between the following seemingly conflicting goals of (a) minimizing the user disturbance because of advertisement insertion while (b) enhancing the user engagement with the advertising content. We compare our method with existing advertising strategies and show that CAVVA can enhance the user's experience and also help increase the monetization potential of the advertising content.
Resumo:
In this paper, we have proposed a simple and effective approach to classify H.264 compressed videos, by capturing orientation information from the motion vectors. Our major contribution involves computing Histogram of Oriented Motion Vectors (HOMV) for overlapping hierarchical Space-Time cubes. The Space-Time cubes selected are partially overlapped. HOMV is found to be very effective to define the motion characteristics of these cubes. We then use Bag of Features (B OF) approach to define the video as histogram of HOMV keywords, obtained using k-means clustering. The video feature, thus computed, is found to be very effective in classifying videos. We demonstrate our results with experiments on two large publicly available video database.
Resumo:
This paper discusses a novel high-speed approach for human action recognition in H. 264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of our work is to portray a much faster algorithm than pixel domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding, and minimizes computational load and memory usage, which can effect in reduced hardware utilization and fast recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust in outdoor as well as indoor testing scenarios. We have tested our method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions with speed (>2000 fps) approximately 100 times more than existing state-of-the-art pixel-domain algorithms.
Resumo:
H. 264/advanced video coding surveillance video encoders use the Skip mode specified by the standard to reduce bandwidth. They also use multiple frames as reference for motion-compensated prediction. In this paper, we propose two techniques to reduce the bandwidth and computational cost of static camera surveillance video encoders without affecting detection and recognition performance. A spatial sampler is proposed to sample pixels that are segmented using a Gaussian mixture model. Modified weight updates are derived for the parameters of the mixture model to reduce floating point computations. A storage pattern of the parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. The second contribution is a low computational cost algorithm to choose the reference frames. The proposed reference frame selection algorithm reduces the cost of coding uncovered background regions. We also study the number of reference frames required to achieve good coding efficiency. Distortion over foreground pixels is measured to quantify the performance of the proposed techniques. Experimental results show bit rate savings of up to 94.5% over methods proposed in literature on video surveillance data sets. The proposed techniques also provide up to 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence.
Resumo:
This paper discusses a novel high-speed approach for human action recognition in H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of the proposed work is to portray a much faster algorithm than pixel domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding, and minimizes computational load and memory usage, which can result in reduced hardware utilization and faster recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust to outdoor as well as indoor testing scenarios. We have evaluated the performance of the proposed method on two benchmark action datasets and achieved more than 85 % accuracy. The proposed algorithm classifies actions with speed (> 2,000 fps) approximately 100 times faster than existing state-of-the-art pixel-domain algorithms.
Resumo:
In this paper, we propose a technique for video object segmentation using patch seams across frames. Typically, seams, which are connected paths of low energy, are utilised for retargeting, where the primary aim is to reduce the image size while preserving the salient image contents. Here, we adapt the formulation of seams for temporal label propagation. The energy function associated with the proposed video seams provides temporal linking of patches across frames, to accurately segment the object. The proposed energy function takes into account the similarity of patches along the seam, temporal consistency of motion and spatial coherency of seams. Label propagation is achieved with high fidelity in the critical boundary regions, utilising the proposed patch seams. To achieve this without additional overheads, we curtail the error propagation by formulating boundary regions as rough-sets. The proposed approach out-perform state-of-the-art supervised and unsupervised algorithms, on benchmark datasets.