954 resultados para video images
Resumo:
Advertising is ubiquitous in the online community and more so in the ever-growing and popular online video delivery websites (e. g., YouTube). Video advertising is becoming increasingly popular on these websites. In addition to the existing pre-roll/post-roll advertising and contextual advertising, this paper proposes an in-stream video advertising strategy-Computational Affective Video-in-Video Advertising (CAVVA). Humans being emotional creatures are driven by emotions as well as rational thought. We believe that emotions play a major role in influencing the buying behavior of users and hence propose a video advertising strategy which takes into account the emotional impact of the videos as well as advertisements. Given a video and a set of advertisements, we identify candidate advertisement insertion points (step 1) and also identify the suitable advertisements (step 2) according to theories from marketing and consumer psychology. We formulate this two part problem as a single optimization function in a non-linear 0-1 integer programming framework and provide a genetic algorithm based solution. We evaluate CAVVA using a subjective user-study and eye-tracking experiment. Through these experiments, we demonstrate that CAVVA achieves a good balance between the following seemingly conflicting goals of (a) minimizing the user disturbance because of advertisement insertion while (b) enhancing the user engagement with the advertising content. We compare our method with existing advertising strategies and show that CAVVA can enhance the user's experience and also help increase the monetization potential of the advertising content.
Resumo:
In this paper, we have proposed a simple and effective approach to classify H.264 compressed videos, by capturing orientation information from the motion vectors. Our major contribution involves computing Histogram of Oriented Motion Vectors (HOMV) for overlapping hierarchical Space-Time cubes. The Space-Time cubes selected are partially overlapped. HOMV is found to be very effective to define the motion characteristics of these cubes. We then use Bag of Features (B OF) approach to define the video as histogram of HOMV keywords, obtained using k-means clustering. The video feature, thus computed, is found to be very effective in classifying videos. We demonstrate our results with experiments on two large publicly available video database.
Resumo:
The model-based image reconstruction approaches in photoacoustic tomography have a distinct advantage compared to traditional analytical methods for cases where limited data is available. These methods typically deploy Tikhonov based regularization scheme to reconstruct the initial pressure from the boundary acoustic data. The model-resolution for these cases represents the blur induced by the regularization scheme. A method that utilizes this blurring model and performs the basis pursuit deconvolution to improve the quantitative accuracy of the reconstructed photoacoustic image is proposed and shown to be superior compared to other traditional methods via three numerical experiments. Moreover, this deconvolution including the building of an approximate blur matrix is achieved via the Lanczos bidagonalization (least-squares QR) making this approach attractive in real-time. (C) 2014 Optical Society of America
Resumo:
This paper discusses a novel high-speed approach for human action recognition in H. 264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of our work is to portray a much faster algorithm than pixel domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding, and minimizes computational load and memory usage, which can effect in reduced hardware utilization and fast recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust in outdoor as well as indoor testing scenarios. We have tested our method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions with speed (>2000 fps) approximately 100 times more than existing state-of-the-art pixel-domain algorithms.
Resumo:
Head pose classification from surveillance images acquired with distant, large field-of-view cameras is difficult as faces are captured at low-resolution and have a blurred appearance. Domain adaptation approaches are useful for transferring knowledge from the training (source) to the test (target) data when they have different attributes, minimizing target data labeling efforts in the process. This paper examines the use of transfer learning for efficient multi-view head pose classification with minimal target training data under three challenging situations: (i) where the range of head poses in the source and target images is different, (ii) where source images capture a stationary person while target images capture a moving person whose facial appearance varies under motion due to changing perspective, scale and (iii) a combination of (i) and (ii). On the whole, the presented methods represent novel transfer learning solutions employed in the context of multi-view head pose classification. We demonstrate that the proposed solutions considerably outperform the state-of-the-art through extensive experimental validation. Finally, the DPOSE dataset compiled for benchmarking head pose classification performance with moving persons, and to aid behavioral understanding applications is presented in this work.
Resumo:
H. 264/advanced video coding surveillance video encoders use the Skip mode specified by the standard to reduce bandwidth. They also use multiple frames as reference for motion-compensated prediction. In this paper, we propose two techniques to reduce the bandwidth and computational cost of static camera surveillance video encoders without affecting detection and recognition performance. A spatial sampler is proposed to sample pixels that are segmented using a Gaussian mixture model. Modified weight updates are derived for the parameters of the mixture model to reduce floating point computations. A storage pattern of the parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. The second contribution is a low computational cost algorithm to choose the reference frames. The proposed reference frame selection algorithm reduces the cost of coding uncovered background regions. We also study the number of reference frames required to achieve good coding efficiency. Distortion over foreground pixels is measured to quantify the performance of the proposed techniques. Experimental results show bit rate savings of up to 94.5% over methods proposed in literature on video surveillance data sets. The proposed techniques also provide up to 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence.
Resumo:
Representing images and videos in the form of compact codes has emerged as an important research interest in the vision community, in the context of web scale image/video search. Recently proposed Vector of Locally Aggregated Descriptors (VLAD), has been shown to outperform the existing retrieval techniques, while giving a desired compact representation. VLAD aggregates the local features of an image in the feature space. In this paper, we propose to represent the local features extracted from an image, as sparse codes over an over-complete dictionary, which is obtained by K-SVD based dictionary training algorithm. The proposed VLAD aggregates the residuals in the space of these sparse codes, to obtain a compact representation for the image. Experiments are performed over the `Holidays' database using SIFT features. The performance of the proposed method is compared with the original VLAD. The 4% increment in the mean average precision (mAP) indicates the better retrieval performance of the proposed sparse coding based VLAD.
Resumo:
Breast cancer is one of the leading cause of cancer related deaths in women and early detection is crucial for reducing mortality rates. In this paper, we present a novel and fully automated approach based on tissue transition analysis for lesion detection in breast ultrasound images. Every candidate pixel is classified as belonging to the lesion boundary, lesion interior or normal tissue based on its descriptor value. The tissue transitions are modeled using a Markov chain to estimate the likelihood of a candidate lesion region. Experimental evaluation on a clinical dataset of 135 images show that the proposed approach can achieve high sensitivity (95 %) with modest (3) false positives per image. The approach achieves very similar results (94 % for 3 false positives) on a completely different clinical dataset of 159 images without retraining, highlighting the robustness of the approach.
Resumo:
Images obtained through fluorescence microscopy at low numerical aperture (NA) are noisy and have poor resolution. Images of specimens such as F-actin filaments obtained using confocal or widefield fluorescence microscopes contain directional information and it is important that an image smoothing or filtering technique preserve the directionality. F-actin filaments are widely studied in pathology because the abnormalities in actin dynamics play a key role in diagnosis of cancer, cardiac diseases, vascular diseases, myofibrillar myopathies, neurological disorders, etc. We develop the directional bilateral filter as a means of filtering out the noise in the image without significantly altering the directionality of the F-actin filaments. The bilateral filter is anisotropic to start with, but we add an additional degree of anisotropy by employing an oriented domain kernel for smoothing. The orientation is locally adapted using a structure tensor and the parameters of the bilateral filter are optimized for within the framework of statistical risk minimization. We show that the directional bilateral filter has better denoising performance than the traditional Gaussian bilateral filter and other denoising techniques such as SURE-LET, non-local means, and guided image filtering at various noise levels in terms of peak signal-to-noise ratio (PSNR). We also show quantitative improvements in low NA images of F-actin filaments. (C) 2015 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution 3.0 Unported License.
Resumo:
This paper discusses a novel high-speed approach for human action recognition in H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of the proposed work is to portray a much faster algorithm than pixel domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding, and minimizes computational load and memory usage, which can result in reduced hardware utilization and faster recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust to outdoor as well as indoor testing scenarios. We have evaluated the performance of the proposed method on two benchmark action datasets and achieved more than 85 % accuracy. The proposed algorithm classifies actions with speed (> 2,000 fps) approximately 100 times faster than existing state-of-the-art pixel-domain algorithms.
Resumo:
We propose optimal bilateral filtering techniques for Gaussian noise suppression in images. To achieve maximum denoising performance via optimal filter parameter selection, we adopt Stein's unbiased risk estimate (SURE)-an unbiased estimate of the mean-squared error (MSE). Unlike MSE, SURE is independent of the ground truth and can be used in practical scenarios where the ground truth is unavailable. In our recent work, we derived SURE expressions in the context of the bilateral filter and proposed SURE-optimal bilateral filter (SOBF). We selected the optimal parameters of SOBF using the SURE criterion. To further improve the denoising performance of SOBF, we propose variants of SOBF, namely, SURE-optimal multiresolution bilateral filter (SMBF), which involves optimal bilateral filtering in a wavelet framework, and SURE-optimal patch-based bilateral filter (SPBF), where the bilateral filter parameters are optimized on small image patches. Using SURE guarantees automated parameter selection. The multiresolution and localized denoising in SMBF and SPBF, respectively, yield superior denoising performance when compared with the globally optimal SOBF. Experimental validations and comparisons show that the proposed denoisers perform on par with some state-of-the-art denoising techniques. (C) 2015 SPIE and IS&T
Resumo:
In this paper, we propose a technique for video object segmentation using patch seams across frames. Typically, seams, which are connected paths of low energy, are utilised for retargeting, where the primary aim is to reduce the image size while preserving the salient image contents. Here, we adapt the formulation of seams for temporal label propagation. The energy function associated with the proposed video seams provides temporal linking of patches across frames, to accurately segment the object. The proposed energy function takes into account the similarity of patches along the seam, temporal consistency of motion and spatial coherency of seams. Label propagation is achieved with high fidelity in the critical boundary regions, utilising the proposed patch seams. To achieve this without additional overheads, we curtail the error propagation by formulating boundary regions as rough-sets. The proposed approach out-perform state-of-the-art supervised and unsupervised algorithms, on benchmark datasets.
Resumo:
Real time anomaly detection is the need of the hour for any security applications. In this article, we have proposed a real time anomaly detection for H.264 compressed video streams utilizing pre-encoded motion vectors (MVs). The proposed work is principally motivated by the observation that MVs have distinct characteristics during anomaly than usual. Our observation shows that H.264 MV magnitude and orientation contain relevant information which can be used to model the usual behavior (UB) effectively. This is subsequently extended to detect abnormality/anomaly based on the probability of occurrence of a behavior. The performance of the proposed algorithm was evaluated and bench-marked on UMN and Ped anomaly detection video datasets, with a detection rate of 70 frames per sec resulting in 90x and 250x speedup, along with on-par detection accuracy compared to the state-of-the-art algorithms.
Resumo:
Imaging flow cytometry is an emerging technology that combines the statistical power of flow cytometry with spatial and quantitative morphology of digital microscopy. It allows high-throughput imaging of cells with good spatial resolution, while they are in flow. This paper proposes a general framework for the processing/classification of cells imaged using imaging flow cytometer. Each cell is localized by finding an accurate cell contour. Then, features reflecting cell size, circularity and complexity are extracted for the classification using SVM. Unlike the conventional iterative, semi-automatic segmentation algorithms such as active contour, we propose a noniterative, fully automatic graph-based cell localization. In order to evaluate the performance of the proposed framework, we have successfully classified unstained label-free leukaemia cell-lines MOLT, K562 and HL60 from video streams captured using custom fabricated cost-effective microfluidics-based imaging flow cytometer. The proposed system is a significant development in the direction of building a cost-effective cell analysis platform that would facilitate affordable mass screening camps looking cellular morphology for disease diagnosis. Lay description In this article, we propose a novel framework for processing the raw data generated using microfluidics based imaging flow cytometers. Microfluidics microscopy or microfluidics based imaging flow cytometry (mIFC) is a recent microscopy paradigm, that combines the statistical power of flow cytometry with spatial and quantitative morphology of digital microscopy, which allows us imaging cells while they are in flow. In comparison to the conventional slide-based imaging systems, mIFC is a nascent technology enabling high throughput imaging of cells and is yet to take the form of a clinical diagnostic tool. The proposed framework process the raw data generated by the mIFC systems. The framework incorporates several steps: beginning from pre-processing of the raw video frames to enhance the contents of the cell, localising the cell by a novel, fully automatic, non-iterative graph based algorithm, extraction of different quantitative morphological parameters and subsequent classification of cells. In order to evaluate the performance of the proposed framework, we have successfully classified unstained label-free leukaemia cell-lines MOLT, K562 and HL60 from video streams captured using cost-effective microfluidics based imaging flow cytometer. The cell lines of HL60, K562 and MOLT were obtained from ATCC (American Type Culture Collection) and are separately cultured in the lab. Thus, each culture contains cells from its own category alone and thereby provides the ground truth. Each cell is localised by finding a closed cell contour by defining a directed, weighted graph from the Canny edge images of the cell such that the closed contour lies along the shortest weighted path surrounding the centroid of the cell from a starting point on a good curve segment to an immediate endpoint. Once the cell is localised, morphological features reflecting size, shape and complexity of the cells are extracted and used to develop a support vector machine based classification system. We could classify the cell-lines with good accuracy and the results were quite consistent across different cross validation experiments. We hope that imaging flow cytometers equipped with the proposed framework for image processing would enable cost-effective, automated and reliable disease screening in over-loaded facilities, which cannot afford to hire skilled personnel in large numbers. Such platforms would potentially facilitate screening camps in low income group countries; thereby transforming the current health care paradigms by enabling rapid, automated diagnosis for diseases like cancer.