14 results for Dance videos

at Indian Institute of Science - Bangalore - India


Relevance: 20.00%

Abstract:

This work proposes a boosting-based transfer learning approach for head-pose classification from multiple low-resolution views. Head-pose classification performance is adversely affected when the source (training) and target (test) data arise from different distributions (due to changes in face appearance, lighting, etc.). Under such conditions, we employ Xferboost, a Logitboost-based transfer learning framework that integrates knowledge from a few labeled target samples with the source model to effectively minimize misclassifications on the target data. Experiments confirm that the Xferboost framework can improve classification performance by up to 6% when knowledge is transferred between the CLEAR and FBK four-view head-pose datasets.

Relevance: 20.00%

Abstract:

With the increasing availability of wearable cameras, research on first-person view videos (egocentric videos) has received much attention recently. While some effort has been devoted to collecting various egocentric video datasets, there has not been a focused effort to assemble one that captures the diversity and complexity of activities related to life-logging, which is expected to be an important application for egocentric videos. In this work, we first conduct a comprehensive survey of existing egocentric video datasets. We observe that existing datasets do not emphasize activities relevant to the life-logging scenario. We build an egocentric video dataset dubbed LENA (Life-logging EgoceNtric Activities) (http://people.sutd.edu.sg/~1000892/dataset), which includes egocentric videos of 13 fine-grained activity categories, recorded under diverse situations and environments using Google Glass. Activities in LENA can also be grouped into 5 top-level categories to meet the varied needs of activity analysis research. We evaluate state-of-the-art activity recognition using LENA in detail and also analyze the performance of popular descriptors in egocentric activity recognition.

Relevance: 20.00%

Abstract:

In this paper, we propose an anomaly detection algorithm based on the Histogram of Oriented Motion Vectors (HOMV) [1] in a sparse representation framework. Usual behavior is learned at each location by sparsely representing the HOMVs over learned normal feature bases obtained using an online dictionary learning algorithm. Finally, an anomaly is detected based on the likelihood of the occurrence of sparse coefficients at that location. The proposed approach is found to be robust compared to existing methods, as demonstrated in experiments on the UCSD Ped1 and UCSD Ped2 datasets.

Relevance: 10.00%

Abstract:

The medieval icons of southern India are among the most acclaimed Indian artistic innovations, especially those of the Chola Tamil kingdom (9th–10th centuries), which is best known for the Hindu iconography of the Dance of Siva that captured the imagination of master sculptor Rodin [1]. Apart from these prolific images, however, not much was known about southern Indian copper-based metallurgy. Hence, these often spectacular castings have been regarded as a sudden efflorescence, almost without precedent, of skilled metallurgy as contrasted with tin-rich China or southeast Asia, for instance, where a developed copper-bronze tradition has been better appreciated.

Relevance: 10.00%

Abstract:

With the availability of a huge amount of video data from various sources, efficient video retrieval tools are increasingly in demand. Because video is multi-modal data, the perception of "relevance" between the user-provided query video (in Query-By-Example video search) and the retrieved video clips is subjective. We present an efficient video retrieval method that takes the user's feedback on the relevance of retrieved videos and iteratively reformulates the input query feature vectors (QFV) for improved video retrieval. The QFV reformulation is done by a simple but powerful feature weight optimization method based on the Simultaneous Perturbation Stochastic Approximation (SPSA) technique. A video retrieval system with video indexing, searching and relevance feedback (RF) phases is built to demonstrate the performance of the proposed method. The query and database videos are indexed using conventional video features such as color and texture. However, we use comprehensive and novel feature representations and a spatio-temporal distance measure to retrieve the top M videos that are similar to the query. In the feedback phase, the user's iterative feedback on the previously retrieved videos is used to automatically reformulate the QFV weights (a measure of importance) so that they reflect the user's preference. We observe that a few iterations of such feedback are generally sufficient for retrieving the desired video clips. The novel application of SPSA-based RF to user-oriented feature weight optimization distinguishes the proposed method from existing ones. The experimental results show that the proposed RF-based video retrieval exhibits good performance.
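
The SPSA-based weight update described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `spsa_minimize`, the gain constants `a` and `c`, and the quadratic stand-in loss in the usage note are assumptions made for illustration. In the retrieval system the loss would be derived from the user's relevance feedback. The decay exponents 0.602 and 0.101 are the values commonly recommended for SPSA gain sequences.

```python
import random


def spsa_minimize(loss, theta, iterations=200, a=0.1, c=0.1, seed=0):
    """Minimize loss(theta) with SPSA, using two loss evaluations per step."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / k ** 0.602  # decaying step size
        c_k = c / k ** 0.101  # decaying perturbation size
        # Perturb all weights at once with a random +/-1 vector.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Gradient estimate for weight i is diff / (2 * c_k * delta[i]).
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta
```

A hypothetical call would be `spsa_minimize(lambda w: sum((x - y) ** 2 for x, y in zip(w, [1.0, 0.5, 0.2])), [0.0, 0.0, 0.0])`, which moves the three feature weights toward the stand-in target. Only two loss evaluations are needed per iteration regardless of the number of weights, which is what makes SPSA attractive when each evaluation is costly.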

Relevance: 10.00%

Abstract:

We describe a novel method for human activity segmentation and interpretation in surveillance applications based on Gabor filter-bank features. A complex human activity is modeled as a sequence of elementary human actions such as walking, running, jogging, boxing and hand-waving. Since a human silhouette can be modeled by a set of rectangles, the elementary human actions can be modeled as a sequence of sets of rectangles with different orientations and scales. The activity segmentation is based on Gabor filter-bank features and normalized spectral clustering. The feature trajectories of an action category are learned from training example videos using dynamic time warping. The combined segmentation and recognition processes are very efficient, as both algorithms share the same framework and the Gabor features computed for the former can be reused for the latter. We also propose a simple shadow detection technique to extract good silhouettes, which are necessary for accurate action recognition.
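
The dynamic time warping step mentioned in this abstract can be sketched as follows. This is a minimal illustration on 1-D sequences, assuming an absolute-difference local cost. The function name `dtw_distance` is hypothetical, and the paper's trajectories would be Gabor feature vectors rather than scalars.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Because the alignment may repeat samples, a trajectory and a slower copy of it, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, are at distance zero.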

Relevance: 10.00%

Abstract:

The objective of this work is to develop a systematic methodology for describing hand postures and grasps that is independent of the kinematics and geometry of the hand model and can therefore be used to develop a universal referencing scheme. It is therefore necessary that the scheme be general enough to describe the continuum of hand poses. The traditional Indian classical dance form "Bharathanatyam" uses 28 single-handed gestures, called "mudras". A mudra can be perceived as a hand posture with a specific pattern of finger configurations. Using modifiers, complex mudras can be constructed from relatively simple mudras. An adjacency matrix is constructed to describe the relationship among mudras. Various mudra transitions can be obtained from the graph associated with this matrix. Using this matrix, a hierarchy of the mudras is formed. A set of base mudras and modifiers is used to describe how one simple hand posture can be transformed into another, relatively complex one. A canonical set of predefined hand postures and modifiers can be used in digital human modeling to develop standard hand posture libraries.
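
Reading transitions off the graph associated with an adjacency matrix can be sketched with a breadth-first search. The posture labels `A` to `D` and the matrix entries in the usage note are placeholders, not the mudra relationships from the paper, and the function name `transition_path` is hypothetical.

```python
from collections import deque


def transition_path(adjacency, names, start, goal):
    """Shortest chain of postures from start to goal, or None if unreachable."""
    index = {name: i for i, name in enumerate(names)}
    previous = {index[start]: None}  # also serves as the visited set
    queue = deque([index[start]])
    while queue:
        node = queue.popleft()
        if node == index[goal]:
            path = []
            while node is not None:
                path.append(names[node])
                node = previous[node]
            return path[::-1]
        # A nonzero entry in row `node` marks a posture reachable by one modifier.
        for neighbor, connected in enumerate(adjacency[node]):
            if connected and neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None
```

With `names = ["A", "B", "C", "D"]` and a matrix holding the directed edges A to B, B to C and A to D, the call `transition_path(adjacency, names, "A", "C")` returns the chain through B, while the reverse query returns `None` because the edges are directed from simpler to more complex postures.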

Relevance: 10.00%

Abstract:

Non-Identical Duplicate video detection is a challenging research problem. Non-Identical Duplicate videos are pairs of videos that are not exactly identical but are almost similar. In this paper, we evaluate two methods, keyframe-based and tomography-based, for detecting Non-Identical Duplicate videos. Both methods use the existing scale-invariant feature transform (SIFT) method to find matches: between key frames in the first method, and between cross-sections through the temporal axis of the videos in the second. We provide extensive experimental results and an analysis of the accuracy and efficiency of the two methods on a data set of Non-Identical Duplicate video pairs.

Relevance: 10.00%

Abstract:

The assignment of tasks to multiple resources becomes an interesting game-theoretic problem when both the task owner and the resources are strategic. In the classical, nonstrategic setting, where the states of the tasks and resources are observable by the controller, this problem is that of finding an optimal policy for a Markov decision process (MDP). When the states are held by strategic agents, the problem of efficient task allocation extends beyond solving an MDP and becomes one of designing a mechanism. Motivated by this fact, we propose a general mechanism that decides on an allocation rule for the tasks and resources and a payment rule to incentivize agents' participation and truthful reports. In contrast to related dynamic strategic control problems studied in recent literature, the problem studied here has interdependent values: the benefit of an allocation to the task owner is not simply a function of the characteristics of the task itself and the allocation, but also of the state of the resources. We introduce a dynamic extension of Mezzetti's two-phase mechanism for interdependent valuations. In this changed setting, the proposed dynamic mechanism is efficient, within-period ex-post incentive compatible, and within-period ex-post individually rational.

Relevance: 10.00%

Abstract:

We consider the problem of extracting a signature representation of similar entities using covariance descriptors. Covariance descriptors can efficiently represent objects and are robust to scale and pose changes. We posit that covariance descriptors corresponding to similar objects share a common geometrical structure that can be extracted through joint diagonalization. We term this diagonalizing matrix the Covariance Profile (CP). The CP can be used to measure the distance of a novel object to an object set through the diagonality measure. We demonstrate how the CP can be employed on images as well as videos, for applications such as face recognition and object-track clustering.
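
As background for this abstract, a covariance descriptor is the sample covariance matrix of the feature vectors collected over a region. The sketch below shows only that building block. The joint diagonalization that yields the Covariance Profile is not shown, and the function name `covariance_descriptor` and the choice of per-pixel features are assumptions.

```python
def covariance_descriptor(features):
    """Sample covariance matrix of a list of equal-length feature vectors.

    Requires at least two feature vectors (divides by n - 1).
    """
    n = len(features)
    d = len(features[0])
    means = [sum(f[i] for f in features) / n for i in range(d)]
    return [
        [
            sum((f[i] - means[i]) * (f[j] - means[j]) for f in features) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
```

The descriptor has a fixed size of d by d regardless of how many pixels the region contains, so regions of different sizes can be compared directly.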

Relevance: 10.00%

Abstract:

Real-time object tracking is a critical task in many computer vision applications. Achieving rapid and robust tracking while handling changes in object pose and size, varying illumination and partial occlusion is challenging given limited computational resources. In this paper we propose a real-time object tracker in an ℓ1 framework that addresses these issues. In the proposed approach, dictionaries containing templates of overlapping object fragments are created. The candidate fragments are sparsely represented in the dictionary fragment space by solving the ℓ1-regularized least squares problem. The nonzero coefficients indicate the relative motion between the target and candidate fragments, along with a fidelity measure. The final object motion is obtained by fusing the reliable motion information. The dictionary is updated based on the object likelihood map. The proposed tracking algorithm is tested on various challenging videos and found to outperform earlier approaches.
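
The ℓ1-regularized least squares problem named in this abstract, minimizing 0.5 * ||y - Dx||² + λ * ||x||₁ over x, can be solved in several ways. The sketch below uses iterative soft-thresholding (ISTA), a standard solver chosen here for brevity. The abstract does not say which solver the tracker uses, and the names `ista` and `soft_threshold` are hypothetical. The step size is assumed to be at most the reciprocal of the largest eigenvalue of DᵀD.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def ista(D, y, lam, step, iterations=100):
    """Sparse code x of y over dictionary D (a list of rows) using ISTA."""
    rows, cols = len(D), len(D[0])
    x = [0.0] * cols
    for _ in range(iterations):
        # Residual D x - y and gradient D^T (D x - y) of the least squares term.
        residual = [
            sum(D[r][c] * x[c] for c in range(cols)) - y[r] for r in range(rows)
        ]
        grad = [sum(D[r][c] * residual[r] for r in range(rows)) for c in range(cols)]
        # Gradient step followed by the shrinkage that produces sparsity.
        x = [soft_threshold(x[c] - step * grad[c], step * lam) for c in range(cols)]
    return x
```

For an identity dictionary the solution is the soft-thresholded observation, so small entries of `y` are set exactly to zero. This is the mechanism by which only a few dictionary fragments end up with nonzero coefficients.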

Relevance: 10.00%

Abstract:

Advertising is ubiquitous in the online community, and more so on the ever-growing and popular online video delivery websites (e.g., YouTube). Video advertising is becoming increasingly popular on these websites. In addition to the existing pre-roll/post-roll advertising and contextual advertising, this paper proposes an in-stream video advertising strategy, Computational Affective Video-in-Video Advertising (CAVVA). Humans, being emotional creatures, are driven by emotions as well as rational thought. We believe that emotions play a major role in influencing the buying behavior of users and hence propose a video advertising strategy that takes into account the emotional impact of the videos as well as the advertisements. Given a video and a set of advertisements, we identify candidate advertisement insertion points (step 1) and also identify the suitable advertisements (step 2) according to theories from marketing and consumer psychology. We formulate this two-part problem as a single optimization function in a non-linear 0-1 integer programming framework and provide a genetic-algorithm-based solution. We evaluate CAVVA using a subjective user study and an eye-tracking experiment. Through these experiments, we demonstrate that CAVVA achieves a good balance between the seemingly conflicting goals of (a) minimizing the user disturbance caused by advertisement insertion and (b) enhancing user engagement with the advertising content. We compare our method with existing advertising strategies and show that CAVVA can enhance the user's experience and also help increase the monetization potential of the advertising content.

Relevance: 10.00%

Abstract:

In this paper, we propose a simple and effective approach to classifying H.264 compressed videos by capturing orientation information from the motion vectors. Our major contribution involves computing the Histogram of Oriented Motion Vectors (HOMV) for partially overlapping hierarchical space-time cubes. HOMV is found to be very effective in describing the motion characteristics of these cubes. We then use a Bag of Features (BoF) approach to represent the video as a histogram of HOMV keywords, obtained using k-means clustering. The video feature thus computed is found to be very effective in classifying videos. We demonstrate our results with experiments on two large publicly available video databases.
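
The idea of a histogram of oriented motion vectors can be sketched for a single space-time cube as follows. This is an illustration under stated assumptions, not the paper's exact descriptor: the magnitude weighting, the L1 normalization, the default of 8 bins and the function name `homv` are all choices made here, and the hierarchical overlapping cubes are not modeled.

```python
import math


def homv(motion_vectors, n_bins=8):
    """Magnitude-weighted orientation histogram of (dx, dy) motion vectors."""
    hist = [0.0] * n_bins
    bin_width = 2 * math.pi / n_bins
    for dx, dy in motion_vectors:
        magnitude = math.hypot(dx, dy)
        if magnitude == 0.0:
            continue  # zero vectors carry no orientation
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        index = min(int(angle / bin_width), n_bins - 1)
        hist[index] += magnitude
    total = sum(hist)
    # Normalize so cubes with different amounts of motion are comparable.
    return [h / total for h in hist] if total > 0 else hist
```

In a bag-of-features pipeline such as the one the abstract describes, histograms like this one would be clustered with k-means, and each video would then be represented by the counts of its nearest cluster centers.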

Relevance: 10.00%

Abstract:

Regions in video streams that attract human interest contribute significantly to human understanding of the video. Predicting salient and informative Regions of Interest (ROIs) through a sequence of eye movements is a challenging problem. Applications such as content-aware retargeting of videos to different aspect ratios while preserving informative regions, and smart insertion of dialog (closed-caption text) into the video stream, can be significantly improved using the predicted ROIs. We propose an interactive human-in-the-loop framework to model eye movements and predict visual saliency in yet-unseen frames. Eye tracking and video content are used to model visual attention in a manner that accounts for important eye-gaze characteristics such as temporal discontinuities due to sudden eye movements, noise, and behavioral artifacts. A novel statistical and algorithm-based method, gaze buffering, is proposed for eye-gaze analysis and its fusion with content-based features. Our robust saliency prediction is instantiated for two challenging and exciting applications. The first application alters video aspect ratios on the fly using content-aware video retargeting, thus making them suitable for a variety of display sizes. The second application dynamically localizes active speakers and places dialog captions on the fly in the video stream. Our method ensures that dialogs are faithful to active speaker locations and do not interfere with salient content in the video stream. Our framework naturally accommodates personalization of the application to suit the biases and preferences of individual users.