987 results for Video genre classification


Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper we present a convolutional neural network (CNN)-based model for human head pose estimation in low-resolution multi-modal RGB-D data. We pose the problem as one of classification of human gazing direction. We further fine-tune a regressor based on the learned deep classifier. Next we combine the two models (classification and regression) to estimate approximate regression confidence. We present state-of-the-art results in datasets that span the range of high-resolution human-robot interaction (close-up faces plus depth information) data to challenging low-resolution outdoor surveillance data. We build upon our robust head-pose estimation and further introduce a new visual attention model to recover interaction with the environment. Using this probabilistic model, we show that many higher-level scene understanding tasks, such as human-human/scene interaction detection, can be achieved. Our solution runs in real-time on commercial hardware.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper describes an analysis performed for facial description in static images and video streams. The still image context is first analyzed in order to decide the optimal classifier configuration for each problem: gender recognition, race classification, and glasses and moustache presence. These results are later applied to significant samples which are automatically extracted in real-time from video streams, achieving promising results in the facial description of 70 individuals by means of gender, race and the presence of glasses and moustache.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This master's thesis examines the identification of player types and player motivations in video games. Previous research offers many models of player types, but they have seldom been applied in practice in games. In this work, a systematic literature mapping of different player type models is carried out, on the basis of which several ways of classifying players are presented. In addition, a case study is carried out in which a player classification model is selected on the basis of the literature and tested in practice by identifying player types through data analytics in a real-time strategy game.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this thesis, we propose to infer pixel-level labelling in video by utilising only object category information, exploiting the intrinsic structure of video data. Our motivation is the observation that image-level labels are much easier to acquire than pixel-level labels, and it is natural to find a link between image-level recognition and pixel-level classification in video data, which would transfer learned recognition models from one domain to the other. To this end, this thesis proposes two domain adaptation approaches to adapt the deep convolutional neural network (CNN) image recognition model trained on labelled image data to the target domain, exploiting both semantic evidence learned from the CNN and the intrinsic structures of unlabelled video data. Our proposed approaches explicitly model and compensate for the domain adaptation from the source domain to the target domain, which in turn underpins a robust semantic object segmentation method for natural videos. We demonstrate the superior performance of our methods by presenting extensive evaluations on challenging datasets in comparison with state-of-the-art methods.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Image (video) retrieval is the problem of retrieving images (videos) similar to a query. Images (videos) are represented in an input (feature) space, and similar images (videos) are obtained by finding nearest neighbors in that representation space. Numerous input representations, in both real-valued and binary spaces, have been proposed for faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is the well-known problem of retrieving images of the same class as the query. In the first part, we address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class, ideally, to a unique binary code. We refer to the binary codes of the images as 'Semantic Binary Codes' and the unique code for all same-class images as the 'Class Binary Code'. We also propose a new class-based Hamming metric that dramatically reduces the retrieval times for larger databases, where the Hamming distance is computed only to the class binary codes. We also propose a deep semantic binary code model, obtained by replacing the output layer of a popular convolutional neural network (AlexNet) with the class binary codes, and show that the hashing functions learned in this way outperform the state of the art and at the same time provide fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationship between classes.
For a given query image, we want to retrieve images that preserve the relative order, i.e., we want to retrieve all same-class images first and then the related-class images before different-class images. We learn such relationship-aware binary codes by minimizing the discrepancy between the inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related-class retrieval results and show significant gains over the state of the art. High-dimensional descriptors like Fisher Vectors or Vector of Locally Aggregated Descriptors have been shown to improve the performance of many computer vision applications, including retrieval. In the third part, we discuss an unsupervised technique for compressing high-dimensional vectors into high-dimensional binary codes to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm, which is intractable for compressing high-dimensional vectors. A practical hierarchical model that utilizes divide-and-conquer techniques using the Random Select and Adjust (RSA) procedure to compress such high-dimensional vectors is presented. We show that our proposed high-dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods for higher compression ratios. In the last part of the thesis, we propose a retrieval-based solution to the zero-shot event classification problem, a setting where no training videos are available for the event.
To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute similarity between the query event and the video in the concept space and videos similar to the query event are classified as the videos belonging to the event. We show that we significantly boost the performance using concept features from other modalities.
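
The class-based Hamming lookup described in the first part of this abstract can be illustrated with a minimal sketch. It assumes each class already has one binary code and that a query image has already been hashed to a code of the same length; the class names, the 8-bit codes and the function names below are hypothetical and are not taken from the thesis.

```python
# Toy illustration: compare a query code against one code per class
# instead of against every database item. All codes here are made up.

def hamming_distance(a, b):
    """Number of bit positions in which two integer-encoded codes differ."""
    return bin(a ^ b).count("1")


def rank_classes(query_code, class_codes):
    """Return class names ordered from closest to farthest class code."""
    return sorted(class_codes,
                  key=lambda name: hamming_distance(query_code, class_codes[name]))


def nearest_class(query_code, class_codes):
    """Return the class whose code is closest to the query code."""
    return rank_classes(query_code, class_codes)[0]


# Hypothetical 8-bit class binary codes.
CLASS_CODES = {
    "cat": 0b10110010,
    "dog": 0b01001101,
    "car": 0b11100001,
}

# A query code that differs from the "cat" code in a single bit.
QUERY = 0b10110110

if __name__ == "__main__":
    print(nearest_class(QUERY, CLASS_CODES))
    print(rank_classes(QUERY, CLASS_CODES))
```

Because the distance is computed once per class rather than once per stored image, the cost of this step grows with the number of classes, which is the source of the speed-up the abstract claims for larger databases.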

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Automatic video segmentation plays a vital role in sports video annotation. This paper presents a fully automatic and computationally efficient algorithm for analysis of sports videos. Various methods of automatic shot boundary detection have been proposed to perform automatic video segmentation. These investigations mainly concentrate on detecting fades and dissolves for fast processing of the entire video scene without providing any additional feedback on object relativity within the shots. The goal of the proposed method is to identify regions that perform certain activities in a scene. The model uses some low-level feature video processing algorithms to extract the shot boundaries from a video scene and to identify dominant colours within these boundaries. An object classification method is used for clustering the seed distributions of the dominant colours into homogeneous regions. Using a simple tracking method, these regions are then classified as active or static. The efficiency of the proposed framework is demonstrated on a standard video benchmark with numerous types of sports events, and the experimental results show that our algorithm can be used with high accuracy for automatic annotation of active regions in sports videos.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Active video games are an emerging genre of electronic games that provide engaging exercise experiences by combining physical exertion with interactive game play. As such, they have attracted increased interest from health promotion professionals to reduce sedentary behavior, increase physical activity, and improve health outcomes such as body composition. However, their potential for enhancing the educational experience has not been extensively explored. This paper provides a brief overview of active video game research to date and outlines opportunities for future research. Specifically, we highlight the need to develop a conceptual framework to better understand the determinants, mediators, moderators, and consequences of active video gaming and integrate learning and health outcomes. We propose that active video games can be a key part of a wider “digital” supportive environment where education and health researchers and professionals work with, rather than against, video game technologies to promote learning and health.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper reports a robustness comparison of clustering-based multi-label classification methods versus non-clustering counterparts for multi-concept associated image and video annotations. In the experimental setting of this paper, we adopted six popular multi-label classification algorithms, two different base classifiers for problem-transformation-based multi-label classifications, and three different clustering algorithms for pre-clustering of the training data. We conducted experimental evaluation on two multi-label benchmark datasets: scene image data and mediamill video data. We also employed two multi-label classification evaluation metrics, namely micro F1-measure and Hamming loss, to present the predictive performance of the classifications. The results reveal that different base classifiers and clustering methods contribute differently to the performance of the multi-label classifications. Overall, the pre-clustering methods improve the effectiveness of multi-label classifications in certain experimental settings. This provides vital information to users when deciding which multi-label classification method to choose for multiple-concept associated image and video annotations.
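
For readers unfamiliar with the two evaluation metrics named in this abstract, the following is a small sketch of how micro F1 and Hamming loss are commonly computed on binary label-indicator matrices. The tiny label matrices are invented for illustration and have nothing to do with the scene or mediamill data; the function names are likewise assumptions.

```python
# Each row is one sample, each column one label (1 = label present).

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total


def micro_f1(y_true, y_pred):
    """F1 computed from counts pooled over all samples and all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1      # label present and predicted
            elif p:
                fp += 1      # predicted but not present
            elif t:
                fn += 1      # present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up ground truth and predictions: 2 samples, 3 labels.
Y_TRUE = [[1, 0, 1],
          [0, 1, 0]]
Y_PRED = [[1, 0, 0],
          [0, 1, 1]]

if __name__ == "__main__":
    print("Hamming loss:", hamming_loss(Y_TRUE, Y_PRED))
    print("micro F1:", micro_f1(Y_TRUE, Y_PRED))
```

Lower Hamming loss is better, while higher micro F1 is better, so the two metrics need to be read in opposite directions when comparing methods.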

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents our approach to identifying the profile of an unknown user based on the activities of known users. The aim of the author profiling task of PAN@CLEF 2016 is cross-genre identification of the gender and age of an unknown user. This means training the system on the behavior of different users from one social media platform and identifying the profile of other users on a different platform. Instead of using a single classifier to build the system, we used a combination of different classifiers, also known as stacking. This approach allowed us to explore the strength of all the classifiers and minimize the bias or error imposed by a single classifier.
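
The idea of stacking mentioned in this abstract can be sketched in a few lines: several base classifiers each make a prediction, and a meta-learner trained on those predictions makes the final call. The sketch below is only a toy under loud assumptions: the three rule-based base models, the feature names, the thresholds, the labels and the lookup-table meta-learner are all hypothetical, and the abstract does not say which classifiers or features the authors used.

```python
from collections import Counter, defaultdict

# Hypothetical base models: each maps a feature dict to a label.
def by_word_length(f):
    return "older" if f["avg_word_len"] > 4.5 else "younger"

def by_emoji_rate(f):
    return "younger" if f["emoji_rate"] > 0.05 else "older"

def by_sentence_length(f):
    return "older" if f["avg_sent_len"] > 15 else "younger"

BASE_MODELS = [by_word_length, by_emoji_rate, by_sentence_length]


def train_meta(base_models, held_out):
    """Meta-learner: map each tuple of base predictions to the label
    most often seen with it in held-out data."""
    votes = defaultdict(Counter)
    for features, label in held_out:
        key = tuple(model(features) for model in base_models)
        votes[key][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


def stacked_predict(base_models, meta_table, features):
    """Predict with the meta-learner; fall back to a majority vote
    for prediction combinations never seen during meta-training."""
    key = tuple(model(features) for model in base_models)
    if key in meta_table:
        return meta_table[key]
    return Counter(key).most_common(1)[0][0]


# Invented held-out examples used only to fit the meta-learner.
HELD_OUT = [
    ({"avg_word_len": 5.0, "emoji_rate": 0.01, "avg_sent_len": 18}, "older"),
    ({"avg_word_len": 4.0, "emoji_rate": 0.10, "avg_sent_len": 10}, "younger"),
    ({"avg_word_len": 5.0, "emoji_rate": 0.10, "avg_sent_len": 10}, "younger"),
    ({"avg_word_len": 4.0, "emoji_rate": 0.01, "avg_sent_len": 18}, "older"),
]
META_TABLE = train_meta(BASE_MODELS, HELD_OUT)

if __name__ == "__main__":
    user = {"avg_word_len": 4.8, "emoji_rate": 0.08, "avg_sent_len": 12}
    print(stacked_predict(BASE_MODELS, META_TABLE, user))
```

In a realistic system the base models would be trained classifiers and the meta-learner would typically be a model such as logistic regression fitted on their outputs; the lookup table here only keeps the example dependency-free.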