382 results for Word and image


Relevance: 80.00%

Abstract:

Clustering identities in a video is a useful task to aid in video search, annotation and retrieval, and cast identification. However, reliably clustering faces across multiple videos is a challenging task due to variations in the appearance of the faces, as videos are captured in an uncontrolled environment. A person's appearance may vary due to session variations, including lighting and background changes, occlusions, and changes in expression and make-up. In this paper we propose the novel Local Total Variability Modelling (Local TVM) approach to cluster faces across a news video corpus, and incorporate this into a novel two-stage video clustering system. We first cluster faces within a single video using colour, spatial and temporal cues, after which we use face track modelling and hierarchical agglomerative clustering to cluster faces across the entire corpus. We compare different face recognition approaches within this framework. Experiments on a news video database show that the Local TVM technique is able to effectively model the session variation observed in the data, resulting in improved clustering performance, with much greater computational efficiency than other methods.
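The hierarchical agglomerative clustering step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the 2-D points stand in for face-track features, and the average-linkage rule, the Euclidean distance, the stopping `threshold` and the function names are all assumptions made for illustration.

```python
import math


def average_linkage(cluster_a, cluster_b):
    """Mean Euclidean distance between all pairs drawn from the two clusters."""
    total = sum(math.dist(a, b) for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative_cluster(points, threshold):
    """Merge the closest pair of clusters until none is closer than threshold.

    `points` are hypothetical feature vectors (stand-ins for face tracks).
    """
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        distance, i, j = best
        if distance > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

For example, `agglomerative_cluster([(0, 0), (0, 1), (10, 10), (10, 11)], threshold=3)` groups the two nearby pairs into two clusters and stops before merging them.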

Relevance: 80.00%

Abstract:

In this paper we investigate the effectiveness of class-specific sparse codes in the context of discriminative action classification. The bag-of-words representation is widely used in activity recognition to encode features, and although it yields state-of-the-art performance with several feature descriptors, it still suffers from large quantization errors, which reduce overall performance. Recently proposed sparse representation methods have been shown to effectively represent features as a linear combination of atoms from an overcomplete dictionary by minimizing the reconstruction error. In contrast to most sparse representation methods, which focus on Sparse-Reconstruction based Classification (SRC), this paper focuses on discriminative classification using an SVM by constructing class-specific sparse codes for motion and appearance separately. Experimental results demonstrate that separate motion-specific and appearance-specific sparse coefficients provide the most effective and discriminative representation for each class compared to a single set of class-specific sparse coefficients.
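The quantization error attributed to the bag-of-words representation can be made concrete with a small sketch. It only illustrates hard assignment to the nearest codeword; the codebook, the Euclidean distance and the function names are assumptions, and it does not reproduce the paper's sparse coding method.

```python
import math


def quantize(feature, codebook):
    """Return (index, error) of the codeword nearest to the feature vector."""
    distances = [math.dist(feature, word) for word in codebook]
    index = min(range(len(codebook)), key=distances.__getitem__)
    return index, distances[index]


def bag_of_words(features, codebook):
    """Histogram of nearest-codeword assignments plus mean quantization error."""
    histogram = [0] * len(codebook)
    total_error = 0.0
    for feature in features:
        index, error = quantize(feature, codebook)
        histogram[index] += 1      # each feature votes for exactly one codeword
        total_error += error       # distance lost by snapping to that codeword
    return histogram, total_error / len(features)
```

Every feature is replaced by a single codeword, so whatever lies between codewords is discarded. A sparse code instead approximates the feature with a weighted combination of several dictionary atoms, which is why it can reach a lower reconstruction error.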

Relevance: 80.00%

Abstract:

This paper presents 'vSpeak', the first initiative taken in Pakistan for ICT-enabled conversion of dynamic Sign Urdu gestures into natural language sentences. To realize this, vSpeak adopts a novel approach to feature extraction using edge detection and image compression, which provides the input to the Artificial Neural Network that recognizes the gesture. This technique also handles blurred images. Training and testing are currently being performed on a dataset of 200 patterns of 20 words from Sign Urdu, with a target accuracy of 90% and above.

Relevance: 80.00%

Abstract:

‘Every face on Vanity Fair’s Hollywood covers 1995-2008’ renders an ethnographic-like study of Hollywood celebrity as a cinematic experience. Viewers are presented with constantly mutating portraits that violently twist and shear into other faces, while an immersive soundscape echoes the turbulent painterly surface. Through technical processes of scaling, looping and image morphing, the work explores a positive affectual response to the seductive power of celebrity imagery. Conceptually, given Vanity Fair magazine’s prestigious stature, the work also performs an ethnographic mapping of the popularity of Hollywood stars over time, while at the same time creating in-between, ‘mutant’ versions of their visages. The installation explores the potential for fan-based responses to pop culture to lead to artworks that enable a more critical response to the subjective and intersubjective dynamics of celebrity portraiture. Questions are raised about how these cultural forms impact pop culture fans, and their role in the mapping of culture and social experience.

Relevance: 80.00%

Abstract:

This paper presents a novel crop detection system applied to the challenging task of field sweet pepper (capsicum) detection. The field-grown sweet pepper crop presents several challenges for robotic systems, such as the high degree of occlusion and the fact that the crop can have a similar colour to the background (green on green). To overcome these issues, we propose a two-stage system that performs per-pixel segmentation followed by region detection. The output of the segmentation is used to search for highly probable regions, which are then declared to be sweet pepper. We propose the novel use of the local binary pattern (LBP) to perform crop segmentation. This feature improves the accuracy of crop segmentation from an AUC of 0.10, for previously proposed features, to 0.56. Using the LBP feature as the basis for our two-stage algorithm, we are able to detect 69.2% of field-grown sweet peppers in three sites. This is an impressive result given that the average detection accuracy of people viewing the same colour imagery is 66.8%.
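As a rough illustration of the local binary pattern feature named in the abstract, the sketch below computes the basic 8-neighbour LBP code for each interior pixel of a greyscale image. The abstract does not say which LBP variant, neighbourhood or bit ordering the authors use, so the 3x3 neighbourhood, the clockwise ordering and the function names here are assumptions.

```python
def lbp_code(image, row, col):
    """Return the 8-neighbour local binary pattern code for one interior pixel."""
    centre = image[row][col]
    # Neighbours visited clockwise, starting from the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the centre.
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_image(image):
    """Compute LBP codes for every interior pixel of a 2-D list of intensities."""
    rows, cols = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]
```

Because each code depends only on whether neighbours are brighter or darker than the centre, it describes local texture and not absolute colour, which is one plausible reason such a feature helps when crop and background are both green.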

Relevance: 80.00%

Abstract:

In a world consumed by quests for happiness and personal growth, Grant Stevens’ new exhibition, Dark Mess, delves into the psychic troubles that sometimes lie below the canopy. Working predominantly with video, photography and installation, Stevens’ practice explores how the verbal and non-verbal languages of popular screen culture interface with contemporary subjectivity. This exhibition continues Stevens’ interest in the natural environment as a catalyst and proxy for introspection and self-discovery. Pushing sound and image to distortion, Dark Mess offers a disquieting journey through the undergrowth.

Relevance: 80.00%

Abstract:

Scene understanding has been investigated mainly from a visual information point of view. Recently, depth has provided an extra wealth of information, allowing more geometric knowledge to be fused into scene understanding. Yet to form a holistic view, especially in robotic applications, one can create even more data by interacting with the world. In fact, humans, when growing up, seem to heavily investigate the world around them by haptic exploration. We show an application of haptic exploration on a humanoid robot in cooperation with a learning method for object segmentation. The actions performed consecutively improve the segmentation of objects in the scene.