Biblioteca Digital

923 results for hate speech

Comparison of Image Transform-Based Features for Visual Speech Recognition in Clean and Corrupted Videos

Relevance:

20.00%

Abstract:

We present the results of a study of the performance of several image transform-based feature types for speaker-independent visual speech recognition of isolated digits. This includes the first reported use of features extracted using a discrete curvelet transform. The study compares several methods for selecting features of each type and shows the relative benefits of static and dynamic visual features. The features are tested on clean video data and on video data corrupted in a variety of ways, to assess each feature type's robustness to potential real-world conditions. One of the test conditions involves a novel form of video corruption we call jitter, which simulates camera and/or head movement during recording.
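The abstract does not say how jitter is generated, so the following is only a minimal sketch under one assumption: that jitter can be approximated by translating each frame by a small random offset. The function names, the `max_shift` parameter, and the zero padding are hypothetical and are not taken from the paper.

```python
import random


def shift_frame(frame, dx, dy, fill=0):
    """Translate a 2D frame (a list of pixel rows) by (dx, dy) pixels.

    Pixels shifted in from outside the frame are set to `fill`.
    """
    height, width = len(frame), len(frame[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = frame[src_y][src_x]
    return shifted


def add_jitter(frames, max_shift=2, seed=0):
    """Apply an independent random translation to every frame.

    Assumption: jitter is modelled as a per-frame shift of at most
    `max_shift` pixels in each direction; the paper may define it differently.
    """
    rng = random.Random(seed)
    jittered = []
    for frame in frames:
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        jittered.append(shift_frame(frame, dx, dy))
    return jittered
```

With `max_shift=0` the frames are returned unchanged, which gives a simple sanity check before the corrupted video is passed to a feature extractor.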

Feature extraction for speech and music discrimination

Relevance:

20.00%

Speech enhancement in noisy environments for video retrieval

Relevance:

20.00%

A Speech Based Approach to Surveillance Video Retrieval

Relevance:

20.00%

Hidden Conditional Random Fields for Visual Speech Recognition

Relevance:

20.00%

Abstract:

In this paper, we present the application of Hidden Conditional Random Fields (HCRFs) to modelling speech for visual speech recognition. HCRFs may be easily adapted to model long-range dependencies across an observation sequence. As a result, visual word recognition performance can be improved, as the model is able to take a more contextual approach to generating state sequences. Results are presented from a speaker-dependent, isolated-digit visual speech recognition task, using comparisons with a baseline HMM system. First, we show that word recognition rates on clean video using HCRFs can be improved by increasing the number of past and future observations taken into account by each state. Second, we compare model performance under various levels of video compression on the test set. As far as we are aware, this is the first attempted use of HCRFs for visual speech recognition.
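The idea of letting each state see a number of past and future observations can be illustrated with a simple windowing step. This is a sketch of the windowing only, not of an HCRF. The function name, the `past` and `future` parameters, and the choice to repeat the first and last observation at the sequence edges are assumptions made for illustration.

```python
def context_windows(observations, past=1, future=1):
    """Return, for each time step, the observations in [t - past, t + future].

    Indices that fall outside the sequence are clamped to the nearest
    valid index, so edge windows repeat the first or last observation.
    """
    last = len(observations) - 1
    windows = []
    for t in range(len(observations)):
        window = []
        for offset in range(-past, future + 1):
            idx = min(max(t + offset, 0), last)
            window.append(observations[idx])
        windows.append(window)
    return windows


# Example: one past and one future observation per time step.
print(context_windows([10, 20, 30], past=1, future=1))
# [[10, 10, 20], [10, 20, 30], [20, 30, 30]]
```

Increasing `past` and `future` widens the context available at each time step, which corresponds to the variable the abstract reports changing.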

Maximizing the continuity in segmentation - a new approach to model, segment and recognize speech

Relevance:

20.00%

Combining noise compensation and missing-feature decoding for large vocabulary speech recognition in noise

Relevance:

20.00%

Modeling long-range dependencies in speech data for text-independent speaker recognition

Relevance:

20.00%

Exploiting Partial Information for Robust Speech and Speaker Recognition

Relevance:

20.00%

Hate Crime Against People with Disabilities: A Baseline Study of Experiences in Northern Ireland

Relevance:

20.00%

A Quadrant Model for the Study of Speech and Thought Presentation

Relevance:

20.00%

Data Processing: Imaging of speech data

Relevance:

20.00%

Inter-Frame Contextual Modelling For Visual Speech Recognition

Relevance:

20.00%

Abstract:

In this paper, we present a new approach to visual speech recognition which improves contextual modelling by combining Inter-Frame Dependent and Hidden Markov Models. This approach captures contextual information in visual speech that may be lost using a Hidden Markov Model alone. We apply contextual modelling to a large speaker-independent isolated digit recognition task, and compare our approach to two commonly adopted feature-based techniques for incorporating speech dynamics. Results are presented from baseline feature-based systems and the combined modelling technique. We show that both of these techniques achieve similar levels of performance when used independently. However, significant improvements in performance can be achieved through a combination of the two. In particular, we report a relative Word Error Rate improvement in excess of 17% in comparison to our best baseline system.
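A relative Word Error Rate improvement is conventionally the reduction in WER divided by the baseline WER, that is, (WER_baseline - WER_new) / WER_baseline. The sketch below shows that arithmetic. The function name and the example WER values are hypothetical and are not results from the paper.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical values for illustration only: a baseline WER of 20.0%
# and a combined-system WER of 16.0%.
reduction = relative_wer_reduction(20.0, 16.0)
print(f"{reduction:.1%}")  # 20.0%
```

Note that a relative reduction differs from an absolute one: in the hypothetical example the absolute reduction is 4 percentage points, while the relative reduction is 20%.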

Prosodic and related features that signify emotional colouring in conversational speech.

Relevance:

20.00%

Emotion in speech: Towards an integration of linguistic, paralinguistic, and psychological analysis

Relevance:

20.00%
