Robust bimodal person identification using face and speech with limited training data and corruption of both modalities


Author(s): McLaughlin, N.; Ji, Ming; Crookes, D.
Date(s)

01/01/2011

Abstract

This paper presents a novel method of audio-visual fusion for person identification where both the speech and facial modalities may be corrupted, and there is a lack of prior knowledge about the corruption. Furthermore, we assume there is a limited amount of training data for each modality (e.g., a short training speech segment and a single training facial image for each person). A new representation and a modified cosine similarity are introduced for combining and comparing bimodal features with limited training data as well as vastly differing data rates and feature sizes. Optimal feature selection and multicondition training are used to reduce the mismatch between training and testing, thereby making the system robust to unknown bimodal corruption. Experiments have been carried out on a bimodal data set created from the SPIDRE and AR databases with variable noise corruption of speech and occlusion in the face images. The new method has demonstrated improved recognition accuracy.

Format

application/pdf

Identifier

http://pure.qub.ac.uk/portal/en/publications/robust-bimodal-person-identification-using-face-and-speech-with-limited-training-data-and-corruption-of-both-modalities(4fada32a-8d59-4eaf-b5a1-b44a1eaa35ba).html

http://pure.qub.ac.uk/ws/files/16286083/McLaughlin_Interspeech_2011.pdf

http://www.scopus.com/inward/record.url?partnerID=yv4JPVwI&eid=2-s2.0-84865734209&md5=5e5f40531ca79cbbed0d1bd9384eadcb

Language(s)

eng

Rights

info:eu-repo/semantics/openAccess

Source

McLaughlin, N., Ji, M. & Crookes, D. 2011, Robust bimodal person identification using face and speech with limited training data and corruption of both modalities. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. pp. 585-588.

Keywords

#/dk/atira/pure/subjectarea/asjc/1200/1203 #Language and Linguistics #/dk/atira/pure/subjectarea/asjc/1700/1709 #Human-Computer Interaction #/dk/atira/pure/subjectarea/asjc/1700/1711 #Signal Processing #/dk/atira/pure/subjectarea/asjc/1700/1712 #Software #/dk/atira/pure/subjectarea/asjc/2600/2611 #Modelling and Simulation

Type

contributionToPeriodical