Robust bimodal person identification using face and speech with limited training data and corruption of both modalities
Date
01/01/2011
Abstract
This paper presents a novel method of audio-visual fusion for person identification where both the speech and facial modalities may be corrupted, and there is a lack of prior knowledge about the corruption. Furthermore, we assume there is a limited amount of training data for each modality (e.g., a short training speech segment and a single training facial image for each person). A new representation and a modified cosine similarity are introduced for combining and comparing bimodal features with limited training data as well as vastly differing data rates and feature sizes. Optimal feature selection and multicondition training are used to reduce the mismatch between training and testing, thereby making the system robust to unknown bimodal corruption. Experiments have been carried out on a bimodal data set created from the SPIDRE and AR databases with variable noise corruption of speech and occlusion in the face images. The new method has demonstrated improved recognition accuracy.
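The abstract describes fusing per-modality features and comparing them with a modified cosine similarity. As a minimal illustrative sketch (not the paper's actual formulation), one can compute a standard cosine similarity per modality and combine the scores with a weight `w`; the function names, the weighting scheme, and the toy feature vectors below are all assumptions for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Standard cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def bimodal_score(speech_test, face_test, speech_ref, face_ref, w=0.5):
    """Fuse speech and face similarities into one identification score.

    The speech and face vectors may have different dimensionalities,
    since each modality is compared only against its own reference.
    The linear weighting by w is a simplifying assumption, not the
    paper's modified similarity measure.
    """
    s_speech = cosine_similarity(speech_test, speech_ref)
    s_face = cosine_similarity(face_test, face_ref)
    return w * s_speech + (1.0 - w) * s_face

# Toy example with made-up feature vectors of differing sizes:
score = bimodal_score(
    speech_test=[0.9, 0.1, 0.2],
    face_test=[0.5, 0.5],
    speech_ref=[1.0, 0.0, 0.3],
    face_ref=[0.6, 0.4],
)
```

In a score-fusion design like this, `w` could be tuned on held-out data, or lowered for a modality suspected of corruption (e.g., noisy speech or an occluded face).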
Format
application/pdf |
Identifier
http://pure.qub.ac.uk/ws/files/16286083/McLaughlin_Interspeech_2011.pdf |
Language(s)
eng |
Rights
info:eu-repo/semantics/openAccess |
Source
McLaughlin, N., Ji, M. & Crookes, D. 2011, 'Robust bimodal person identification using face and speech with limited training data and corruption of both modalities', in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp. 585-588.
Keywords
Language and Linguistics (ASJC 1203); Human-Computer Interaction (ASJC 1709); Signal Processing (ASJC 1711); Software (ASJC 1712); Modelling and Simulation (ASJC 2611)
Type
contributionToPeriodical |