Robust Multimodal Person Identification With Limited Training Data


Author(s): McLaughlin, Niall; Ji, Ming; Crookes, Danny
Date(s)

01/03/2013

Abstract

This paper presents a novel method of audio-visual feature-level fusion for person identification where both the speech and facial modalities may be corrupted, and there is a lack of prior knowledge about the corruption. Furthermore, we assume that only a limited amount of training data is available for each modality (e.g., a short training speech segment and a single training facial image for each person). A new multimodal feature representation and a modified cosine similarity are introduced to combine and compare bimodal features with limited training data, as well as vastly differing data rates and feature sizes. Optimal feature selection and multicondition training are used to reduce the mismatch between training and testing, thereby making the system robust to unknown bimodal corruption. Experiments have been carried out on a bimodal dataset created from the SPIDRE speaker recognition database and the AR face recognition database, with variable noise corruption of the speech and occlusion in the face images. The system's speaker identification performance on the SPIDRE database and its facial identification performance on the AR database are comparable with results reported in the literature. Combining both modalities using the new method of multimodal fusion leads to significantly improved accuracy over the unimodal systems, even when both modalities have been corrupted. The new method also shows improved identification accuracy compared with bimodal systems based on multicondition model training or missing-feature decoding alone.
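As a rough illustration of what feature-level fusion with a cosine comparison can look like, the sketch below concatenates per-modality feature vectors and ranks enrolled persons by standard cosine similarity. It is not the paper's method: the abstract does not define the modified cosine similarity or the multimodal representation, so the plain cosine measure, the per-modality L2 normalization (used here so that modalities with different feature sizes contribute comparably), and all names and toy vectors are assumptions made for illustration only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse(speech_feat, face_feat):
    """Hypothetical feature-level fusion: normalize each modality, then concatenate.

    Normalizing per modality is an assumption, not taken from the paper; it stops
    the longer feature vector from dominating the comparison.
    """
    return l2_normalize(speech_feat) + l2_normalize(face_feat)


def cosine_similarity(a, b):
    """Standard (unmodified) cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(probe, gallery):
    """Return the enrolled identity whose fused template is most similar to the probe."""
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))


# Toy enrollment data: one fused template per person (values are made up).
gallery = {
    "alice": fuse([1.0, 0.0], [0.0, 1.0, 0.0]),
    "bob": fuse([0.0, 1.0], [1.0, 0.0, 0.0]),
}

# A slightly perturbed test sample that is closest to the first template.
probe = fuse([0.9, 0.1], [0.1, 0.8, 0.1])
best_match = identify(probe, gallery)
```

In this toy setup `best_match` names the template closest to the probe in both modalities. A system of the kind described in the abstract would additionally need feature selection and multicondition training to cope with corrupted inputs, which this sketch does not attempt.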

Format

application/pdf

Identifier

http://pure.qub.ac.uk/portal/en/publications/robust-multimodal-person-identification-with-limited-training-data(44df62c0-346d-461e-a424-46339233dcc5).html

http://dx.doi.org/10.1109/TSMCC.2012.2227959

http://pure.qub.ac.uk/ws/files/4126175/manuscript.pdf

Language(s)

eng

Rights

info:eu-repo/semantics/openAccess

Source

McLaughlin, N, Ji, M & Crookes, D 2013, 'Robust Multimodal Person Identification With Limited Training Data', IEEE Transactions on Human Machine Systems, vol 43, no. 2, 6461532, pp. 214-224. DOI: 10.1109/TSMCC.2012.2227959

Keywords

#/dk/atira/pure/subjectarea/asjc/1700/1705 #Computer Networks and Communications #/dk/atira/pure/subjectarea/asjc/3300/3307 #Human Factors and Ergonomics #/dk/atira/pure/subjectarea/asjc/1700/1711 #Signal Processing #/dk/atira/pure/subjectarea/asjc/1700/1702 #Artificial Intelligence #/dk/atira/pure/subjectarea/asjc/2200/2207 #Control and Systems Engineering #/dk/atira/pure/subjectarea/asjc/1700/1709 #Human-Computer Interaction #/dk/atira/pure/subjectarea/asjc/1700/1706 #Computer Science Applications

Type

article