Biblioteca Digital

Deep Head Pose: Gaze-Direction Estimation in Multimodal Video

**Author(s):** Mukherjee, Sankha S.; Robertson, Neil Martin
Date(s)	01/11/2015
Abstract	In this paper we present a convolutional neural network (CNN)-based model for human head pose estimation in low-resolution multi-modal RGB-D data. We pose the problem as one of classification of human gazing direction. We further fine-tune a regressor based on the learned deep classifier. Next we combine the two models (classification and regression) to estimate approximate regression confidence. We present state-of-the-art results in datasets that span the range of high-resolution human-robot interaction (close-up faces plus depth information) data to challenging low-resolution outdoor surveillance data. We build upon our robust head-pose estimation and further introduce a new visual attention model to recover interaction with the environment. Using this probabilistic model, we show that many higher-level scene understanding tasks, such as human-human/scene interaction detection, can be achieved. Our solution runs in real time on commercial hardware.
Identifier	http://pure.qub.ac.uk/portal/en/publications/deep-head-pose-gazedirection-estimation-in-multimodal-video(4405d052-1f67-48d5-a00f-4bf4bbe38d40).html http://dx.doi.org/10.1109/TMM.2015.2482819
Language(s)	eng
Rights	info:eu-repo/semantics/openAccess
Source	Mukherjee, S S & Robertson, N M 2015, 'Deep Head Pose: Gaze-Direction Estimation in Multimodal Video', IEEE Transactions on Multimedia, vol. 17, no. 11, pp. 2094-2107. DOI: 10.1109/TMM.2015.2482819
Keywords	#Convolutional neural networks (CNNs), deep learning, gaze direction, head-pose, RGB-D
Type	article