Deep Head Pose: Gaze-Direction Estimation in Multimodal Video
Data(s) |
01/11/2015
|
---|---|
Resumo |
In this paper we present a convolutional neuralnetwork (CNN)-based model for human head pose estimation inlow-resolution multi-modal RGB-D data. We pose the problemas one of classification of human gazing direction. We furtherfine-tune a regressor based on the learned deep classifier. Next wecombine the two models (classification and regression) to estimateapproximate regression confidence. We present state-of-the-artresults in datasets that span the range of high-resolution humanrobot interaction (close up faces plus depth information) data tochallenging low resolution outdoor surveillance data. We buildupon our robust head-pose estimation and further introduce anew visual attention model to recover interaction with theenvironment. Using this probabilistic model, we show thatmany higher level scene understanding like human-human/sceneinteraction detection can be achieved. Our solution runs inreal-time on commercial hardware |
Identificador | |
Idioma(s) |
eng |
Direitos |
info:eu-repo/semantics/openAccess |
Fonte |
Mukherjee , S S & Robertson , N M 2015 , ' Deep Head Pose: Gaze-Direction Estimation in Multimodal Video ' IEEE Transactions on Multimedia , vol 17 , no. 11 , pp. 2094-2107 . DOI: 10.1109/TMM.2015.2482819 |
Palavras-Chave | #Convolutional neural networks (CNNs), deep learning, gaze direction, head-pose, RGB-D |
Tipo |
article |