Multiple instance dictionary learning for activity representation


Author(s): Umakanthan, Sabanadesan; Denman, Simon; Fookes, Clinton; Sridharan, Sridha
Date(s)

2014

Abstract

This paper presents an effective feature representation method for activity recognition. Efficient and effective feature representation plays a crucial role not only in activity recognition, but also in a wide range of applications such as motion analysis, tracking, and 3D scene understanding. In activity recognition, local features are increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational requirements, their performance in real-world applications is still limited by a lack of contextual information and by models that are not tailored to specific activities. We propose a new activity representation framework to address the shortcomings of the popular but simple bag-of-words approach. In our framework, multiple instance SVM (mi-SVM) is first used to identify positive features for each action category, and the k-means algorithm is used to generate a codebook. Locality-constrained linear coding is then used to encode the features with the generated codebook, followed by spatio-temporal pyramid pooling to capture the spatio-temporal statistics. Finally, an SVM is used to classify the videos. Experiments on two popular datasets of varying complexity demonstrate a significant performance improvement over the baseline bag-of-features method.
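
The codebook and encoding stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy only, uses random vectors in place of real local video descriptors, uses the common k-nearest-neighbour approximation of locality-constrained linear coding, and replaces spatio-temporal pyramid pooling with a single max-pooling step over the whole video. The mi-SVM feature selection and the final SVM classifier are omitted. All function names and parameter values (kmeans_codebook, llc_encode, max_pool, n_words, n_neighbors, beta) are hypothetical.

```python
import numpy as np

def kmeans_codebook(features, n_words, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns an (n_words, dim) codebook."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), n_words, replace=False)
    centers = features[start].copy()
    for _ in range(n_iter):
        # Squared distance from every feature to every centre.
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(n_words):
            members = features[labels == j]
            if len(members):  # leave empty clusters where they are
                centers[j] = members.mean(axis=0)
    return centers

def llc_encode(features, codebook, n_neighbors=5, beta=1e-4):
    """Approximate LLC: reconstruct each feature from its nearest
    codewords, with coefficients constrained to sum to one."""
    codes = np.zeros((len(features), len(codebook)))
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :n_neighbors]
    for i, idx in enumerate(nearest):
        z = codebook[idx] - features[i]   # shift neighbours to the feature
        cov = z @ z.T                     # local covariance (k x k)
        # Regularise so the linear system is always solvable.
        cov += beta * max(np.trace(cov), 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(cov, np.ones(n_neighbors))
        codes[i, idx] = w / w.sum()       # enforce the sum-to-one constraint
    return codes

def max_pool(codes):
    """Pool all codes of one video into a single vector
    (assumption: one global cell instead of a spatio-temporal pyramid)."""
    return codes.max(axis=0)

# Stand-in data: 200 random 16-dimensional descriptors for one video.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(200, 16))
codebook = kmeans_codebook(descriptors, n_words=8)
codes = llc_encode(descriptors, codebook, n_neighbors=3)
video_vector = max_pool(codes)
```

In a fuller pipeline, video_vector would be computed per pyramid cell and the concatenated result passed to a classifier; the choice of max pooling here is an assumption, since the abstract does not name the pooling operator.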

Format

application/pdf

Identifier

http://eprints.qut.edu.au/92904/

Publisher

IEEE

Relation

http://eprints.qut.edu.au/92904/1/ICPR2014_2.pdf

DOI:10.1109/ICPR.2014.246

Umakanthan, Sabanadesan, Denman, Simon, Fookes, Clinton, & Sridharan, Sridha (2014) Multiple instance dictionary learning for activity representation. In 22nd International Conference on Pattern Recognition (ICPR 2014), IEEE, Stockholm, Sweden, pp. 1377-1382.

Rights

Copyright 2014 IEEE

Source

School of Electrical Engineering & Computer Science; Science & Engineering Faculty

Keywords

#080104 Computer Vision; #080109 Pattern Recognition and Data Mining; #090609 Signal Processing

Type

Conference Paper