Supervised latent Dirichlet allocation models for efficient activity representation


Author(s): Umakanthan, Sabanadesan; Denman, Simon; Fookes, Clinton B.; Sridharan, Sridha
Contributor(s)

Bouzerdoum, Abdesselam

Wang, Lei

Ogunbona, Philip

Date(s)

2014

Abstract

Local spatio-temporal features combined with a Bag-of-visual-words model are a popular approach to human action recognition. Bag-of-features methods face several challenges, such as extracting appropriate appearance and motion features from videos, converting the extracted features into a form suitable for classification, and designing a suitable classification framework. In this paper we address the problem of efficiently representing the extracted features for classification in order to improve overall performance. We introduce two generative supervised topic models, maximum entropy discrimination LDA (MedLDA) and class-specific simplex LDA (css-LDA), to encode the raw features in a form suitable for discriminative SVM-based classification. Unsupervised LDA models disconnect topic discovery from the classification task and hence yield poor results compared to the baseline Bag-of-words framework. Supervised LDA techniques, on the other hand, learn the topic structure by taking the class labels into account and improve recognition accuracy significantly. MedLDA jointly maximizes the likelihood and the within-class margins using max-margin techniques, yielding a sparse, highly discriminative topic structure, while css-LDA learns separate class-specific topics instead of a common set of topics across the entire dataset. In our representation, topics are learned first and each video is then represented as a topic proportion vector, comparable to a histogram of topics. Finally, SVM classification is performed on the learned topic proportion vectors. We demonstrate the effectiveness of these two representation techniques through experiments on two popular datasets. Experimental results show significantly improved performance compared to the baseline Bag-of-features framework, which uses k-means to construct histograms of words from the feature vectors.
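The representation pipeline described in the abstract can be sketched as follows. MedLDA and css-LDA are not available in standard libraries, so this minimal sketch substitutes scikit-learn's unsupervised LatentDirichletAllocation as a stand-in for the supervised topic models; the input arrays (video_descriptors, labels) and all parameter values are hypothetical, not taken from the paper.

    # Minimal sketch: visual-word counts -> topic proportion vectors -> SVM.
    # Uses unsupervised LDA as a placeholder for the paper's MedLDA / css-LDA.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.svm import SVC

    def bag_of_words_counts(video_descriptors, n_words=500):
        # video_descriptors: list of (n_i, d) arrays of local spatio-temporal
        # features, one array per video. Quantise with k-means (the baseline).
        all_desc = np.vstack(video_descriptors)
        codebook = KMeans(n_clusters=n_words, n_init=10).fit(all_desc)
        counts = np.zeros((len(video_descriptors), n_words))
        for i, desc in enumerate(video_descriptors):
            words = codebook.predict(desc)
            counts[i] = np.bincount(words, minlength=n_words)
        return counts

    def topic_proportion_features(counts, n_topics=30):
        # Each video becomes a topic proportion vector, i.e. a histogram of topics.
        lda = LatentDirichletAllocation(n_components=n_topics)
        return lda.fit_transform(counts)

    # counts = bag_of_words_counts(video_descriptors)
    # theta = topic_proportion_features(counts)
    # clf = SVC(kernel='linear').fit(theta, labels)  # SVM on topic proportions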

Identifier

http://eprints.qut.edu.au/82220/

Publisher

IEEE

Relation

DOI:10.1109/DICTA.2014.7008130

Umakanthan, Sabanadesan, Denman, Simon, Fookes, Clinton B., & Sridharan, Sridha (2014) Supervised latent Dirichlet allocation models for efficient activity representation. In Bouzerdoum, Abdesselam, Wang, Lei, & Ogunbona, Philip (Eds.) Proceedings of the 2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA), IEEE, Wollongong, NSW, pp. 1-6.

Rights

Copyright 2014 by IEEE

Source

School of Electrical Engineering & Computer Science; Science & Engineering Faculty

Keywords #Feature extraction #Gesture recognition #Image classification #Image representation #Statistical analysis #Support vector machines #Unsupervised learning
Type

Conference Paper