Learning multi-faceted activities from heterogeneous data with the product space hierarchical Dirichlet processes


Author(s): Nguyen, Thanh-Bing; Nguyen, Vu; Venkatesh, Svetha; Phung, Dinh
Contributor(s)

Cao, Huiping

Li, Jinyan

Wang, Ruili

Date(s)

01/01/2015

Abstract

The hierarchical Dirichlet process (HDP) was originally designed and evaluated for a single data channel. In this paper we enhance its ability to model heterogeneous data by using a richer structure for the base measure, namely a product space. The enhanced model, called Product Space HDP (PS-HDP), can (1) simultaneously model heterogeneous data from multiple sources in a Bayesian nonparametric framework and (2) discover multilevel latent structures in the data, yielding different types of topics/latent structures that can be explained jointly. We experiment with the MDC dataset, a large real-world dataset collected from mobile phones. Our goal is to discover identity–location–time (a.k.a. who-where-when) patterns at different levels (globally for all groups and locally for each group). We analyze the activities and patterns learned by our model, visualizing them and comparing and contrasting them with the ground truth to demonstrate the merit of the proposed framework. We further evaluate its performance quantitatively using standard metrics including F1-score, NMI, RI, and purity. We also compare the performance of the PS-HDP model with that of popular existing clustering methods (including K-Means, NNMF, GMM, DP-Means, and AP). Lastly, we demonstrate the ability of the model to learn activities with missing data, a common problem in pervasive and ubiquitous computing applications.

Identifier

http://hdl.handle.net/10536/DRO/DU:30085932

Language(s)

eng

Publisher

Springer

Relation

http://dro.deakin.edu.au/eserv/DU:30085932/nguyen-learningmulti-evid-2015.pdf

http://www.dx.doi.org/10.1007/978-3-319-42996-0_11

Rights

2015, Springer

Keywords #artificial intelligence #data mining and knowledge discovery #health informatics
Type

Book Chapter