Pattern-based topic models for information filtering


Author(s): Gao, Yang; Xu, Yue; Li, Yuefeng
Contributor(s)

Cambria, Erik

Chen, Ping

Date(s)

7 December 2013

Abstract

Topic modelling, such as Latent Dirichlet Allocation (LDA), was proposed to generate statistical models that represent multiple topics in a collection of documents, and it has been widely used in fields such as machine learning and information retrieval. However, its effectiveness in information filtering is largely unknown. Patterns are generally considered more representative than single terms for representing documents. In this paper, a novel information filtering model, the Pattern-based Topic Model (PBTM), is proposed. It represents text documents using not only topic distributions at a general level but also semantic pattern representations at a detailed, specific level, both of which contribute to accurate document representation and document relevance ranking. Extensive experiments are conducted to evaluate the effectiveness of PBTM using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model achieves outstanding performance.
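
The two-level idea in the abstract (topic distributions at the general level, patterns at the specific level) can be illustrated with a small sketch. This is not the authors' PBTM algorithm, which the abstract does not specify; it assumes that the words assigned to each topic are already available per training document, mines frequent word sets per topic with a naive counter, and ranks a new document by topic-weighted pattern matches. All names (`frequent_patterns`, `score_document`, `topic_transactions`, `topic_weights`) and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(transactions, min_support):
    """Return {pattern: support} for word sets of size 1 or 2 that occur
    in at least `min_support` transactions (naive counting, not Apriori)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in (1, 2):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {p: c for p, c in counts.items() if c >= min_support}


def score_document(doc_words, topic_patterns, topic_weights):
    """Score a document as the sum over topics of the topic weight times
    the total support of the topic's patterns fully contained in it."""
    words = set(doc_words)
    score = 0.0
    for topic, patterns in topic_patterns.items():
        matched = sum(s for p, s in patterns.items() if p <= words)
        score += topic_weights.get(topic, 0.0) * matched
    return score


# Hypothetical toy input: for each topic, the words a topic model assigned
# to that topic in each training document (one list per document).
topic_transactions = {
    0: [["market", "stock", "price"], ["stock", "price"], ["market", "stock"]],
    1: [["match", "team"], ["team", "goal"], ["team", "match", "goal"]],
}
# Hypothetical topic weights, standing in for a user's topic distribution.
topic_weights = {0: 0.7, 1: 0.3}

topic_patterns = {
    topic: frequent_patterns(tx, min_support=2)
    for topic, tx in topic_transactions.items()
}

relevant = score_document(["stock", "price", "rise"], topic_patterns, topic_weights)
irrelevant = score_document(["weather", "rain"], topic_patterns, topic_weights)
print(relevant > irrelevant)
```

In this sketch a document that contains frequent word sets of a heavily weighted topic scores higher than one that matches no patterns. A real system would obtain the topic assignments from a trained model such as LDA and would use a proper frequent-pattern miner.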

Format

application/pdf

Identifier

http://eprints.qut.edu.au/66686/

Relation

http://eprints.qut.edu.au/66686/1/Pattern-based_Topic_Models_for_Information_Filtering_.pdf

Gao, Yang, Xu, Yue, & Li, Yuefeng (2013) Pattern-based topic models for information filtering. In Cambria, Erik & Chen, Ping (Eds.) Sentiment Elicitation from Natural Text for Information Retrieval and Extraction (SENTIRE), 7 December 2013, Dallas, Texas.

Rights

Copyright 2013 Please consult the authors

Source

School of Electrical Engineering & Computer Science; Faculty of Science and Technology; Science & Engineering Faculty

Keywords

#080600 INFORMATION SYSTEMS #Topic modeling #Pattern mining #Information filtering #User interest

Type

Conference Paper