A two-stage approach for generating topic models
Contributor(s) |
Pei, Jian; Tseng, Vincent S.; Cao, Longbing; Motoda, Hiroshi; Xu, Guandong |
---|---|
Date(s) |
01/04/2013
|
Abstract |
Topic modeling is widely used in information retrieval, text mining, and text classification. Most existing statistical topic modeling methods, such as LDA and pLSA, represent a topic by selecting single words from the multinomial word distribution over that topic. This term-based representation has two main shortcomings: first, popular or common words occur across many different topics, which makes the topics ambiguous; second, single words lack the coherent semantic meaning needed to represent topics accurately. To overcome these problems, we propose a two-stage model that combines text mining and pattern mining with statistical modeling to generate more discriminative and semantically rich topic representations. Experiments show that the optimized topic representations generated by the proposed methods outperform those of the typical statistical topic modeling method LDA in terms of accuracy and certainty. |
Format |
application/pdf |
Identifier | |
Publisher |
Springer Berlin Heidelberg |
Relation |
http://eprints.qut.edu.au/60325/1/A_Two-stage_Approach_for_Generating_Topic_Models.pdf ; http://link.springer.com/chapter/10.1007%2F978-3-642-37456-2_19 ; DOI:10.1007/978-3-642-37456-2_19 ; Xu, Yue, Gao, Yang, Li, Yuefeng, & Liu, Bin (2013) A two-stage approach for generating topic models. In Pei, Jian, Tseng, Vincent S., Cao, Longbing, Motoda, Hiroshi, & Xu, Guandong (Eds.) Lecture Notes in Computer Science: Advances in Knowledge Discovery and Data Mining, Springer Berlin Heidelberg, Gold Coast Convention and Exhibition Centre, Gold Coast, QLD, pp. 221-232. |
Rights |
Copyright 2013 Springer-Verlag Berlin Heidelberg |
Source |
School of Electrical Engineering & Computer Science; Science & Engineering Faculty |
Keywords | #080201 Analysis of Algorithms and Complexity #080603 Conceptual Modelling #Topic modeling #Topic representation #Tf-idf #Frequent pattern mining #Entropy |
Type |
Conference Paper |
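
Note on the abstract: the record does not describe how the second stage works, so the following is only a minimal sketch of one plausible reading of "pattern mining" applied to topic-model output. It assumes, hypothetically, that the words a topic model assigns to one topic in each document are treated as a transaction, and that word sets occurring in enough transactions form the topic representation. The function name `frequent_patterns`, the `min_support` and `max_size` parameters, and the toy data are all illustrative and not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(transactions, min_support, max_size=2):
    """Return word sets (up to max_size words) found in at least min_support transactions.

    Brute-force counting for illustration only; this is not the authors' algorithm.
    """
    counts = Counter()
    for words in transactions:
        unique = sorted(set(words))  # ignore repeats and fix the order inside each pattern
        for size in range(1, max_size + 1):
            for pattern in combinations(unique, size):
                counts[pattern] += 1
    return {p: c for p, c in counts.items() if c >= min_support}


# Hypothetical toy data: for a single topic, the words assigned to that
# topic in each of four documents (one transaction per document).
topic_transactions = [
    ["data", "mining", "pattern"],
    ["data", "mining", "frequent"],
    ["pattern", "mining", "data"],
    ["data", "model"],
]

patterns = frequent_patterns(topic_transactions, min_support=3)
for pattern, support in sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0])):
    print(pattern, support)
```

In this toy case the pair ("data", "mining") survives alongside the single words, which illustrates the abstract's point that multi-word patterns can carry more coherent meaning than isolated terms.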