Effective pattern discovery for text mining


Author(s): Zhong, Ning; Li, Yuefeng; Wu, Sheng-Tang
Date(s)

2010

Abstract

Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopt term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based (or phrase-based) approaches should perform better than term-based ones, but many experiments have not supported this hypothesis. This paper presents an innovative technique, effective pattern discovery, which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on the RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance.

Format

application/pdf

Identifier

http://eprints.qut.edu.au/42066/

Publisher

IEEE

Relation

http://eprints.qut.edu.au/42066/1/42066.pdf

DOI:10.1109/TKDE.2010.211

Zhong, Ning, Li, Yuefeng, & Wu, Sheng-Tang (2010) Effective pattern discovery for text mining. IEEE Transactions on Knowledge and Data Engineering, 24(1), pp. 30-44.

http://purl.org/au-research/grants/ARC/DP0988007

Rights

Copyright 2010 IEEE

Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Source

Faculty of Science and Technology

Keywords

#080600 INFORMATION SYSTEMS #Communities, Computational modeling, Databases, Electronic mail, Noise measurement, Text mining, data mining, information filtering, pattern evolving, pattern mining

Type

Journal Article