The ranking based constrained document clustering method and its application to social event detection
Contributor(s) |
Bhowmick, Sourav; Dyreson, Curtis; Jensen, Christian; Lee, Mong; Muliantara, Agus; Thalheim, Bernhard |
---|---|
Date(s) |
21/04/2014
|
Abstract |
With the growing size and variety of social media files on the web, it is becoming critical to organize them efficiently into clusters for further processing. This paper presents a novel scalable constrained document clustering method that harnesses the power of search engines capable of dealing with large text data. Instead of calculating the distance between each document and every cluster centroid, a neighborhood of best cluster candidates is chosen using a document ranking scheme. To make the method faster and less memory dependent, in-memory and in-database processing are combined in a semi-incremental manner. This method has been extensively tested in the social event detection application. Empirical analysis shows that the proposed method is efficient in both computation and memory usage while producing notable accuracy. |
Identifier | |
Publisher |
Springer International Publishing |
Relation |
DOI:10.1007/978-3-319-05813-9_4 Sutanto, Taufik & Nayak, Richi (2014) The ranking based constrained document clustering method and its application to social event detection. Lecture Notes in Computer Science, 8422, pp. 47-60. |
Rights |
Copyright 2014 Springer International Publishing Switzerland |
Source |
School of Electrical Engineering & Computer Science; Institute for Creative Industries and Innovation; Science & Engineering Faculty |
Keywords | #080109 Pattern Recognition and Data Mining #080604 Database Management #080704 Information Retrieval and Web Search #constrained clustering #ranking #social event detection |
Type |
Journal Article |
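
The candidate-selection idea described in the abstract (rank clusters first, then compare a document only against the top-ranked few) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `RankedClusterer`, the parameters `k_candidates` and `threshold`, and the term-overlap score that stands in for a search engine's ranking are all assumptions made for illustration. The clustering constraints and the in-memory/in-database combination mentioned in the abstract are not modelled here.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class RankedClusterer:
    """Hypothetical incremental clusterer (illustrative assumption only)."""

    def __init__(self, k_candidates=2, threshold=0.3):
        self.k = k_candidates            # size of the candidate neighborhood
        self.threshold = threshold       # minimum similarity to join a cluster
        self.centroids = []              # one Counter of summed term counts per cluster
        self.index = defaultdict(set)    # inverted index: term -> cluster ids
        self.comparisons = 0             # number of full cosine computations

    def candidates(self, doc):
        """Rank clusters by term overlap via the inverted index; keep the top k."""
        scores = Counter()
        for term, w in doc.items():
            for cid in self.index.get(term, ()):
                scores[cid] += w * self.centroids[cid][term]
        return [cid for cid, _ in scores.most_common(self.k)]

    def add(self, tokens):
        """Assign a tokenized document to a cluster and return the cluster id."""
        doc = Counter(tokens)
        best, best_sim = None, 0.0
        # Only the ranked candidates are compared, not every centroid.
        for cid in self.candidates(doc):
            self.comparisons += 1
            sim = cosine(doc, self.centroids[cid])
            if sim > best_sim:
                best, best_sim = cid, sim
        if best is None or best_sim < self.threshold:
            best = len(self.centroids)   # open a new cluster
            self.centroids.append(Counter())
        self.centroids[best].update(doc)
        for term in doc:
            self.index[term].add(best)
        return best
```

In this sketch the inverted index plays the role that a search engine would play at scale: a document that shares no terms with a cluster never triggers a similarity computation against that cluster's centroid.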