Document clustering algorithms, representations and evaluation for information retrieval


Author(s): De Vries, Christopher M.
Date(s)

2014

Abstract

This thesis presents new methods for classification and thematic grouping of billions of web pages, at scales not previously achievable. This process is also known as document clustering, where similar documents are automatically assigned to clusters that represent distinct topics. These automatically discovered topics are in turn used to improve search engine performance by searching only the topics deemed relevant to a particular user query.

Format

application/pdf

Identifier

http://eprints.qut.edu.au/75862/

Publisher

Queensland University of Technology

Relation

http://eprints.qut.edu.au/75862/1/Christopher_De%20Vries_Thesis.pdf

De Vries, Christopher M. (2014) Document clustering algorithms, representations and evaluation for information retrieval. PhD by Publication, Queensland University of Technology.

Source

School of Electrical Engineering & Computer Science; Science & Engineering Faculty

Keywords

#document clustering #representations #evaluation #information retrieval #algorithms #clustering #hashing #signatures #efficiency #machine learning

Type

Thesis