The influence of pre-processing on the estimation of readability of web documents
Date(s) | 2015 |
---|---|
Abstract | This paper investigates the effect that text pre-processing approaches have on the estimation of the readability of web pages. Previous work has highlighted readability as an important aspect of web search result personalisation. The most widely used text readability measures rely on surface-level characteristics of text, such as the length of words and sentences. We demonstrate that different tools for extracting text from web pages lead to very different estimations of readability. This has an important implication for search engines: personalisation strategies that consider users' reading ability may fail if text readability is estimated incorrectly. |
Format | application/pdf |
Identifier | |
Publisher | ACM |
Relation | http://eprints.qut.edu.au/91421/1/cikm2015_readability.pdf http://dl.acm.org/citation.cfm?id=2806613 DOI:10.1145/2806416.2806613 Palotti, João, Zuccon, Guido, & Hanbury, Allan (2015) The influence of pre-processing on the estimation of readability of web documents. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management, ACM, Melbourne, VIC, pp. 1763-1766. |
Rights | Copyright 2015 ACM |
Source | Faculty of Science and Technology; School of Information Systems |
Keywords | #Readability, Text pre-processing |
Type | Conference Paper |
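The abstract's central claim, that surface-level readability measures are sensitive to how text is extracted from a page, can be illustrated with a minimal sketch. It uses the Automated Readability Index (ARI), picked here only as a well-known surface-level formula; this record does not say which measures or extraction tools the paper evaluated. The two input strings are hypothetical extractions of the same page, and the sentence and word splitting rules are simplifying assumptions.

```python
import re


def automated_readability_index(text: str) -> float:
    """Compute ARI from character, word, and sentence counts.

    Tokenisation is deliberately naive (an assumption for illustration):
    sentences end at '.', '!' or '?', and words are alphanumeric runs.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


# Hypothetical output of an extractor that ends each HTML block with a period.
with_periods = "Home. About. Contact. The cat sat on the mat. It was a warm day."
# Hypothetical output of an extractor that simply concatenates text nodes.
without_periods = "Home About Contact The cat sat on the mat It was a warm day"

score_with = automated_readability_index(with_periods)
score_without = automated_readability_index(without_periods)
print(f"with periods:    {score_with:.2f}")
print(f"without periods: {score_without:.2f}")
```

Both strings contain the same words, so the characters-per-word term is identical; only the sentence count changes. Under these assumptions, the extraction without block-level periods is scored as a single long sentence and therefore receives a higher (harder) ARI grade.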