Classification of Texts' Authorship Using a Regression Model on Compressed Data
Date(s) |
20/07/2016
20/07/2016
2013
|
---|---|
Abstract |
2010 Mathematics Subject Classification: 68T50, 62H30, 62J05. An algorithm for text authorship identification is proposed. The procedure is based on Kolmogorov complexity and uses regression models on the lengths of the compressed texts. The classification employs the regression parameter estimates. Different combinations of compressor parameters and preliminary data processing are examined using prose texts by a few classic English authors. |
Identifier |
Pliska Studia Mathematica Bulgarica, Vol. 22, No. 1 (2013), pp. 25-32; ISSN 0204-9805 |
Language(s) |
en |
Publisher |
Institute of Mathematics and Informatics, Bulgarian Academy of Sciences |
Keywords | #Text authorship identification #Classification #Compression #Linear Regression |
Type |
Article |
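
The abstract describes regressing compressed length on text length and classifying by the fitted parameters. The sketch below is one possible reading of that idea, not the article's actual procedure. Several choices are assumptions made for illustration only: `zlib` as the compressor, growing prefixes as the regression sample, a simple least-squares line, and nearest-slope matching as the decision rule. The function names `compressed_lengths`, `fit_line`, `slope`, and `classify` are hypothetical.

```python
import zlib


def compressed_lengths(text, n_points=10, level=9):
    """Return (prefix_length, compressed_length) pairs for growing prefixes.

    Assumption: zlib stands in for whatever compressor the article used;
    `level` plays the role of a "compressor parameter".
    """
    data = text.encode("utf-8")
    step = max(1, len(data) // n_points)
    sizes = range(step, len(data) + 1, step)
    return [(n, len(zlib.compress(data[:n], level))) for n in sizes]


def fit_line(points):
    """Ordinary least-squares fit y = a + b * x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def slope(text):
    """Estimated growth rate of compressed length per input byte."""
    return fit_line(compressed_lengths(text))[1]


def classify(unknown, references):
    """Pick the author whose reference text has the closest slope.

    `references` maps an author label to a sample text. Nearest-slope
    matching is an illustrative decision rule, not the article's.
    """
    s = slope(unknown)
    return min(references, key=lambda author: abs(slope(references[author]) - s))
```

With real prose the slopes of different authors are much closer together than in a toy example, so a practical version would likely need longer texts, both regression parameters, and some preprocessing of the kind the abstract mentions.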