Automatic distinction of Fernando Pessoas’ heteronyms


Author(s): Teixeira, João F.; Couto, Marco
Date(s)

2015

Abstract

Text Mining has opened a vast array of possibilities concerning automatic information retrieval from large amounts of text documents. A variety of themes and types of documents can be easily analyzed. More complex features such as those used in Forensic Linguistics can gather deeper understanding from the documents, making it possible to perform difficult tasks such as author identification. In this work we explore the capabilities of simpler Text Mining approaches to author identification of unstructured documents, in particular the ability to distinguish poetic works from two of Fernando Pessoa's heteronyms: Álvaro de Campos and Ricardo Reis. Several processing options were tested and accuracies of 97% were reached, which encourages further developments.

Identifier

Teixeira, J. F., & Couto, M. (2015) Automatic distinction of Fernando Pessoas’ heteronyms. Vol. 9273. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 783-788).

978-3-319-23485-4

978-3-319-23484-7

http://hdl.handle.net/1822/40553

10.1007/978-3-319-23485-4_78

Language(s)

eng

Publisher

Springer Verlag

Relation

http://link.springer.com/chapter/10.1007/978-3-319-23485-4_78

Rights

info:eu-repo/semantics/restrictedAccess

Keywords #Authorship Classification #Machine Learning #SVM #Text Mining
Type

info:eu-repo/semantics/conferenceObject