Author Profiling using SVMs and Word Embedding Averages — Notebook for PAN at CLEF 2016
Date(s) |
06/02/2017
06/02/2017
01/09/2016
|
---|---|
Abstract |
In this paper, we describe one of the approaches used in Universidade de Évora's participation. Our approach follows the usual pipeline in which text is preprocessed, features are extracted, and the features are then used to train SVMs with cross-validation. The main difference is that the features are averages of word embeddings, specifically word2vec vectors. Using the PAN 2016 dataset, we achieved 44.8% and 68.2% accuracy for English age and gender classification, respectively. We also achieved 51.3% and 67.1% accuracy for Spanish age and gender classification. Finally, we report 71.9% accuracy for Dutch age classification. Erasmus Mundus EMMA-WEST project |
Identifier |
Roy Bayot and Teresa Gonçalves. Author Profiling using SVMs and Word Embedding Averages — Notebook for PAN at CLEF 2016. In Krisztian Balog, Linda Cappellato, Nicola Ferro, and Craig Macdonald, editors, Working Notes of CLEF’2016 – Conference and Labs of the Evaluation forum, Évora, Portugal, 5-8 September, 2016, volume 1609, pages 815–823, Évora, PT, September 2016. CEUR. http://hdl.handle.net/10174/20667 |
Language(s) |
eng |
Publisher |
CEUR |
Rights |
openAccess |
Type |
article |
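
The feature-extraction step mentioned in the abstract (averaging word embeddings over a text) can be illustrated with a minimal sketch. This is not the authors' code: the `TOY_EMBEDDINGS` table, the `preprocess` and `average_embedding` helpers, the whitespace tokenization, and the zero-vector fallback for texts with no known words are all assumptions made for illustration. In practice the vectors would come from a trained word2vec model, and the resulting feature vectors would be passed to an SVM evaluated with cross-validation.

```python
from typing import Dict, List

# Toy 3-dimensional vectors; hypothetical values standing in for word2vec embeddings.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "good": [1.0, 0.0, 2.0],
    "morning": [3.0, 2.0, 0.0],
    "world": [2.0, 4.0, 4.0],
}


def preprocess(text: str) -> List[str]:
    """Lowercase and split on whitespace (a simplification of real preprocessing)."""
    return text.lower().split()


def average_embedding(
    tokens: List[str], embeddings: Dict[str, List[float]], dim: int
) -> List[float]:
    """Return the element-wise mean of the vectors of all known tokens.

    Tokens missing from `embeddings` are skipped. A text with no known
    tokens maps to the zero vector (an assumption of this sketch).
    """
    total = [0.0] * dim
    count = 0
    for token in tokens:
        vector = embeddings.get(token)
        if vector is None:
            continue  # out-of-vocabulary word
        for i, value in enumerate(vector):
            total[i] += value
        count += 1
    if count == 0:
        return total
    return [value / count for value in total]


# "everyone" is out of vocabulary, so only "good" and "morning" are averaged.
features = average_embedding(
    preprocess("Good morning everyone"), TOY_EMBEDDINGS, dim=3
)
print(features)  # [2.0, 1.0, 1.0]
```

Each text is reduced to a single fixed-length vector regardless of its length, which is what allows a standard classifier such as an SVM to be trained on the result.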