Prediction of age, sentiment, and connectivity from social media text


Author(s): Nguyen, Thin; Phung, Dinh; Adams, Brett; Venkatesh, Svetha
Contributor(s)

Bouguettaya, Athman

Hauswirth, Manfred

Liu, Ling

Date(s)

01/01/2011

Abstract

Social media corpora, including the textual output of blogs, forums, and messaging applications, provide fertile ground for linguistic analysis: material diverse in topic and style, and at Web scale. We investigate manifest properties of textual messages, including latent topics, psycholinguistic features, and author mood, of a large corpus of blog posts, to analyze the impact of age, emotion, and social connectivity. These properties are found to be significantly different across the examined cohorts, which suggests discriminative features for a number of useful classification tasks. We build binary classifiers with high performance for old versus young bloggers, social versus solo bloggers, and happy versus sad posts. Analysis of discriminative features shows that age turns upon choice of topic, whereas sentiment orientation is evidenced by linguistic style. Good prediction is achieved for social connectivity using topic and linguistic features, leaving tagged mood a modest role in all classifications.

Identifier

http://hdl.handle.net/10536/DRO/DU:30044670

Language(s)

eng

Publisher

Springer-Verlag

Relation

http://dro.deakin.edu.au/eserv/DU:30044670/phung-predictionofage-2011.pdf

http://dro.deakin.edu.au/eserv/DU:30044670/phung-predictionofage-evidence-2011.pdf

http://dx.doi.org/10.1007/978-3-642-24434-6_17

Rights

2011, Springer-Verlag Berlin Heidelberg

Keywords

#binary classifiers #bloggers #classification tasks #discriminative features #linguistic analysis #linguistic features #social media

Type

Conference Paper