Using complex networks to quantify consistency in the use of words
Contributor(s) |
UNIVERSIDADE DE SÃO PAULO |
---|---|
Date(s) |
01/11/2013
01/11/2013
2012
|
Abstract |
In this paper we have quantified the consistency of word usage in written texts represented by complex networks, where words were taken as nodes, by measuring the degree of preservation of the node neighborhood. Words were considered highly consistent if the authors used them with the same neighborhood. When ranked according to the consistency of use, the words obeyed a log-normal distribution, in contrast to Zipf's law that applies to the frequency of use. Consistency correlated positively with the familiarity and frequency of use, and negatively with ambiguity and age of acquisition. An inspection of some highly consistent words confirmed that they are used in very limited semantic contexts. A comparison of consistency indices for eight authors indicated that these indices may be employed for author recognition. Indeed, as expected, authors of novels could be distinguished from those who wrote scientific texts. Our analysis demonstrated the suitability of the consistency indices, which can now be applied in other tasks, such as emotion recognition. Funding: FAPESP; CNPq (Brazil) |
Identifier |
JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, BRISTOL, v. 64, n. 2, supl. 4, Part 1-2, pp. 143-152, JAN, 2012; ISSN 1742-5468; http://www.producao.usp.br/handle/BDPI/37742; DOI 10.1088/1742-5468/2012/01/P01004 |
Language(s) |
eng |
Publisher |
IOP PUBLISHING LTD BRISTOL |
Relation |
JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT |
Rights |
restrictedAccess Copyright IOP PUBLISHING LTD |
Keywords | #DATA MINING (EXPERIMENT) #PATTERN FORMATION (EXPERIMENT) #RANDOM GRAPHS #NETWORKS #COMMUNICATION #SUPPLY AND INFORMATION NETWORKS #HUMAN LANGUAGE #INHERITANCE #CHARACTER #BIGRAMS #LENGTH #WORLD #MECHANICS #PHYSICS, MATHEMATICAL |
Type |
article original article publishedVersion |
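
The abstract describes consistency as the degree to which a word's network neighborhood is preserved across texts. The record does not give the paper's actual index, so the following is only a minimal sketch under stated assumptions: each text is turned into a word adjacency network (consecutive words are linked), and consistency is approximated by the mean pairwise Jaccard overlap of a word's neighbor sets. The Jaccard measure, the whitespace tokenization, and the function names `neighborhoods`, `jaccard`, and `consistency` are illustrative choices, not the authors' method.

```python
from itertools import combinations


def neighborhoods(text):
    """Map each word to the set of words adjacent to it in the text.

    Assumption: lowercase whitespace tokenization, undirected links
    between consecutive words (a simple word adjacency network).
    """
    words = text.lower().split()
    neigh = {}
    for left, right in zip(words, words[1:]):
        neigh.setdefault(left, set()).add(right)
        neigh.setdefault(right, set()).add(left)
    return neigh


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consistency(word, texts):
    """Mean pairwise Jaccard overlap of the word's neighborhoods.

    Assumption: this stands in for the paper's consistency index,
    which is not specified in this record. Returns None when the
    word occurs (with neighbors) in fewer than two texts.
    """
    sets = [neighborhoods(t).get(word, set()) for t in texts]
    sets = [s for s in sets if s]
    if len(sets) < 2:
        return None
    pairs = list(combinations(sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Toy corpus, invented purely for illustration.
texts = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "a cat sat near the door",
]

if __name__ == "__main__":
    for w in ("sat", "the"):
        print(w, consistency(w, texts))
```

Under this sketch, a word scores close to 1 when it keeps the same neighbors in every text and close to 0 when its neighbors change from text to text, which mirrors the abstract's notion of words "used with the same neighborhood".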