On the robust measurement of inflectional diversity


Author(s): Xanthos A.; Guex G.; Tuzzi A. (ed.); Benesova M. (ed.); Macutek J. (ed.)
Date(s)

2015

Abstract

Lexical diversity measures are notoriously sensitive to variations of sample size, and recent approaches to this issue typically involve the computation of the average variety of lexical units in random subsamples of fixed size. This methodology has been further extended to measures of inflectional diversity such as the average number of wordforms per lexeme, also known as the mean size of paradigm (MSP) index. In this contribution we argue that, while random sampling can indeed be used to increase the robustness of inflectional diversity measures, using a fixed subsample size is only justified under the hypothesis that the corpora that we compare have the same degree of lexematic diversity. In the more general case where they may have differing degrees of lexematic diversity, a more sophisticated strategy can and should be adopted. A novel approach to the measurement of inflectional diversity is proposed, aiming to cope not only with variations of sample size, but also with variations of lexematic diversity. The robustness of this new method is empirically assessed, and the results show that while there is still room for improvement, the proposed methodology considerably attenuates the impact of lexematic diversity discrepancies on the measurement of inflectional diversity.
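
As a rough illustration of the fixed-size baseline the abstract refers to (not the novel method the chapter proposes, which this record does not describe), the sketch below computes the MSP index as the number of distinct wordforms per distinct lexeme and averages it over random subsamples of a fixed size. The function names, the (wordform, lexeme) token representation, the default number of subsamples, and the choice of sampling without replacement are all assumptions made for this example.

```python
import random
from typing import List, Tuple

# Assumed representation: each token is a (wordform, lexeme) pair.
Token = Tuple[str, str]


def msp(tokens: List[Token]) -> float:
    """Mean size of paradigm: distinct wordforms per distinct lexeme."""
    lexemes = {lexeme for _, lexeme in tokens}
    if not lexemes:
        raise ValueError("cannot compute MSP of an empty sample")
    # Count each wordform once per lexeme it belongs to.
    paradigm_cells = {(lexeme, form) for form, lexeme in tokens}
    return len(paradigm_cells) / len(lexemes)


def subsampled_msp(tokens: List[Token], size: int,
                   n_samples: int = 100, seed: int = 0) -> float:
    """Average MSP over random subsamples of a fixed size.

    Assumption: subsamples are drawn without replacement.
    """
    if not 0 < size <= len(tokens):
        raise ValueError("subsample size must be in 1..len(tokens)")
    rng = random.Random(seed)  # seeded so runs are reproducible
    total = 0.0
    for _ in range(n_samples):
        total += msp(rng.sample(tokens, size))
    return total / n_samples


# Toy corpus: two lexemes with three and two wordforms respectively.
corpus = [
    ("walk", "walk"), ("walks", "walk"), ("walked", "walk"),
    ("cat", "cat"), ("cats", "cat"),
]
full_msp = msp(corpus)                      # 5 wordforms / 2 lexemes
avg_msp = subsampled_msp(corpus, size=3)    # averaged over 100 subsamples
```

Because every corpus is subsampled to the same number of tokens, this baseline controls for sample size only; per the abstract, it does not control for differences in lexematic diversity between corpora.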

Identifier

http://serval.unil.ch/?id=serval:BIB_EF5F6419C909

isbn:9783110420296 (Online) and 9783110419870 (Print)

doi:10.1515/9783110420296-020

http://www.degruyter.com/

reroid:R008294666

Language(s)

en

Publisher

Berlin: De Gruyter

Source

Recent Contributions to Quantitative Linguistics

Keywords

inflectional diversity; mean size of paradigm; MSP; RMSP; lexical diversity; robustness; random sampling

Type

info:eu-repo/semantics/bookPart

incollection