Humboldt-Universität zu Berlin - Sprach- und literaturwissenschaftliche Fakultät - Institut für Slawistik und Hungarologie

Talk by Jiří Milička (Slavic Linguistics Colloquium)

Fri, 24 June, 12:15-13:45: Jiří Milička (Prague): Measuring lexical diversity: The influence of lemmatization

There is no consensus on how lexical diversity should be measured: there are dozens of metrics and several methods of text-length normalization. While these options are widely discussed in the literature, little is known about the influence of lemmatization. Because recent studies suggest that lexical diversity indices computed on lemmatized texts better match intuitive, subjective notions of lexical diversity (Jarvis & Hashimoto, 2021), we explore the topic in both English (representing analytic languages) and Czech (representing morphologically rich Slavic languages).