Van Huyssteen, Gerhard B. 2021. “Stats calculators: Word frequency classes.” https://gerhard.pro/software/stats-calculators-word-frequency-classes/.
Here I provide two calculators to determine word frequency classes: one computes a relative frequency class (N) based on Perkuhn et al. (2012), and the other computes a value on a logarithmic Zipfian scale (Z) based on Van Heuven et al. (2014). Both calculators take as input the frequency of the word (or multiword item), F(n), in a corpus. The N calculator also requires:
- the frequency of the most frequent word F(m) in that corpus.
The Zipfian calculator also requires:
- the number of word tokens F(N) in the corpus; and
- the number of word types F(V) in the corpus.
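To make the inputs above concrete, here is a minimal Python sketch of both measures. It is not the code behind the calculators on this page. It assumes the commonly cited definitions: N = floor(log2(F(m) / F(n)) + 0.5) for the relative frequency class, and Z = log10((F(n) + 1) / ((F(N) + F(V)) / 1,000,000)) + 3 for the Zipf value with Laplace smoothing. The function and parameter names are hypothetical and chosen only for illustration.

```python
import math


def frequency_class(word_freq: int, max_freq: int) -> int:
    """Relative frequency class N (assumed formula, after Perkuhn et al. 2012).

    word_freq is F(n); max_freq is F(m), the frequency of the most
    frequent word in the corpus. Class 0 is the most frequent word itself;
    each further class is roughly half as frequent as the previous one.
    """
    if word_freq <= 0 or max_freq < word_freq:
        raise ValueError("expected 0 < word_freq <= max_freq")
    return math.floor(math.log2(max_freq / word_freq) + 0.5)


def zipf_value(word_freq: int, tokens: int, types: int) -> float:
    """Zipf value Z (assumed formula, after Van Heuven et al. 2014).

    word_freq is F(n), tokens is F(N), types is F(V). Adding 1 to the
    count and the number of types to the corpus size is Laplace smoothing,
    so a word with a count of 0 still receives a finite value.
    """
    if word_freq < 0 or tokens <= 0 or types < 0:
        raise ValueError("expected word_freq >= 0, tokens > 0, types >= 0")
    per_million = (word_freq + 1) / ((tokens + types) / 1_000_000)
    return math.log10(per_million) + 3.0
```

For example, under these assumptions a word that occurs 250 times in a corpus whose most frequent word occurs 1,000 times falls in frequency class 2, since the most frequent word is four (2^2) times as frequent.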
In another post, you can also find an “Afrikaans version” of the calculators below, plus some additional statistics. These calculators already have the frequency of the most frequent word and the number of word types included, based on the frequency counts in the corpora that are available in the VivA Korpusportaal. These frequencies/numbers are updated regularly.
A multitude of online calculators for corpus linguistics are available, such as Lancaster Stats Tools online, and Paul Rayson’s Log-likelihood and effect size calculator (to name but a few).
- Perkuhn, R., H. Keibel, and M. Kupietz. 2012. Korpuslinguistik. Paderborn: Wilhelm Fink Verlag.
- Van Heuven, W. J. B., P. Mandera, E. Keuleers, and M. Brysbaert. 2014. “Subtlex-UK: A new and improved word frequency database for British English.” Quarterly Journal of Experimental Psychology 67: 1176-1190.
In addition to the descriptions by the original authors, you can find descriptions in Afrikaans in the following publications:
- Regarding N: Van Huyssteen, Gerhard B. 2017b. “Die aard, doel en omvang van die Afrikaanse woordelys en spelreëls. Deel 1 [The nature, goal and scope of the Afrikaanse woordelys en spelreëls. Part 1].” Tydskrif vir Geesteswetenskappe 57 (2-1): 323-345. https://doi.org/10.17159/2224-7912/2017/v57n2-1a7.
- Regarding Z: Van Huyssteen, Gerhard B. 2018. “’n Korpusondersoek na ‘huidiglik’ [A corpus exploration of ‘huidiglik’].” Literator 39 (2): a1527. https://doi.org/10.4102/lit.v39i2.1527.