Tag: stoplist

Van Huyssteen 2021

The Afrikaans stoplist for corpus research (version 1.0) comprises a master list of 1,298 items, based on frequency counts in the Taalkommissie corpus 1.1. The list has been curated on the basis of relative word frequency classes and Zipf values. In addition, each item has been categorised in terms of length, typecase, selection category, lexical type (i.e. content or function word), and part-of-speech category. For ease of use, three subsets of the main list are also provided.
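
As a rough illustration of how such a list might be used, the sketch below computes a Zipf value under the commonly used definition (log10 of a word's frequency per billion tokens) and filters a token sequence against a stoplist. The function names, the sample tokens, and the two-word stoplist are hypothetical and are not taken from the published list; whether the stoplist itself was built with exactly this formula is an assumption.

```python
import math
from typing import Iterable, List


def zipf_value(count: int, corpus_size: int) -> float:
    """Zipf value of a word: log10 of its frequency per billion tokens.

    Assumes the common definition, equivalent to
    log10(frequency per million tokens) + 3.
    """
    if count <= 0 or corpus_size <= 0:
        raise ValueError("count and corpus_size must be positive")
    return math.log10(count) - math.log10(corpus_size) + 9


def remove_stopwords(tokens: Iterable[str], stoplist: Iterable[str]) -> List[str]:
    """Return the tokens that do not appear in the stoplist (case-insensitive)."""
    stop = {word.lower() for word in stoplist}
    return [tok for tok in tokens if tok.lower() not in stop]


# Hypothetical sample data, for illustration only.
sample_stoplist = ["die", "op"]
sample_tokens = ["Die", "kat", "sit", "op", "die", "mat"]
filtered = remove_stopwords(sample_tokens, sample_stoplist)
```

Matching is done on lowercased forms here for simplicity; since the list records typecase for each item, a case-sensitive comparison may be more appropriate for some research questions.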