Tag: stoplist

Van Huyssteen 2021

The Afrikaans stoplist for corpus research (version 1.0) comprises a master list of 1,298 items, based on frequency counts in the Taalkommissie corpus 1.1. The list has been curated based on relative word frequency classes and Zipf values. In addition, each item has been categorised in terms of length, typecase, selection category, lexical type (i.e. content or function word), and part-of-speech category. For ease of use, three subsets of the main list are also provided.
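To make the role of Zipf values more concrete, here is a minimal Python sketch of how a frequency-based stoplist could be derived and applied. It assumes the commonly used Zipf scale, in which the Zipf value is log10 of the frequency per million words plus 3; the excerpt above does not state which formula was used. The function names, the threshold, and the toy tokens are all hypothetical illustrations and are not taken from the published resource, which was also curated by hand using the other categories listed above.

```python
import math
from collections import Counter


def zipf_value(count, corpus_size):
    """Zipf value, assuming the common definition:
    log10(frequency per million words) + 3."""
    per_million = count * 1_000_000 / corpus_size
    return math.log10(per_million) + 3


def build_stoplist(tokens, threshold):
    """Return the set of lowercased word types whose Zipf value
    meets or exceeds the (hypothetical) threshold."""
    counts = Counter(t.lower() for t in tokens)
    total = sum(counts.values())
    return {w for w, c in counts.items() if zipf_value(c, total) >= threshold}


def remove_stopwords(tokens, stoplist):
    """Drop every token whose lowercased form is in the stoplist."""
    return [t for t in tokens if t.lower() not in stoplist]


# Toy example: a tiny made-up token sequence, not real corpus data.
toy_tokens = ["die"] * 50 + ["en"] * 30 + ["taal"]
toy_stoplist = build_stoplist(toy_tokens, threshold=8.0)
filtered = remove_stopwords(["Die", "taal", "en", "woorde"], toy_stoplist)
```

In a toy corpus this small, Zipf values are inflated for every word, so the threshold here is chosen only to separate the toy items; a realistic cut-off would have to be set against a full-size corpus.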

