Van Huyssteen, Gerhard B., and Martin J. Puttkammer. 2007. “Accelerating the annotation of lexical data for less-resourced languages.” Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007):1505-1508.
Van Huyssteen & Puttkammer 2007
Abstract
The development of digital resources is an expensive and time-consuming endeavor, especially in the case of less-resourced languages. In this paper, we describe a freely available, open-source system, called TurboAnnotate, for bootstrapping linguistic data for machine-learning purposes, or for manually creating gold standards or other annotated lists. A detailed description of the design and functionalities of the tool is given, focusing on how the tool addresses the requirements of end-users. It is indicated that TurboAnnotate promises not only to help increase the accuracy of human annotators, but also to save enormously on human effort in terms of time.
Written in:
English
Dealing with:
Afrikaans and Setswana
Keywords
machine learning, bootstrapping, linguistic data, TurboAnnotate, Afrikaans, Setswana
Afrikaans keywords
Afrikaans, masjienleer, Setswana, skoenlussteekproefneming, taalkundige data, TurboAnnotate