Pilon, Van Huyssteen & Augustinus 2010

Pilon, Suléne, Gerhard B. Van Huyssteen, and Liesbeth Augustinus. 2010. “Converting Afrikaans to Dutch for technology recycling.” Proceedings of the 2010 Conference of the Pattern Recognition Association of South Africa:219-224.

English: Afrikaans, closely-related languages, Dutch, human language technology, machine translation, POS tagging, recycling

Afrikaans: Afrikaans, hersiklering, masjienvertaling, mensetaaltegnologie, naby verwante tale, Nederlands, woordsoortetikettering

English: HLT resource development for a resource-scarce language (L2) can be expedited by recycling existing technologies for a closely related language (L1). To improve the success of L1 technologies on L2 data, one can convert L2 data to make it appear more L1-like. We explore this possibility by developing an Afrikaans-to-Dutch lexical conversion module and using it as a pre-processing step before applying a Dutch part-of-speech tagger to Afrikaans data. The accuracy of the Dutch tagger increased from 62.6% when tagging raw Afrikaans data to 80.6% when tagging converted Afrikaans data. We therefore conclude that, at least in the case of Dutch and Afrikaans, the use of lexical conversion as a pre-processing step for technology recycling merits further investigation.
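
Illustration (not from the paper): the pipeline described in the abstract can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' conversion module. The lexicon `AFR_TO_NLD`, the functions `convert_tokens` and `tag_afrikaans`, and the `dutch_tagger` callable are all hypothetical names introduced here, and the sketch assumes a one-to-one, word-for-word lookup so that the tags line up with the original tokens.

```python
# Hypothetical Afrikaans-to-Dutch lexicon; these few entries are illustrative
# only and do not come from the paper's conversion module.
AFR_TO_NLD = {
    "ek": "ik",
    "nie": "niet",
    "voël": "vogel",
    "skool": "school",
}


def convert_tokens(tokens, lexicon=AFR_TO_NLD):
    """Replace each Afrikaans token with a Dutch counterpart when the lexicon has one."""
    converted = []
    for token in tokens:
        dutch = lexicon.get(token.lower())
        if dutch is None:
            converted.append(token)  # unknown word: pass through unchanged
        elif token[:1].isupper():
            converted.append(dutch.capitalize())  # keep sentence-initial capital
        else:
            converted.append(dutch)
    return converted


def tag_afrikaans(tokens, dutch_tagger):
    """Tag Afrikaans tokens by running a Dutch tagger on their converted forms.

    `dutch_tagger` is an assumed callable that maps a list of tokens to a list
    of tags of the same length; any real tagger would need a thin wrapper.
    """
    tags = dutch_tagger(convert_tokens(tokens))
    return list(zip(tokens, tags))
```

Because conversion here is token-for-token, the tags returned for the Dutch-like input can be paired directly with the original Afrikaans tokens. A conversion that merges or splits words would need an explicit alignment step.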


Afrikaans: 

In: English

On: Afrikaans and Dutch