Van Huyssteen, Gerhard B., and Tiberius, Carole. 2023. Towards a lexical database of Dutch taboo language. Paper presented at eLex 2023, Brno, Czech Republic. https://elex.link/elex2023/
Design document: Version 2.0.0 available for commenting
Over the past 45 years, at least eighteen Dutch paper-based dictionaries of taboo language (or taboo-related language) have been published (i.e., as visible works of lexicography). However, none of these is available as (linked) lexical data that could be integrated into natural language processing (NLP) tools and applications (i.e., as invisible works of lexicography). In this paper, we describe the development of a comprehensive lexical database of taboo language (LDTL) for Dutch (TaboeLex) that can be integrated into NLP tools and applications. TaboeLex will be made available as open data, i.e., as a freely available, structured, annotated lexicon that can be linked to other data in the future. The paper focusses on the first phase of the project, namely defining and designing TaboeLex.
English
Dutch
Dutch, e-lexicography, lexical database, ontology, profanity, swearing, taboo language
e-Leksikografie, leksikale databasis, Nederlands, ontologie, skel, taboetaal, vloek