Content moderation on online platforms is usually opaque. On Wikipedia, however, moderation discussions are held in public, and editors are encouraged to cite content moderation policies when explaining their decisions. In practice, only a minority of comments explicitly mention those policies: 20% of English comments, but as few as 2% of German and Turkish comments. To help explain how content is moderated, we construct a novel multilingual dataset of Wikipedia editor discussions, together with the editors' reasoning, in three languages. For each edit decision, the dataset records the editor's stance (keep, delete, merge, or comment), the stated reason, and the content moderation policy invoked. We demonstrate that stance and the corresponding reason (policy) can be predicted jointly with a high degree of accuracy, which adds transparency to the decision-making process. We release both our joint prediction models and the multilingual content moderation dataset for further research on automated, transparent content moderation.
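As a rough illustration of the dataset structure and the joint prediction target described above, the following Python sketch represents one annotated comment and scores a prediction as correct only when both stance and policy match. Only the four stance names come from the abstract. The field names, the `"none"` placeholder for comments that cite no policy, the example comments, and the policy strings are assumptions made for illustration, and treating the (stance, policy) pair as a single label is just one simple way to frame joint prediction, not necessarily the approach taken by the released models.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# The four stances named in the abstract.
STANCES = ("keep", "delete", "merge", "comment")


@dataclass(frozen=True)
class Comment:
    """One editor comment in a moderation discussion (hypothetical schema)."""
    language: str           # assumed language code, e.g. "en", "de", "tr"
    text: str               # the editor's stated reason
    stance: str             # one of STANCES
    policy: Optional[str]   # cited policy, or None if none is mentioned


def joint_label(comment: Comment) -> Tuple[str, str]:
    """Combine stance and policy into one label for joint prediction."""
    if comment.stance not in STANCES:
        raise ValueError(f"unknown stance: {comment.stance!r}")
    # "none" is an assumed placeholder for comments without a policy.
    return (comment.stance, comment.policy or "none")


def joint_accuracy(gold: List[Tuple[str, str]],
                   pred: List[Tuple[str, str]]) -> float:
    """Fraction of items where stance AND policy are both correct."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Illustrative (invented) comments, not taken from the dataset.
comments = [
    Comment("en", "No reliable sources establish notability.",
            "delete", "Wikipedia:Notability"),
    Comment("en", "Well sourced, should stay.", "keep", None),
]
gold = [joint_label(c) for c in comments]
pred = [("delete", "Wikipedia:Notability"), ("keep", "Wikipedia:Verifiability")]
score = joint_accuracy(gold, pred)
```

Here the second prediction gets the stance right but the policy wrong, so it counts as an error under the joint metric. Scoring stance and policy separately would be a more lenient alternative.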