Automated content moderation for collaborative knowledge hubs like Wikipedia or Wikidata is an important yet challenging task due to multiple factors. In this paper, we construct a database of discussions about articles marked for deletion, drawn from several Wikis and covering three languages. We then use it to evaluate a range of language models (LMs) on different tasks, from predicting the outcome of a discussion to identifying the implicit policy an individual comment might be pointing to. Our results reveal, among other findings, that discussions leading to deletion are easier to predict and that, surprisingly, self-produced tags (keep, delete or redirect) do not always help guide the classifiers, presumably because of users' hesitation or deliberation within comments.