Detecting toxicity in online spaces is challenging, and the problem grows more pressing as social media and gaming consumption increase. We introduce ToxBuster, a simple and scalable model trained on a relatively large dataset of 194k lines of game chat from Rainbow Six Siege and For Honor, carefully annotated for different kinds of toxicity. Compared to the existing state of the art, ToxBuster achieves 82.95% precision (+7) and 83.56% recall (+57). This improvement is obtained by leveraging past chat history and metadata. We also study the implications for real-time and post-game moderation, as well as the model's transferability from one game to another.