The Gray Area: Characterizing Moderator Disagreement on Reddit

Volunteer moderators play a crucial role in sustaining online dialogue, but they often disagree about what should or should not be allowed. In this paper, we study the complexity of content moderation with a focus on disagreements between moderators, which we term the "gray area" of moderation. Leveraging 4.3 million moderation log entries spanning 5 years from 24 subreddits of different topics and sizes, we characterize how gray area (disputed) cases differ from undisputed cases. We show that one in seven moderation cases is disputed among moderators, often addressing transgressions where users' intent is not directly legible, such as trolling and brigading, as well as tensions around community governance. This is concerning, as almost half of all gray area cases involved automated moderation decisions. Through information-theoretic evaluations, we demonstrate that gray area cases are inherently harder to adjudicate than undisputed cases and show that state-of-the-art language models struggle to adjudicate them. We highlight the key role of expert human moderators in overseeing the moderation process and provide insights about the challenges of current moderation processes and tools.
