Online community moderators often rely on social signals such as whether or not a user has an account or a profile page as clues that users may cause problems. Reliance on these clues can lead to "overprofiling" bias when moderators focus on these signals but overlook the misbehavior of others. We propose that algorithmic flagging systems deployed to improve the efficiency of moderation work can also make moderation actions more fair to these users by reducing reliance on social signals and making norm violations by everyone else more visible. We analyze moderator behavior in Wikipedia as mediated by RCFilters, a system which displays social signals and algorithmic flags, and estimate the causal effect of being flagged on moderator actions. We show that algorithmically flagged edits are reverted more often, especially those by established editors with positive social signals, and that flagging decreases the likelihood that moderation actions will be undone. Our results suggest that algorithmic flagging systems can lead to increased fairness in some contexts but that the relationship is complex and contingent.