Beyond Trial-and-Error: Predicting User Abandonment After a Moderation Intervention

Current content moderation follows a reactive, trial-and-error approach, where interventions are applied and their effects are only measured post-hoc. In contrast, we introduce a proactive, predictive approach that enables moderators to anticipate the impact of their actions before implementation. We propose and tackle the new task of predicting user abandonment following a moderation intervention. We study the reactions of 16,540 users to a massive ban of online communities on Reddit, training a set of binary classifiers to identify those users who would abandon the platform after the intervention, a problem of great practical relevance. We leverage a dataset of 13.8 million posts to compute a large and diverse set of 142 features, which convey information about the activity, toxicity, relations, and writing style of the users. We obtain promising results, with the best-performing model achieving a micro F1-score of 0.914. Our model generalizes robustly when applied to users from previously unseen communities. Furthermore, we identify activity features as the most informative predictors, followed by relational and toxicity features, while writing style features exhibit limited utility. Theoretically, our results demonstrate the feasibility of adopting a predictive machine learning approach to estimate the effects of moderation interventions. Practically, this work marks a fundamental shift from reactive to predictive moderation, equipping platform administrators with intelligent tools to strategically plan interventions, minimize unintended consequences, and optimize user engagement.
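To make the reported metric concrete, the following is a minimal sketch of how a micro-averaged F1-score can be computed for a binary abandonment classifier. It is not the authors' evaluation code: the function name `micro_f1`, the label encoding (1 = abandoned, 0 = stayed), and the toy labels are assumptions introduced purely for illustration.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label predictions.

    True positives, false positives, and false negatives are pooled
    across all classes before precision and recall are combined.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # predicted class p, but it was wrong
            fn[t] += 1   # missed the true class t

    tp_sum = sum(tp.values())
    fp_sum = sum(fp.values())
    fn_sum = sum(fn.values())
    denom = 2 * tp_sum + fp_sum + fn_sum
    return 2 * tp_sum / denom if denom else 0.0


# Hypothetical toy labels (assumption): 1 = abandoned, 0 = stayed.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
score = micro_f1(y_true, y_pred)
print(f"micro F1 = {score:.3f}")
```

When every user receives exactly one label, as in this sketch, each misclassification counts once as a false positive and once as a false negative, so micro F1 coincides with plain accuracy.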
