The task of cultivating healthy communication in online communities becomes increasingly urgent, as gaming and social media experiences become progressively more immersive and life-like. We approach the challenge of moderating online communities by training student models using a large language model (LLM). We use zero-shot learning models to distill and expand datasets followed by a few-shot learning and a fine-tuning approach, leveraging open-access generative pre-trained transformer models (GPT) from OpenAI. Our preliminary findings suggest, that when properly trained, LLMs can excel in identifying actor intentions, moderating toxic comments, and rewarding positive contributions. The student models perform above-expectation in non-contextual assignments such as identifying classically toxic behavior and perform sufficiently on contextual assignments such as identifying positive contributions to online discourse. Further, using open-access models like OpenAI's GPT we experience a step-change in the development process for what has historically been a complex modeling task. We contribute to the information system (IS) discourse with a rapid development framework on the application of generative AI in content online moderation and management of culture in decentralized, pseudonymous communities by providing a sample model suite of industrial-ready generative AI models based on open-access LLMs.
翻译:培育健康在线社区交流的任务日益紧迫,随着游戏和社交媒体体验愈发沉浸式且贴近现实生活。我们通过使用大型语言模型(LLM)训练学生模型来应对在线社区内容审核的挑战。采用零样本学习模型对数据集进行提炼和扩展,继而运用少样本学习与微调方法,利用OpenAI提供的开放式生成式预训练Transformer模型(GPT)。初步研究结果表明,经过适当训练后,LLM能够出色地识别行为主体意图、审核有害言论并奖励积极贡献。学生模型在非语境化任务(如识别典型恶意行为)中表现超预期,在语境化任务(如识别对在线讨论的积极贡献)中也能胜任。此外,通过使用OpenAI GPT等开放模型,我们在开发流程上实现了阶跃式提升——而这一任务历来属于复杂建模范畴。我们通过提供一套基于开放型LLM、可直接部署的工业级生成式AI模型套件,为信息系统(IS)领域关于生成式AI在在线内容审核及去中心化匿名社区文化管理中的应用研究贡献了快速开发框架。