Mod-Guide: An LLM-based Content Moderation Feedback System to Address Insensitive Speech toward Indigenous Ethnic and Religious Minority Communities

Language operates as a mechanism of both marginalization and resistance, especially for minority communities navigating insensitive and harmful speech online. As content moderation increasingly depends on large language models (LLMs), concerns arise about whether these systems can recognize culturally insensitive speech, that is, language that disregards or marginalizes the cultural and religious perspectives of historically underrepresented communities, often through implicit erasure, misrepresentation, or normative framing rather than overt hostility. Focusing on Bangladesh's Hindu and Chakma communities (the country's largest religious and Indigenous ethnic minorities, respectively), this paper investigates the epistemic limits of LLM-based moderation systems and explores methods for incorporating minority perspectives. We co-created a culturally grounded corpus of insensitive speech with community members and integrated their narratives into moderation pipelines using retrieval-augmented generation (RAG). Our tool, Mod-Guide, improves LLM sensitivity to minority viewpoints by leveraging contextual cues derived from lived experience. Through mixed-method evaluations involving both minority and majority participants, we demonstrate that RAG-enhanced moderation responses are more contextually accurate and are perceived differently across ethnic lines. This work advances research in human-computer interaction, AI ethics, and social computing by foregrounding restorative justice and hermeneutical inclusion in the design of content moderation systems.
