Going PLACES: Participatory Localized Red Teaming for Text-to-Image Safety in the Global South

Charvi Rastogi, Mukul Bhutani, Minsuk Kahng, Shamsuddeen Hassan Muhammad, Evgeniia Razumovskaia, Priyanka Suresh, Ibrahim Said Ahmad, Charu Kalia, Yaaseen Mahomed, Madhurima Maji, Minjae Lee, Alicia Parrish, Jessica Quaye, Vijay Janapa Reddi, Aishwarya Verma, Lora Aroyo

From arXiv; published at ACM Conference on FAccT 2026

Despite the global deployment of text-to-image (T2I) models, their safety frameworks are largely calibrated to a Western-centric default, creating significant vulnerabilities for the rest of the world. To embrace cultural pluralism and bring historically under-represented perspectives into T2I safety, we conduct localized, community-centered red-teaming studies in the Global South. Our two-fold approach prioritizes localization and participation by focusing on secondary urban centers in these regions and by conducting community engagement and training workshops to contextualize local norms. As a result, we present PLACES, a dataset comprising over 26,000 examples of T2I model failures collected in partnership with universities in Ghana, Nigeria, and two regions of India (Karnataka and Punjab). Analysis of the collected prompts reveals wide-ranging diversity in socio-cultural and linguistic attributes compared to existing geography-agnostic crowdsourced red-teaming data. We observe unique adversarial patterns enabled by local cultural and linguistic nuances, and distinct clusters within regions around specific themes, such as religion in India. Moreover, we uncover structural contextual gaps in existing safety frameworks by identifying novel harms showing normative dissonance (e.g., violating religious norms, ignoring local customs, and ominous symbolism). This work argues that expanding T2I safety requires moving beyond mere scale to incorporate deeply localized, participatory methodologies for data collection and contextualization. Content warning: This paper includes examples containing potentially harmful or offensive content.
