Large language models (LLMs) are being deployed across the Global South, where everyday use involves low-resource languages, code-mixing, and culturally specific norms. Yet safety pipelines, benchmarks, and alignment efforts still largely target English and a handful of high-resource languages, implicitly assuming that safety and factuality ``transfer'' across languages. Evidence increasingly shows they do not. We synthesize recent findings indicating that (i) safety guardrails weaken sharply on low-resource and code-mixed inputs, (ii) culturally harmful behavior can persist even when standard toxicity scores look acceptable, and (iii) English-only knowledge edits and safety patches often fail to carry over to low-resource languages. In response, we outline a practical agenda for researchers and students in the Global South: parameter-efficient safety steering, culturally grounded evaluation and preference data, and participatory workflows that empower local communities to define and mitigate harm. Our aim is to make multilingual safety a core requirement, not an add-on, for equitable AI in underrepresented regions.