Large Language Models (LLMs) are increasingly being deployed in multilingual, multicultural settings, yet their reliance on predominantly English-centric training data risks misalignment with the diverse cultural values of different societies. In this paper, we present a comprehensive, multilingual audit of the cultural alignment of contemporary LLMs, including GPT-4o-Mini, Gemini-2.5-Flash, Llama 3.2, Mistral, and Gemma 3, across India, East Asia, and Southeast Asia. Our study focuses specifically on the sensitive domain of religion as a prism for broader alignment. To this end, we conduct a multi-faceted analysis of each LLM's internal representations, using log-probabilities and logits, to compare model opinion distributions against ground-truth public attitudes. We find that while the popular models generally align with public opinion on broad social issues, they consistently fail to accurately represent religious viewpoints, especially those of minority groups, and often amplify negative stereotypes. Lightweight interventions, such as demographic priming and native-language prompting, partially mitigate but do not eliminate these cultural gaps. We further show that downstream evaluations on bias benchmarks (CrowS-Pairs, IndiBias, ThaiCLI, KoBBQ) reveal persistent harms and under-representation in sensitive contexts. Our findings underscore the urgent need for systematic, regionally grounded audits to ensure equitable global deployment of LLMs.
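The log-probability comparison described above can be operationalized as a distributional distance between the model's induced answer distribution and the survey distribution. The following is a minimal sketch, assuming per-option first-token logits have already been extracted and using Jensen-Shannon divergence as the gap metric; all numeric values are hypothetical and for illustration only:

```python
import math

def softmax(logits):
    """Convert raw logits over answer options into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q), skipping zero-probability terms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Symmetric Jensen-Shannon divergence between two distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical logits the model assigned to the answer options
# ("Agree", "Neutral", "Disagree") for one survey question.
option_logits = [2.1, 0.3, -1.0]
model_dist = softmax(option_logits)

# Hypothetical ground-truth survey shares for the same options.
survey_dist = [0.40, 0.25, 0.35]

# Larger values indicate a wider model-vs-public opinion gap.
gap = js_divergence(model_dist, survey_dist)
```

Jensen-Shannon divergence is a natural choice here because it is symmetric and remains finite even when the model assigns (near-)zero mass to an option that real respondents chose; any other f-divergence or a Wasserstein-style distance could be substituted.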