Large Language Models (LLMs) are increasingly deployed in multilingual, multicultural settings, yet their reliance on predominantly English-centric training data risks misalignment with the diverse cultural values of different societies. In this paper, we present a comprehensive, multilingual audit of the cultural alignment of contemporary LLMs, including GPT-4o-Mini, Gemini-2.5-Flash, Llama 3.2, Mistral, and Gemma 3, across India, East Asia, and Southeast Asia. Our study focuses on the sensitive domain of religion as a lens on broader cultural alignment. To this end, we conduct a multi-faceted analysis of each LLM's internal representations, using log-probabilities and logits to compare each model's opinion distribution against ground-truth public attitudes. We find that while popular models generally align with public opinion on broad social issues, they consistently fail to represent religious viewpoints accurately, especially those of minority groups, and often amplify negative stereotypes. Lightweight interventions, such as demographic priming and native-language prompting, partially mitigate but do not eliminate these cultural gaps. We further show that downstream evaluations on bias benchmarks (such as CrowS-Pairs, IndiBias, ThaiCLI, and KoBBQ) reveal persistent harms and under-representation in sensitive contexts. Our findings underscore the urgent need for systematic, regionally grounded audits to ensure equitable global deployment of LLMs.
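The comparison between a model's opinion distribution and public attitudes can be illustrated with a minimal sketch. It assumes the model returns one log-probability per answer option of a survey question, and it uses the Jensen-Shannon divergence as the distance. Both the metric and all names here (`option_distribution`, `js_divergence`, the example numbers) are illustrative assumptions, not the measure or data used in this paper.

```python
import math


def option_distribution(logprobs):
    """Turn per-option log-probabilities into a normalized distribution (softmax)."""
    peak = max(logprobs.values())  # subtract the max for numerical stability
    weights = {opt: math.exp(lp - peak) for opt, lp in logprobs.items()}
    total = sum(weights.values())
    return {opt: w / total for opt, w in weights.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    options = set(p) | set(q)
    mix = {o: 0.5 * (p.get(o, 0.0) + q.get(o, 0.0)) for o in options}

    def kl(a):
        # KL(a || mix); terms with zero probability contribute nothing.
        return sum(a[o] * math.log2(a[o] / mix[o]) for o in a if a[o] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Hypothetical log-probabilities for the options of one survey question.
model_logprobs = {"agree": -0.4, "neutral": -1.6, "disagree": -2.3}
# Hypothetical survey shares for the same options (illustrative, not real data).
survey_shares = {"agree": 0.35, "neutral": 0.25, "disagree": 0.40}

model_dist = option_distribution(model_logprobs)
gap = js_divergence(model_dist, survey_shares)
print(f"JS divergence between model and survey: {gap:.3f}")
```

A lower value indicates closer agreement between the model's answer distribution and the survey distribution; repeating the computation per country, language, or religious group would yield one score per subgroup.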