Large language models (LLMs) are used worldwide, yet exhibit Western cultural tendencies. Many countries are now building ``regional'' or ``sovereign'' LLMs, but it remains unclear whether they reflect local values and practices or merely speak local languages. Using India as a case study, we evaluate six Indic and six global LLMs on two dimensions -- values and practices -- grounded in nationally representative surveys and community-sourced QA datasets. Across tasks, Indic models do not align better with Indian norms than global models; in fact, a U.S. respondent is a closer proxy for Indian values than any Indic model. We further run a user study with 115 Indian users and find that writing suggestions from both global and Indic LLMs introduce Westernized or exoticized writing. Prompting and regional fine-tuning fail to recover alignment and can even degrade existing knowledge. We attribute this to scarce culturally grounded data, especially for pretraining. We position cultural evaluation as a first-class requirement alongside multilingual benchmarks and offer a reusable, community-grounded methodology. We call for native, community-authored corpora and thick-and-wide evaluations to build truly sovereign LLMs.