The rising popularity of ChatGPT and other AI-powered large language models (LLMs) has prompted a growing number of studies highlighting their susceptibility to mistakes and biases. However, most of these studies focus on models trained on English texts. Taking an innovative approach, this study investigates political biases in GPT's multilingual models. We posed the same questions about high-profile political issues in the United States and China to GPT in both English and simplified Chinese. Our analysis of the bilingual responses revealed that the political "knowledge" (content) and political "attitude" (sentiment) of GPT's bilingual models are significantly more inconsistent on political issues in China than on those in the United States. The simplified Chinese GPT models not only tended to provide pro-China information but also presented the least negative sentiment towards China's problems, whereas the English GPT was significantly more negative towards China. This disparity may stem from Chinese state censorship and US-China geopolitical tensions, both of which influence the training corpora of GPT's bilingual models. Moreover, both the Chinese and English models tended to be less critical of issues in "their own" country, as represented by the language used, than of issues in "the other" country. This suggests that GPT's multilingual models could develop a "political identity" and an associated sentiment bias based on their training language. We discuss the implications of our findings for information transmission and communication in an increasingly divided world.