Although nationality is a pivotal demographic attribute that can enhance language model performance, the biases it carries have received far less scrutiny. This study investigates nationality bias in ChatGPT (GPT-3.5), a large language model (LLM) designed for text generation. Covering 195 countries, 4 temperature settings, and 3 distinct prompt types, the study generated 4,680 discourses describing nationalities in Chinese and English. Automated metrics were used to analyze nationality bias, and expert annotators, alongside ChatGPT itself, evaluated perceived bias. The results show that ChatGPT's generated discourses are predominantly positive, especially compared with its predecessor, GPT-2. When prompted with negative inclinations, however, it occasionally produces negative content. Although ChatGPT rates its own generated text as neutral, it shows consistent self-awareness of nationality bias when subjected to the same pair-wise comparison annotation framework used by the human annotators. In conclusion, while ChatGPT's generated texts appear friendly and positive, they reflect real-world nationality biases. These biases may vary across different language versions of ChatGPT, reflecting diverse cultural perspectives. The study highlights the subtle and pervasive nature of biases within LLMs and emphasizes the need for further scrutiny.
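The generation budget stated above can be checked by enumerating the full experimental grid. The sketch below is illustrative only: the country placeholders, specific temperature values, and prompt-type labels are assumptions, since the abstract does not list the exact settings used.

```python
from itertools import product

# Hypothetical reconstruction of the experimental grid from the abstract.
# Country names, temperature values, and prompt labels are stand-ins,
# not the paper's actual settings.
countries = [f"country_{i}" for i in range(195)]    # 195 countries
temperatures = [0.0, 0.5, 1.0, 1.5]                 # 4 temperature settings (assumed values)
prompt_types = ["neutral", "positive", "negative"]  # 3 prompt inclinations (assumed labels)
languages = ["en", "zh"]                            # English and Chinese

grid = list(product(countries, temperatures, prompt_types, languages))
print(len(grid))  # 195 * 4 * 3 * 2 = 4680 generation requests
```

One discourse per grid cell reproduces the 4,680 total reported in the study.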