Large language models (LLMs) have rapidly become indispensable tools for acquiring information and supporting human decision-making. However, ensuring that these models uphold fairness across varied contexts is critical to their safe and responsible deployment. In this study, we undertake a comprehensive examination of four widely adopted LLMs, probing their underlying biases and inclinations across the dimensions of politics, ideology, alliance, language, and gender. Through a series of carefully designed experiments, we investigate their political neutrality using news summarization, ideological biases through news stance classification, tendencies toward specific geopolitical alliances via United Nations voting patterns, language bias in the context of multilingual story completion, and gender-related affinities as revealed by responses to the World Values Survey. Results indicate that while the LLMs are aligned to be neutral and impartial, they still show biases and affinities of different types.