Large language models (LLMs) have rapidly become indispensable tools for acquiring information and supporting human decision-making. However, ensuring that these models uphold fairness across varied contexts is critical to their safe and responsible deployment. In this study, we undertake a comprehensive examination of four widely adopted LLMs, probing their underlying biases and inclinations along the dimensions of politics, ideology, alliance, language, and gender. Through a series of carefully designed experiments, we investigate their political neutrality using news summarization, ideological bias through news stance classification, tendencies toward specific geopolitical alliances via United Nations voting patterns, language bias in the context of multilingual story completion, and gender-related affinities as revealed by responses to the World Values Survey. The results indicate that, although these LLMs have been aligned to be neutral and impartial, they still exhibit biases and affinities of various kinds.