The rapid advancement of Large Language Models (LLMs) and their potential integration into autonomous driving systems necessitates understanding their moral decision-making capabilities. While our previous study examined four prominent LLMs using the Moral Machine experimental framework, the dynamic landscape of LLM development demands a more comprehensive analysis. Here, we evaluate moral judgments across 51 different LLMs, including multiple versions of proprietary models (GPT, Claude, Gemini) and open-source alternatives (Llama, Gemma), to assess their alignment with human moral preferences in autonomous driving scenarios. Using a conjoint analysis framework, we evaluated how closely LLM responses aligned with human preferences in ethical dilemmas and examined the effects of model size, updates, and architecture. Results showed that proprietary models and open-source models exceeding 10 billion parameters demonstrated relatively close alignment with human judgments, with a significant negative correlation between model size and distance from human judgments in open-source models. However, model updates did not consistently improve alignment with human preferences, and many LLMs showed excessive emphasis on specific ethical principles. These findings suggest that while increasing model size may naturally lead to more human-like moral judgments, practical implementation in autonomous driving systems requires careful consideration of the trade-off between judgment quality and computational efficiency. Our comprehensive analysis provides crucial insights for the ethical design of autonomous systems and highlights the importance of considering cultural contexts in AI moral decision-making.