In this paper, we conduct an empirical analysis of how large language models (LLMs), specifically GPT-4, interpret constitutional principles in complex decision-making scenarios. We examine rulings of the Italian Constitutional Court on bioethics issues that involve trade-offs between competing values, and compare model-generated legal arguments on these issues to those presented by the State, the Court, and the applicants. Our results indicate that GPT-4 consistently aligns with progressive interpretations of the Constitution, often overlooking competing values and mirroring the applicants' positions rather than the State's more conservative perspective or the Court's moderate stance. This marked tendency to favor progressive legal interpretations points to biases in the underlying training data. We therefore stress the importance of testing alignment in real-world scenarios and of weighing the implications of deploying LLMs in decision-making processes.