Recent advances in NLP show that language models retain a discernible level of knowledge of deontological ethics and moral norms. However, existing work often treats morality as binary, reducing judgments to either right or wrong. This simplistic view does not capture the nuances of moral judgment. Pluralist moral philosophers argue that human morality can be deconstructed into a finite number of elements while respecting individual differences in moral judgment. In line with this view, we build a pluralist moral sentence embedding space via a state-of-the-art contrastive learning approach. We systematically investigate the embedding space by studying the emergence of relationships among moral elements, both quantitatively and qualitatively. Our results show that a pluralist approach to morality can be captured in an embedding space. However, moral pluralism is challenging to deduce via self-supervision alone and requires a supervised approach with human labels.