Identifying human morals and values embedded in language is essential to empirical studies of communication. However, researchers often face substantial difficulty navigating the diversity of theoretical frameworks and data available for their analysis. Here, we contribute MoVa, a well-documented suite of resources for generalizable classification of human morals and values, consisting of (1) 16 labeled datasets and benchmarking results from four theoretically grounded frameworks; (2) a lightweight LLM prompting strategy that outperforms fine-tuned models across multiple domains and frameworks; and (3) a new application that helps evaluate psychological surveys. In practice, we specifically recommend a classification strategy, all@once, that scores all related concepts simultaneously, resembling the well-known multi-label classifier chain. The data and methods in MoVa can facilitate many fine-grained interpretations of human and machine communication, with potential implications for the alignment of machine behavior.
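To make the all@once idea concrete, the following is a minimal sketch of what scoring all concepts in a single prompt could look like. It is an illustration under stated assumptions, not the prompt or code used in MoVa: the concept list, the prompt wording, the 0 to 1 scale, the JSON reply format, and the helper names `build_all_at_once_prompt` and `parse_scores` are all hypothetical, and the model reply is a hard-coded stand-in rather than real output.

```python
import json

# Hypothetical label set for illustration only; not the label set used in MoVa.
CONCEPTS = ["care", "fairness", "loyalty", "authority", "sanctity"]


def build_all_at_once_prompt(text, concepts):
    """Build one prompt that asks for a score for every concept at the same time."""
    concept_list = "\n".join(f"- {c}" for c in concepts)
    return (
        "Rate how strongly the text expresses each concept below, "
        "from 0 (absent) to 1 (clearly present).\n"
        f"Concepts:\n{concept_list}\n"
        f"Text: {text}\n"
        "Answer with a single JSON object mapping each concept to its score."
    )


def parse_scores(response, concepts):
    """Parse a JSON reply; concepts missing from the reply default to 0.0."""
    raw = json.loads(response)
    return {c: float(raw.get(c, 0.0)) for c in concepts}


prompt = build_all_at_once_prompt("We must protect the vulnerable.", CONCEPTS)

# Stand-in for a model reply; a real run would send `prompt` to an LLM.
fake_reply = '{"care": 0.9, "fairness": 0.3}'
scores = parse_scores(fake_reply, CONCEPTS)
```

By contrast, a one-concept-at-a-time strategy would issue a separate prompt per concept, so the model would never see the related concepts side by side when assigning each score.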