Value Engineering for Autonomous Agents

Machine Ethics (ME) is concerned with the design of Artificial Moral Agents (AMAs), i.e., autonomous agents capable of reasoning and behaving according to moral values. Previous approaches have treated values as labels attached to certain actions or states of the world, rather than as integral components of agent reasoning. They also commonly disregard the fact that a value-guided agent operates alongside other value-guided agents in an environment governed by norms, and so omit the social dimension of AMAs. In this blue sky paper, we propose a new AMA paradigm grounded in moral and social psychology, in which values are instilled into agents as context-dependent goals. These goals connect values at the individual level to norms at the collective level by evaluating the outcomes most incentivized by the norms in place. We argue that this type of normative reasoning, in which agents are endowed with an understanding of the moral implications of norms, leads to value-awareness in autonomous agents. This capability also paves the way for agents to align the norms enforced in their societies with the human values instilled in them: value-based reasoning about norms is complemented with agreement mechanisms that help agents collectively agree on the set of norms that best suits their human values. Overall, our agent model goes beyond the treatment of values as inert labels by connecting them to normative reasoning and to the social functionalities needed to integrate value-aware agents into modern hybrid human-computer societies.
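As an illustration only, the pipeline described above (values as context-dependent goals, norms judged by the outcomes they incentivize, and an agreement step over candidate norms) might be sketched as follows. Everything in this sketch is an assumption made for the example and is not taken from the paper: the names `Norm`, `Agent`, `safety`, `efficiency` and `borda_agreement`, the speed-limit scenario, the numeric weights, and the choice of a Borda count as the agreement mechanism.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical representations, chosen only for this sketch.
Outcome = Dict[str, float]             # a world state, e.g. {"speed": 30.0}
Goal = Callable[[Outcome, str], float]  # satisfaction in [0, 1] given a context


def safety(outcome: Outcome, context: str) -> float:
    """Value 'safety' as a context-dependent goal: stricter near a school."""
    tolerance = 60.0 if context == "school_zone" else 300.0
    return max(0.0, 1.0 - outcome["speed"] / tolerance)


def efficiency(outcome: Outcome, context: str) -> float:
    """Value 'efficiency': faster travel satisfies it more, in any context."""
    return min(1.0, outcome["speed"] / 120.0)


VALUES: Dict[str, Goal] = {"safety": safety, "efficiency": efficiency}


@dataclass
class Norm:
    name: str
    # Assumed to be given: the outcome this norm most incentivizes.
    incentivized_outcome: Outcome


@dataclass
class Agent:
    name: str
    weights: Dict[str, float]  # relative importance of each value

    def evaluate(self, norm: Norm, values: Dict[str, Goal], context: str) -> float:
        """Score a norm by how well its incentivized outcome meets the agent's goals."""
        return sum(
            weight * values[value_name](norm.incentivized_outcome, context)
            for value_name, weight in self.weights.items()
        )


def borda_agreement(agents: List[Agent], norms: List[Norm],
                    values: Dict[str, Goal], context: str) -> str:
    """Borda count: each agent gives 0 points to its worst norm, 1 to the next, etc.

    Ties in total points are broken by the order in which norms were listed.
    """
    points = {norm.name: 0 for norm in norms}
    for agent in agents:
        ranked = sorted(norms, key=lambda n: agent.evaluate(n, values, context))
        for rank, norm in enumerate(ranked):
            points[norm.name] += rank
    return max(points, key=points.get)


NORMS = [Norm("limit_30", {"speed": 30.0}), Norm("limit_90", {"speed": 90.0})]
AGENTS = [
    Agent("cautious", {"safety": 0.8, "efficiency": 0.2}),
    Agent("hurried", {"safety": 0.2, "efficiency": 0.8}),
    Agent("balanced", {"safety": 0.6, "efficiency": 0.4}),
]

if __name__ == "__main__":
    for ctx in ("school_zone", "highway"):
        print(ctx, "->", borda_agreement(AGENTS, NORMS, VALUES, ctx))
```

In this toy setup the same agents, holding the same value weights, settle on different norms in the two contexts, which is the sense in which the goals are context-dependent. A Borda count is only one possible agreement mechanism; any other aggregation or negotiation procedure could be substituted without changing the rest of the sketch.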
