A multiagent system is a society of autonomous agents whose interactions can be regulated via social norms. In general, the norms of a society are not hardcoded but emerge from the agents' interactions. Specifically, how the agents in a society react to each other's behavior and respond to the reactions of others determines which norms emerge. We treat an agent's reaction to another agent's satisfactory or unsatisfactory behavior as a communication from the first agent to the second. Understanding these communications is a kind of social intelligence: they provide natural drivers for norm emergence by pushing agents toward certain behaviors, which can become established as norms. Whereas it is well known that sanctioning can lead to the emergence of norms, we posit that a broader kind of social intelligence can prove more effective in promoting cooperation in a multiagent system. Accordingly, we develop Nest, a framework that models social intelligence via a wider variety of communications, and of ways of understanding them, than previous work. To evaluate Nest, we develop a simulated pandemic environment and conduct simulation experiments that compare Nest with baselines considering a combination of three kinds of social communication: sanction, tell, and hint. We find that societies formed of Nest agents achieve norms faster than societies of baseline agents. Moreover, Nest agents effectively avoid undesirable consequences, namely negative sanctions and deviation from goals, and yield higher satisfaction for themselves than baseline agents despite requiring only an equivalent amount of information.