We present a multi-agent decision-making framework for the emergent coordination of autonomous agents whose intents are initially undecided. Dynamic non-cooperative games have been used to encode multi-agent interaction, but ambiguity arising from factors such as goal preference or the presence of multiple equilibria may lead to coordination issues, ranging from the "freezing robot" problem to unsafe behavior in safety-critical events. The recently developed nonlinear opinion dynamics (NOD) provide guarantees for breaking deadlocks. However, automatically choosing appropriate model parameters in general multi-agent settings remains a challenge. In this paper, we first propose a novel and principled procedure for synthesizing NOD based on the value functions of dynamic games conditioned on agents' intents. In particular, for the two-player, two-option case, we provide precise stability conditions for equilibria of the game-induced NOD based on the mismatch between agents' opinions and their game values. We then propose a trajectory optimization algorithm that computes agents' policies guided by the evolution of opinions. The efficacy of our method is illustrated with a simulated toll station coordination example.