This paper develops a Distributed Differentiable Dynamic Game (D3G) framework that enables learning multi-robot coordination from demonstrations. We represent multi-robot coordination as a dynamic game in which each robot's behavior is dictated by its own dynamics and by an objective that also depends on the other robots' behavior. Coordination can thus be adapted by tuning the objective and dynamics of each robot. D3G enables each robot to tune its individual dynamics and objective automatically and in a distributed manner by minimizing the mismatch between its trajectory and the demonstrations. The learning framework features a new design with two passes: a forward pass, in which all robots collaboratively seek a Nash equilibrium of the game, and a backward pass, in which gradients are propagated via the communication graph. We test D3G in simulation with two types of robots under different task configurations. The results validate the capability of D3G to learn multi-robot coordination from demonstrations.
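To make the forward-pass and backward-pass idea concrete, the following is a minimal sketch on a toy problem, not the D3G algorithm itself. It assumes a static two-robot game with scalar actions, a hypothetical quadratic cost J_i = 0.5 (u_i - theta_i)^2 + 0.5 C (u_i - u_j)^2, and a centralized gradient computation; the names `C`, `nash_equilibrium`, `equilibrium_gradient`, and `learn` are illustrative and do not come from the paper. The forward pass finds the Nash equilibrium by best-response iteration, and the backward pass differentiates the demonstration mismatch through the equilibrium condition.

```python
import numpy as np

C = 0.5  # hypothetical coupling weight between the two robots


def nash_equilibrium(theta, iters=100):
    """Forward pass: simultaneous best-response iteration.

    Setting dJ_i/du_i = 0 gives u_i = (theta_i + C * u_j) / (1 + C).
    """
    u = np.zeros(2)
    for _ in range(iters):
        # u[::-1] swaps the entries, so each robot reacts to the other's action.
        u = (theta + C * u[::-1]) / (1.0 + C)
    return u


def equilibrium_gradient(theta, u_demo):
    """Backward pass: gradient of 0.5 * ||u* - u_demo||^2 w.r.t. theta."""
    u_star = nash_equilibrium(theta)
    # At equilibrium A @ u* = theta, hence du*/dtheta = A^{-1}.
    A = np.array([[1.0 + C, -C], [-C, 1.0 + C]])
    return np.linalg.solve(A.T, u_star - u_demo)


def learn(u_demo, lr=0.5, steps=300):
    """Gradient descent on the objective parameters theta."""
    theta = np.zeros(2)
    for _ in range(steps):
        theta -= lr * equilibrium_gradient(theta, u_demo)
    return theta


# Demonstration generated from assumed ground-truth parameters.
theta_true = np.array([1.0, -2.0])
u_demo = nash_equilibrium(theta_true)
theta_hat = learn(u_demo)
```

In this toy setting the equilibrium map is linear and invertible, so matching the demonstration recovers the parameters. The framework described above differs in that it handles dynamic games over trajectories and propagates gradients between robots over the communication graph rather than solving a single centralized linear system.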