Recent advances in multi-agent reinforcement learning (MARL) are enabling impressive coordination in heterogeneous multi-robot teams. However, existing approaches often overlook the challenge of generalizing learned policies to teams with new compositions, sizes, and robots. While such generalization might not be important in teams of virtual agents that can retrain policies on demand, it is pivotal in multi-robot systems that are deployed in the real world and must readily adapt to inevitable changes. As such, multi-robot policies must remain robust to team changes, an ability we call adaptive teaming. In this work, we investigate whether awareness and communication of robot capabilities can provide such generalization by conducting detailed experiments on an established multi-robot testbed. We demonstrate that shared decentralized policies that enable robots to both be aware of and communicate their capabilities can achieve adaptive teaming by implicitly capturing the fundamental relationship between collective capabilities and effective coordination. Videos of trained policies can be viewed at: https://sites.google.com/view/cap-comm