Learning collaborative behaviors is essential for multi-agent systems. Traditionally, multi-agent reinforcement learning addresses this implicitly through a joint reward and centralized observations, assuming that collaborative behavior will emerge. Other studies propose learning from demonstrations by a group of collaborative experts. Instead, we propose an efficient and explicit way of learning collaborative behaviors in multi-agent systems that leverages the expertise of only a single human. Our insight is that humans can naturally take on various roles in a team. We show that agents can effectively learn to collaborate when a human operator dynamically switches between agents, controlling each for a short period, and when the agents incorporate a human-like theory-of-mind model of their teammates. Our experiments show that our method improves the success rate on a challenging collaborative hide-and-seek task by up to 58% with only 40 minutes of human guidance. We further demonstrate that our findings transfer to the real world by conducting multi-robot experiments.