Due to the complex interactions between agents, learning a multi-agent control policy often requires a prohibitive amount of data. This paper aims to enable multi-agent systems to effectively leverage past memories to adapt to novel collaborative tasks in a data-efficient fashion. We propose the Multi-Agent Coordination Skill Database, a repository that stores a collection of coordinated behaviors together with key vectors distinctive to each of them. Our Transformer-based skill encoder effectively captures the spatio-temporal interactions that contribute to coordination and provides a unique skill representation for each coordinated behavior. Given only a small number of demonstrations of the target task, the database enables us to train the policy on a dataset augmented with the retrieved demonstrations. Experimental evaluations demonstrate that our method achieves a significantly higher success rate in push-manipulation tasks than baseline methods such as few-shot imitation learning. Furthermore, we validate the effectiveness of our retrieve-and-learn framework in a real environment using a team of wheeled robots.