Incorporating symmetry as an inductive bias into multi-agent reinforcement learning (MARL) has led to improvements in generalization, data efficiency, and physical consistency. While prior research has succeeded in exploiting perfect symmetry priors, partial symmetry in the multi-agent domain remains unexplored. To fill this gap, we introduce the partially symmetric Markov game, a new subclass of the Markov game. We then show theoretically that the performance error introduced by exploiting symmetry in MARL is bounded, implying that the symmetry prior can remain useful even when symmetry holds only partially. Motivated by this insight, we propose the Partial Symmetry Exploitation (PSE) framework, which adaptively incorporates the symmetry prior in MARL under different symmetry-breaking conditions. Specifically, by adaptively adjusting how strongly symmetry is exploited, the framework improves the sample efficiency and overall performance of MARL algorithms. Extensive experiments demonstrate that the proposed framework outperforms the baselines. Finally, we implement the framework on a real-world multi-robot testbed to show its superiority.