Expected Return Symmetries

Symmetry is an important inductive bias that can improve model robustness and generalization across many deep learning domains. In multi-agent settings, a priori known symmetries have been shown to address a fundamental coordination failure mode known as mutually incompatible symmetry breaking. For example, consider a game in which two independent agents each choose to move "left" or "right" and receive a reward of +1 when they choose the same action and -1 when they choose different actions. However, the efficient and automatic discovery of environment symmetries, in particular for decentralized partially observable Markov decision processes, remains an open problem. Furthermore, environmental symmetry breaking constitutes only one type of coordination failure, which motivates the search for a more accessible and broader symmetry class. In this paper, we introduce such a broader group of previously unexplored symmetries, which we call expected return symmetries and which contains environment symmetries as a subgroup. We show that agents trained to be compatible under the group of expected return symmetries achieve better zero-shot coordination results than those using environment symmetries. As an additional benefit, our method makes minimal a priori assumptions about the structure of the environment and does not require access to ground truth symmetries.
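The left/right game in the abstract can be made concrete with a small sketch. This is only an illustration of mutually incompatible symmetry breaking, not the paper's method; the names `reward`, `swap`, `transform`, and `expected_return` are hypothetical helpers introduced here, and policies are assumed to be simple action-probability dictionaries.

```python
# Toy illustration (assumed setup): two agents each pick "left" or "right";
# reward is +1 if the actions match and -1 otherwise.
ACTIONS = ("left", "right")


def reward(a1, a2):
    """Joint reward of the matching game described in the abstract."""
    return 1 if a1 == a2 else -1


def swap(action):
    """Relabeling symmetry of the game: exchange 'left' and 'right'."""
    return "right" if action == "left" else "left"


def transform(policy):
    """Apply the relabeling symmetry to a policy (dict: action -> probability)."""
    return {swap(a): p for a, p in policy.items()}


def expected_return(policy1, policy2):
    """Expected reward when the two agents sample their actions independently."""
    return sum(
        policy1[a1] * policy2[a2] * reward(a1, a2)
        for a1 in ACTIONS
        for a2 in ACTIONS
    )


# The relabeling leaves the reward unchanged, so it is a symmetry of the game.
reward_is_invariant = all(
    reward(swap(a1), swap(a2)) == reward(a1, a2)
    for a1 in ACTIONS
    for a2 in ACTIONS
)

# Two independently trained teams may break the symmetry in opposite ways.
always_left = {"left": 1.0, "right": 0.0}
always_right = transform(always_left)

self_play_left = expected_return(always_left, always_left)
self_play_right = expected_return(always_right, always_right)
cross_play = expected_return(always_left, always_right)

# A policy that is invariant under the symmetry cannot exploit the labels.
uniform = {"left": 0.5, "right": 0.5}
symmetric_play = expected_return(uniform, uniform)

print(reward_is_invariant, self_play_left, self_play_right, cross_play, symmetric_play)
```

Both deterministic conventions are optimal in self-play, yet pairing an "always left" agent with an "always right" agent yields the worst possible return. That gap between self-play and cross-play is the coordination failure that symmetry-aware training aims to remove.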


