Embodied agents operate in a structured world, often solving tasks with spatial, temporal, and permutation symmetries. Most algorithms for planning and model-based reinforcement learning (MBRL) do not take this rich geometric structure into account, leading to sample inefficiency and poor generalization. We introduce the Equivariant Diffuser for Generating Interactions (EDGI), an algorithm for MBRL and planning that is equivariant with respect to the product of the spatial symmetry group $\mathrm{SE(3)}$, the discrete-time translation group $\mathbb{Z}$, and the object permutation group $\mathrm{S}_n$. EDGI follows the Diffuser framework (Janner et al. 2022) in treating both learning a world model and planning in it as a conditional generative modeling problem, training a diffusion model on an offline trajectory dataset. We introduce a new $\mathrm{SE(3)} \times \mathbb{Z} \times \mathrm{S}_n$-equivariant diffusion model that supports multiple representations. We integrate this model in a planning loop, where conditioning and classifier-based guidance allow us to softly break the symmetry for specific tasks as needed. On navigation and object manipulation tasks, EDGI improves sample efficiency and generalization.
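As a rough illustration of what equivariance under $\mathrm{SE(3)} \times \mathbb{Z} \times \mathrm{S}_n$ means for trajectory data, the sketch below numerically checks the property for a toy map. This is not the EDGI architecture: `toy_equivariant_layer`, `random_rotation`, and `equivariance_errors` are hypothetical names introduced here. It assumes a trajectory is stored as an array of shape `(T, n, 3)` (time steps, objects, 3D positions), and that a circular shift along the time axis is an acceptable finite stand-in for the discrete-time translation group.

```python
import numpy as np


def toy_equivariant_layer(x, a=0.7):
    """Toy map on a trajectory x of shape (T, n, 3); NOT the EDGI network.

    Illustrative assumption: each entry x[t, i] is the 3D position of
    object i at time step t.
    """
    # Mixing each object with the per-step mean over objects commutes with
    # object permutations. The weights sum to 1, so a global translation
    # passes through unchanged, and the map is linear, so rotations commute.
    obj_mean = x.mean(axis=1, keepdims=True)
    mixed = a * x + (1.0 - a) * obj_mean
    # Circular temporal smoothing (weights again sum to 1) commutes with
    # circular shifts along the time axis.
    return (0.5 * mixed
            + 0.25 * np.roll(mixed, 1, axis=0)
            + 0.25 * np.roll(mixed, -1, axis=0))


def random_rotation(rng):
    """Sample a 3x3 rotation matrix (orthogonal, determinant +1) via QR."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # fix the sign ambiguity of the QR factors
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]       # flip one axis to land in SO(3)
    return q


def equivariance_errors(seed=0, T=6, n=4):
    """Max absolute deviation from equivariance for each group factor."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(T, n, 3))
    y = toy_equivariant_layer(x)

    R, t = random_rotation(rng), rng.normal(size=3)
    perm = rng.permutation(n)
    shift = 2

    return {
        # SE(3): rotate and translate every position
        "SE(3)": np.abs(toy_equivariant_layer(x @ R.T + t) - (y @ R.T + t)).max(),
        # S_n: relabel the objects
        "S_n": np.abs(toy_equivariant_layer(x[:, perm]) - y[:, perm]).max(),
        # Z: shift the time index (circularly, as a finite stand-in)
        "Z": np.abs(toy_equivariant_layer(np.roll(x, shift, axis=0))
                    - np.roll(y, shift, axis=0)).max(),
    }


if __name__ == "__main__":
    for group, err in equivariance_errors().items():
        print(f"{group}: max deviation {err:.2e}")
```

Each reported deviation should sit at floating-point noise level, meaning that transforming the input and then applying the map gives the same result as applying the map and then transforming the output. A map that, for example, added a fixed offset to one particular object would fail the $\mathrm{S}_n$ check.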