Embodied agents operate in a structured world, often solving tasks with spatial, temporal, and permutation symmetries. Most algorithms for planning and model-based reinforcement learning (MBRL) do not take this rich geometric structure into account, leading to sample inefficiency and poor generalization. We introduce the Equivariant Diffuser for Generating Interactions (EDGI), an algorithm for MBRL and planning that is equivariant with respect to the product of the spatial symmetry group SE(3), the discrete-time translation group Z, and the object permutation group S_n. EDGI follows the Diffuser framework (Janner et al., 2022) in treating both learning a world model and planning in it as a conditional generative modeling problem, training a diffusion model on an offline trajectory dataset. We introduce a new SE(3) × Z × S_n-equivariant diffusion model that supports multiple representations. We integrate this model into a planning loop, where conditioning and classifier guidance let us softly break the symmetry for specific tasks as needed. On object manipulation and navigation tasks, EDGI is substantially more sample efficient and generalizes better across the symmetry group than non-equivariant models.
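To make the notion of equivariance concrete, the following minimal sketch checks numerically that a toy layer commutes with SE(3) transformations and object permutations. It is not the EDGI architecture: the layer `toy_equivariant_layer`, its mixing weight `a`, and the helper `random_rotation` are hypothetical names introduced only for illustration, and the time-translation group Z is left out for brevity.

```python
import numpy as np


def toy_equivariant_layer(x, a=0.7):
    """Toy layer on object positions x of shape (n_objects, 3).

    Each point is mixed with the centroid. Because the two weights sum
    to 1, the map commutes with rotations, translations, and permutations
    of the objects. (Illustrative only; not the layer used in EDGI.)
    """
    centroid = x.mean(axis=0, keepdims=True)
    return a * x + (1.0 - a) * centroid


def random_rotation(rng):
    """Sample a 3x3 rotation matrix (orthogonal, determinant +1) via QR."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # remove the sign ambiguity of QR
    if np.linalg.det(q) < 0:         # reflect one axis to land in SO(3)
        q[:, 0] = -q[:, 0]
    return q


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))          # 5 objects in 3D
R = random_rotation(rng)
t = rng.normal(size=3)
perm = rng.permutation(5)

# SE(3) equivariance: transforming the input equals transforming the output.
se3_ok = np.allclose(
    toy_equivariant_layer(x @ R.T + t),
    toy_equivariant_layer(x) @ R.T + t,
)

# S_n equivariance: permuting the objects permutes the outputs the same way.
perm_ok = np.allclose(
    toy_equivariant_layer(x[perm]),
    toy_equivariant_layer(x)[perm],
)

print("SE(3)-equivariant:", se3_ok)
print("S_n-equivariant:", perm_ok)
```

A model built only from layers with this property needs to see a task in a single pose and object ordering to handle all rotated, translated, and relabeled versions of it, which is the intuition behind the sample-efficiency and generalization claims.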