Diffusion policies are a powerful paradigm for robot learning, but their training is often inefficient. A key reason is that networks must relearn fundamental spatial concepts, such as translations and rotations, from scratch for every new task. To alleviate this redundancy, we propose embedding geometric inductive biases directly into the network architecture using Projective Geometric Algebra (PGA). PGA provides a unified algebraic framework for representing geometric primitives and transformations, allowing neural networks to reason about spatial structure more effectively. In this paper, we introduce hPGA-DP, a novel hybrid diffusion policy that capitalizes on these benefits. Our architecture leverages the Projective Geometric Algebra Transformer (P-GATr) as a state encoder and action decoder, while employing established U-Net or Transformer-based modules for the core denoising process. Through extensive experiments and ablation studies in both simulated and real-world environments, we demonstrate that hPGA-DP significantly improves task performance and training efficiency. Notably, our hybrid approach achieves substantially faster convergence compared to both standard diffusion policies and architectures that rely solely on P-GATr. The project website is available at: https://apollo-lab-yale.github.io/26-ICRA-hPGA-website/.
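The hybrid structure described above (a geometric state encoder and action decoder wrapped around a conventional denoising core) can be sketched as a data-flow skeleton. Everything below is an illustrative assumption, not the paper's implementation: the dimensions, the random linear maps standing in for P-GATr and the U-Net/Transformer core, and the single reverse-diffusion step are placeholders chosen only to show how the three modules compose.

```python
import numpy as np

# All sizes are illustrative assumptions, not taken from the paper.
rng = np.random.default_rng(0)
STATE_DIM, LATENT_DIM, ACTION_DIM, HORIZON = 16, 32, 7, 8

# Fixed random weights stand in for trained modules.
W_enc = rng.standard_normal((LATENT_DIM, STATE_DIM)) / np.sqrt(STATE_DIM)
W_core = rng.standard_normal((ACTION_DIM, LATENT_DIM)) / np.sqrt(LATENT_DIM)
W_dec = rng.standard_normal((ACTION_DIM, ACTION_DIM)) / np.sqrt(ACTION_DIM)

def encode_state(state):
    # Stand-in for the P-GATr state encoder: lifts raw observations
    # into a latent conditioning vector.
    return np.tanh(W_enc @ state)

def denoise_step(noisy_actions, latent, alpha=0.9):
    # Stand-in for the U-Net / Transformer denoising core: predicts the
    # noise in the action sequence, conditioned on the latent, and takes
    # one (toy) reverse-diffusion step.
    predicted_noise = noisy_actions - np.tanh(W_core @ latent)  # broadcast over horizon
    return (noisy_actions - (1 - alpha) * predicted_noise) / np.sqrt(alpha)

def decode_actions(denoised):
    # Stand-in for the P-GATr action decoder: maps denoised latents
    # back to a sequence of robot actions.
    return denoised @ W_dec.T

state = rng.standard_normal(STATE_DIM)
noisy = rng.standard_normal((HORIZON, ACTION_DIM))
actions = decode_actions(denoise_step(noisy, encode_state(state)))
print(actions.shape)  # (8, 7): an action sequence over the horizon
```

The point of the hybrid split is visible in the skeleton: the geometry-aware modules sit only at the input and output boundaries, while the core denoiser can remain a standard, well-optimized architecture.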