Diffusion models have recently gained popularity for policy learning in robotics due to their ability to capture high-dimensional and multimodal distributions. However, diffusion policies are inherently stochastic and typically trained offline, limiting their ability to handle unseen and dynamic conditions where novel constraints not represented in the training data must be satisfied. To overcome this limitation, we propose diffusion predictive control with constraints (DPCC), an algorithm for diffusion-based control with explicit state and action constraints that can deviate from those in the training data. DPCC uses constraint tightening and incorporates model-based projections into the denoising process of a trained trajectory diffusion model. This allows us to generate constraint-satisfying, dynamically feasible, and goal-reaching trajectories for predictive control. We show through simulations of a robot manipulator that DPCC outperforms existing methods in satisfying novel test-time constraints while maintaining performance on the learned control task.
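The core idea of incorporating projections into the denoising process can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not DPCC itself: `denoise_step` stands in for one reverse-diffusion update of a trained trajectory model, and the projection here only enforces simple box bounds on states/actions, whereas the paper's model-based projection also enforces dynamic feasibility and uses constraint tightening.

```python
import numpy as np

def project_to_box(traj, lo, hi):
    # Euclidean projection of each trajectory point onto box constraints
    # (stand-in for the paper's model-based projection).
    return np.clip(traj, lo, hi)

def constrained_denoise(traj, denoise_step, num_steps, lo, hi):
    # Hypothetical interface: run the reverse diffusion chain, projecting
    # the partially denoised trajectory back into the constraint set
    # after every denoising step.
    for k in reversed(range(num_steps)):
        traj = denoise_step(traj, k)         # one reverse-diffusion update
        traj = project_to_box(traj, lo, hi)  # enforce state/action bounds
    return traj

# Toy usage with a dummy denoiser that contracts toward zero:
dummy_step = lambda t, k: 0.5 * t
out = constrained_denoise(np.array([4.0, -8.0]), dummy_step, 2, -1.0, 1.0)
```

Because the projection is applied at every step rather than only on the final sample, the denoiser keeps refining trajectories that already lie inside the (tightened) constraint set, which is what allows the final trajectory to satisfy constraints absent from the training data.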