This paper introduces Hierarchical Diffusion Policy (HDP), a hierarchical agent for multi-task robotic manipulation. HDP factorises a manipulation policy into two levels: a high-level task-planning agent, which predicts a distant next-best end-effector pose (NBP), and a low-level goal-conditioned diffusion policy, which generates optimal motion trajectories. This factorised policy representation allows HDP to tackle long-horizon task planning and to generate fine-grained low-level actions. To generate context-aware motion trajectories that satisfy robot kinematics constraints, we present a novel kinematics-aware goal-conditioned control agent, Robot Kinematics Diffuser (RK-Diffuser). Specifically, RK-Diffuser learns to generate both end-effector pose and joint position trajectories, and distills the accurate but kinematics-unaware end-effector pose diffuser into the kinematics-aware but less accurate joint position diffuser via differentiable kinematics. Empirically, we show that HDP achieves a significantly higher success rate than state-of-the-art methods in both simulation and the real world.
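The role of differentiable kinematics in the abstract can be illustrated with a toy example. The sketch below is not the RK-Diffuser implementation: it assumes a hypothetical planar two-link arm with unit link lengths, replaces both learned diffusers with fixed numbers, and uses plain gradient descent. It shows only the core idea, namely that because forward kinematics is differentiable, an error measured in end-effector space can be propagated back to correct a less accurate joint-position prediction. All function names and parameters are illustrative assumptions.

```python
import math

# Hypothetical link lengths of a toy planar 2-link arm (an assumption, not from the paper).
LINK_1, LINK_2 = 1.0, 1.0


def forward_kinematics(q):
    """Map joint angles (q1, q2) to the end-effector position (x, y)."""
    q1, q2 = q
    x = LINK_1 * math.cos(q1) + LINK_2 * math.cos(q1 + q2)
    y = LINK_1 * math.sin(q1) + LINK_2 * math.sin(q1 + q2)
    return (x, y)


def jacobian(q):
    """Analytic derivative of forward_kinematics with respect to (q1, q2)."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return (
        (-LINK_1 * s1 - LINK_2 * s12, -LINK_2 * s12),  # dx/dq1, dx/dq2
        (LINK_1 * c1 + LINK_2 * c12, LINK_2 * c12),    # dy/dq1, dy/dq2
    )


def pose_error(q, target):
    """Euclidean distance between FK(q) and the target end-effector position."""
    x, y = forward_kinematics(q)
    return math.hypot(x - target[0], y - target[1])


def refine_joints(q, target, steps=1000, lr=0.1):
    """Gradient descent on 0.5 * ||FK(q) - target||^2 through the kinematics."""
    q1, q2 = q
    for _ in range(steps):
        x, y = forward_kinematics((q1, q2))
        ex, ey = x - target[0], y - target[1]
        jac = jacobian((q1, q2))
        # Chain rule: gradient = J^T * (FK(q) - target).
        g1 = jac[0][0] * ex + jac[1][0] * ey
        g2 = jac[0][1] * ex + jac[1][1] * ey
        q1 -= lr * g1
        q2 -= lr * g2
    return (q1, q2)


def refine_trajectory(joint_traj, pose_traj):
    """Refine each joint waypoint against its matching end-effector waypoint."""
    return [refine_joints(q, p) for q, p in zip(joint_traj, pose_traj)]


# Stand-ins for the two predictions: accurate poses and slightly-off joint angles.
true_joints = [(0.6, 1.0), (0.7, 0.9)]
pose_traj = [forward_kinematics(q) for q in true_joints]
noisy_joints = [(0.5, 0.9), (0.6, 0.8)]
refined = refine_trajectory(noisy_joints, pose_traj)
```

In this toy setting the refined joint angles reproduce the target end-effector positions much more closely than the initial guesses, while remaining valid joint configurations by construction. The method described in the abstract applies the analogous idea at training time to learned diffusion models, which this sketch does not attempt to reproduce.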