Diffusion policies (DPs) achieve state-of-the-art performance on complex manipulation tasks by learning from large-scale demonstration datasets, often spanning multiple embodiments and environments. However, they cannot guarantee safe behavior and therefore require external safety mechanisms. Such mechanisms alter actions in ways unseen during training, causing unpredictable behavior and performance degradation. To address these problems, we propose path-consistent safety filtering (PACS) for DPs. Our approach performs path-consistent braking on a trajectory computed from the sequence of generated actions. In this way, we keep the execution consistent with the training distribution of the policy, maintaining the learned, task-completing behavior. To enable real-time deployment and handle uncertainties, we verify safety using set-based reachability analysis. Our experimental evaluation in simulation and on three challenging real-world human-robot interaction tasks shows that PACS (a) provides formal safety guarantees in dynamic environments, (b) preserves task success rates, and (c) outperforms reactive safety approaches, such as control barrier functions, by up to 68% in terms of task success. Videos are available at our project website: https://tum-lsy.github.io/pacs.
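As a rough illustration of what braking "along the path" can mean, the sketch below slows a robot down by reducing only the speed of a path parameter, so every commanded position still lies on the originally planned path. This is not the PACS implementation: the piecewise-linear path, the names `interpolate_path` and `brake_along_path`, and the constant-deceleration profile are all assumptions made for the example, and the set-based reachability check is omitted entirely.

```python
import numpy as np


def interpolate_path(waypoints, s):
    """Evaluate a piecewise-linear path at parameter s (in waypoint-index units)."""
    s = float(np.clip(s, 0.0, len(waypoints) - 1))
    i = min(int(np.floor(s)), len(waypoints) - 2)  # index of the active segment
    frac = s - i
    return (1.0 - frac) * waypoints[i] + frac * waypoints[i + 1]


def brake_along_path(waypoints, s0, s_dot0, s_ddot_max, dt):
    """Decelerate the path speed to zero and return the visited positions.

    Only the timing along the path changes; the geometry is untouched.
    Hypothetical parameters: s0 (start parameter), s_dot0 (initial path speed),
    s_ddot_max (maximum path deceleration), dt (time step).
    """
    positions = []
    s, s_dot = s0, s_dot0
    while s_dot > 0.0:
        s_dot = max(0.0, s_dot - s_ddot_max * dt)      # reduce path speed
        s = min(s + s_dot * dt, len(waypoints) - 1)    # advance, never past the end
        positions.append(interpolate_path(waypoints, s))
    return np.array(positions)


# An L-shaped path; in practice the waypoints would come from the generated actions.
waypoints = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
stop_positions = brake_along_path(waypoints, s0=0.8, s_dot0=2.0, s_ddot_max=4.0, dt=0.1)
print(stop_positions)
```

The contrast with a reactive filter is that a reactive filter may push the commanded position off this path, whereas here the robot stops around the corner of the L without ever leaving it.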