Because the deformation of a rope during motion is inherently uncertain, previous rope manipulation methods often require hundreds of real-world demonstrations to train a manipulation policy for each rope, even for simple tasks such as rope goal reaching, which hinders their application in our ever-changing world. To address this issue, we introduce GenORM, a framework that allows a manipulation policy to handle different deformable ropes with a single real-world demonstration. To achieve this, we condition the policy on deformable rope parameters and train it on a diverse range of simulated deformable ropes, so that the policy can adjust its actions to different rope parameters. At inference time, given a new rope, GenORM estimates the deformable rope parameters by minimizing the disparity between the point-cloud grid density of the real-world demonstration and that of the simulation. With the help of a differentiable physics simulator, this estimation requires only a single real-world demonstration. Empirical validation on both simulated and real-world rope manipulation setups shows that our method can manipulate different ropes with a single demonstration and significantly outperforms the baseline in both environments (62% improvement on in-domain ropes and 15% improvement on out-of-distribution ropes in simulation, and 26% improvement in the real world), demonstrating the effectiveness of our approach in one-shot rope manipulation.
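The parameter-estimation idea can be illustrated with a minimal sketch. It is not the GenORM implementation: `toy_rope` is a hypothetical stand-in for a physics simulator, the single `stiffness` parameter and the cell size are assumed for illustration, and a plain search over candidate values replaces the gradient-based optimization that a differentiable simulator would allow. The sketch only shows what it means to compare two point clouds by their grid densities and to pick the rope parameter that minimizes the disparity.

```python
import math
from collections import Counter


def grid_density(points, cell_size):
    """Map each occupied grid cell to the fraction of points that fall in it."""
    counts = Counter(
        tuple(math.floor(coord / cell_size) for coord in point) for point in points
    )
    total = len(points)
    return {cell: count / total for cell, count in counts.items()}


def density_disparity(density_a, density_b):
    """Sum of squared density differences over cells occupied in either cloud."""
    cells = set(density_a) | set(density_b)
    return sum(
        (density_a.get(cell, 0.0) - density_b.get(cell, 0.0)) ** 2 for cell in cells
    )


def toy_rope(stiffness, n_points=50, length=1.0):
    """Hypothetical simulator stand-in: a rope whose sag shrinks as stiffness grows."""
    sag = 0.5 / (1.0 + stiffness)
    points = []
    for i in range(n_points):
        s = i / (n_points - 1)  # normalized position along the rope
        points.append((s * length, 0.0, -sag * math.sin(math.pi * s)))
    return points


def estimate_stiffness(observed_points, candidates, cell_size=0.05):
    """Pick the candidate whose simulated cloud best matches the observed one.

    Assumption: exhaustive search over candidates, used here in place of
    gradient descent through a differentiable simulator.
    """
    target = grid_density(observed_points, cell_size)
    return min(
        candidates,
        key=lambda k: density_disparity(target, grid_density(toy_rope(k), cell_size)),
    )


if __name__ == "__main__":
    # Pretend this cloud came from the single real-world demonstration.
    observed = toy_rope(2.0)
    print(estimate_stiffness(observed, [0.5, 1.0, 2.0, 4.0, 8.0]))
```

The hard cell assignment used here is not differentiable with respect to the point positions, so a gradient-based version would need a smooth density (for example, a soft assignment of points to cells); that choice is left open in this sketch.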