Understanding and manipulating deformable objects (e.g., ropes and fabrics) is an essential yet challenging task with broad applications. The difficulty stems from the complex states and dynamics, diverse configurations, and high-dimensional action spaces of deformable objects. Moreover, manipulation tasks usually require multiple steps to accomplish, and greedy policies can easily lead to locally optimal states. Existing studies usually tackle this problem with reinforcement learning or by imitating expert demonstrations, which are limited in modeling complex states or require hand-crafted expert policies. In this paper, we study deformable object manipulation using dense visual affordance, which generalizes to diverse states, and propose a novel foresightful dense affordance that avoids local optima by estimating the value of states for long-term manipulation. We propose a framework for learning this representation, with novel designs such as multi-stage stable learning and efficient self-supervised data collection without experts. Experiments demonstrate the superiority of our proposed foresightful dense affordance. Project page: https://hyperplane-lab.github.io/DeformableAffordance