Extrinsic dexterity leverages environmental contact to overcome the limitations of prehensile manipulation. Achieving such dexterity in cluttered scenes remains challenging and underexplored, however, because it requires selectively exploiting contact among multiple interacting objects whose dynamics are inherently coupled. Existing approaches do not model these dynamics explicitly and therefore fall short in non-prehensile manipulation in clutter, which limits their practical applicability in the real world. In this paper, we introduce Dynamics-Aware Policy Learning (DAPL), a framework that facilitates policy learning with a learned representation of contact-induced object dynamics in cluttered environments. This representation is learned through explicit world modeling and used to condition reinforcement learning, enabling extrinsic dexterity to emerge without hand-crafted contact heuristics or complex reward shaping. We evaluate our approach in both simulation and the real world. Our method outperforms prehensile manipulation, human teleoperation, and prior representation-based policies by over 25% in success rate on unseen simulated cluttered scenes of varying density. In the real world, the success rate reaches around 50% across 10 cluttered scenes, and a practical grocery deployment further demonstrates robust sim-to-real transfer and applicability.