SkillWrapper: Generative Predicate Invention for Task-level Planning

Ziyi Yang, Benned Hedegaard, Ahmed Jaafar, Yichen Wei, Skye Thompson, Shreyas S. Raman, Haotian Fu, Stefanie Tellex, George Konidaris, David Paulius, Naman Shah

Generalizing from individual skill executions to solving long-horizon tasks remains a core challenge in building autonomous agents. A promising direction is to learn high-level, symbolic abstractions of an agent's low-level skills, enabling reasoning and planning independent of the low-level state space. Among possible high-level representations, object-centric skill abstraction with symbolic predicates has proven efficient because of its compatibility with domain-independent planners. Recent advances in foundation models have made it possible to generate symbolic predicates that operate on raw sensory inputs, a process we call generative predicate invention, to facilitate downstream abstraction learning. However, it remains unclear which formal properties the learned representations must satisfy and how they can be learned so that these properties are guaranteed. In this paper, we address both questions by presenting a formal theory of generative predicate invention for skill abstraction, resulting in symbolic operators that can be used for provably sound and complete planning. Within this framework, we propose SkillWrapper, a method that leverages foundation models to actively collect robot data and learn human-interpretable, plannable representations of black-box skills, using only RGB image observations. Our extensive empirical evaluation in simulation and on real robots shows that SkillWrapper learns abstract representations that enable solving unseen, long-horizon tasks in the real world with black-box skills.
