Bilevel planning, in which a high-level search over an abstraction of an environment guides low-level decision-making, is an effective approach to solving long-horizon tasks in continuous state and action spaces. Recent work has shown how to enable such bilevel planning by learning action and transition model abstractions in the form of symbolic operators and neural samplers. In this work, we show that existing symbolic operator learning approaches fall short in many natural environments where agent actions tend to cause a large number of irrelevant propositions to change. This is primarily because these approaches attempt to learn operators that minimize prediction error with respect to all observed changes in the propositions. To overcome this issue, we propose to learn operators that model only the changes necessary for abstract planning to achieve the specified goal. Experimentally, we show that our approach learns operators that lead to efficient planning across 10 different hybrid robotics domains, including 4 from the challenging BEHAVIOR-100 benchmark, with generalization to novel initial states, goals, and objects.
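To make the contrast concrete, the following toy sketch compares the effects an operator would need in order to predict every observed proposition change with the effects that remain after a simple goal regression over a single demonstration. It is an illustration only and is not the learning algorithm proposed in this work: the atoms, the per-step preconditions, and the function names `observed_effects` and `goal_relevant_adds` are all hypothetical, and the preconditions are assumed to be given rather than learned.

```python
from typing import FrozenSet, List, Tuple

Atom = str
State = FrozenSet[Atom]


def observed_effects(before: State, after: State) -> Tuple[State, State]:
    """Return (add, delete) effects that account for every observed change."""
    return after - before, before - after


def goal_relevant_adds(
    trajectory: List[State], preconditions: List[State], goal: State
) -> List[State]:
    """Walk one demonstration backward and keep only the add effects that are
    still needed, either by the goal or by a later step's preconditions.

    Assumption: preconditions[i] is supplied for the step from trajectory[i]
    to trajectory[i + 1]; a real learner would have to induce these.
    """
    needed = set(goal)
    kept: List[State] = []
    steps = list(zip(trajectory, trajectory[1:], preconditions))
    for before, after, pre in reversed(steps):
        adds, _ = observed_effects(before, after)
        relevant = frozenset(adds & needed)
        kept.append(relevant)
        if relevant:
            # This step matters, so its preconditions become subgoals.
            needed -= relevant
            needed |= pre
    kept.reverse()
    return kept


# Hypothetical abstract states: many atoms change that the goal never uses.
trajectory = [
    frozenset({"HandEmpty", "OnTable(cup)", "Dusty(shelf)"}),
    frozenset({"Holding(cup)", "Dusty(shelf)", "Near(robot, sink)", "Visible(plate)"}),
    frozenset({"HandEmpty", "InSink(cup)", "Dusty(shelf)", "Near(robot, sink)", "Wet(cup)"}),
]
preconditions = [frozenset({"HandEmpty"}), frozenset({"Holding(cup)"})]
goal = frozenset({"InSink(cup)"})

all_adds = [observed_effects(b, a)[0] for b, a in zip(trajectory, trajectory[1:])]
kept = goal_relevant_adds(trajectory, preconditions, goal)

for step, (full, small) in enumerate(zip(all_adds, kept)):
    print(f"step {step}: {len(full)} observed adds, {len(small)} kept: {sorted(small)}")
```

In this toy trajectory each step adds three atoms, but only one per step survives the regression; atoms such as `Visible(plate)` and `Wet(cup)` change without being needed by the goal or by any later precondition.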