Deploying robots in real-world environments, such as households and manufacturing lines, requires generalization across novel task specifications without violating safety constraints. Linear temporal logic (LTL) is a widely used task specification language with a compositional grammar that naturally induces commonalities among tasks while preserving safety guarantees. However, most prior work on reinforcement learning with LTL specifications treats every new task independently, thus requiring large amounts of training data to generalize. We propose LTL-Transfer, a zero-shot transfer algorithm that composes task-agnostic skills learned during training to safely satisfy a wide variety of novel LTL task specifications. Experiments in Minecraft-inspired domains show that after training on only 50 tasks, LTL-Transfer can solve over 90% of 100 challenging unseen tasks and 100% of 300 commonly used novel tasks without violating any safety constraints. We deployed LTL-Transfer at the task-planning level of a quadruped mobile manipulator to demonstrate its zero-shot transfer ability for fetch-and-deliver and navigation tasks.