We present a framework for learning useful subgoals that support efficient long-term planning to achieve novel goals. At the core of our framework is a collection of rational subgoals (RSGs), which are essentially binary classifiers over environmental states. RSGs can be learned from weakly annotated data: unsegmented demonstration trajectories paired with abstract task descriptions, which are composed of terms initially unknown to the agent (e.g., collect-wood then craft-boat then go-across-river). Our framework also discovers dependencies between RSGs, e.g., that collect-wood is a helpful subgoal for craft-boat. Given a goal description, the learned subgoals and the derived dependencies facilitate off-the-shelf planning algorithms, such as A* and RRT, by supplying helpful subgoals as waypoints to the planner, which significantly improves efficiency at performance time.
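The waypoint idea can be illustrated with a minimal sketch, under several loudly stated assumptions. The grid world, the state layout `(x, y, has_wood, has_boat)`, and the names `successors`, `search`, and `plan_with_waypoints` are all hypothetical. The subgoal classifiers are hand-written predicates standing in for learned RSGs, and the planner is uniform-cost search (A* with a zero heuristic) rather than any planner described in this work. The sketch only shows how an ordered list of binary state classifiers can split one long search into shorter ones.

```python
import heapq
from itertools import count

# Hypothetical toy world: wood at one cell, a workbench at another,
# and a river column that can only be entered with a boat.
WIDTH, HEIGHT = 7, 5
WOOD, BENCH, RIVER_X = (0, 4), (3, 0), 5

def successors(state):
    """Yield states reachable in one unit-cost move."""
    x, y, has_wood, has_boat = state
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if not (0 <= nx < WIDTH and 0 <= ny < HEIGHT):
            continue
        if nx == RIVER_X and not has_boat:
            continue  # the river is impassable without a boat
        wood = has_wood or (nx, ny) == WOOD
        boat = has_boat or (wood and (nx, ny) == BENCH)
        yield (nx, ny, wood, boat)

# Hand-written stand-ins for learned RSGs: binary classifiers over states.
SUBGOALS = {
    "collect-wood": lambda s: s[2],
    "craft-boat": lambda s: s[3],
    "go-across-river": lambda s: s[0] > RIVER_X,
}

def search(start, is_goal, heuristic=lambda s: 0):
    """A* search; with the default zero heuristic it is uniform-cost search.

    Returns (path, number_of_expanded_states); path is None if no plan exists.
    """
    tie = count()  # tie-breaker so the heap never compares states or paths
    frontier = [(heuristic(start), next(tie), 0, start, [start])]
    best = {start: 0}
    expanded = 0
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if g > best.get(state, float("inf")):
            continue  # stale queue entry
        if is_goal(state):
            return path, expanded
        expanded += 1
        for nxt in successors(state):
            ng = g + 1
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                heapq.heappush(
                    frontier, (ng + heuristic(nxt), next(tie), ng, nxt, path + [nxt])
                )
    return None, expanded

def plan_with_waypoints(start, subgoal_names):
    """Chain one search per subgoal, each starting where the last one ended."""
    path, total_expanded = [start], 0
    for name in subgoal_names:
        segment, expanded = search(path[-1], SUBGOALS[name])
        total_expanded += expanded
        if segment is None:
            return None, total_expanded
        path += segment[1:]
    return path, total_expanded

start = (0, 0, False, False)
direct_path, direct_expanded = search(start, SUBGOALS["go-across-river"])
chained_path, chained_expanded = plan_with_waypoints(
    start, ["collect-wood", "craft-boat", "go-across-river"]
)
print("direct: ", len(direct_path) - 1, "steps,", direct_expanded, "expansions")
print("chained:", len(chained_path) - 1, "steps,", chained_expanded, "expansions")
```

In this toy setting both planners reach the far bank, but the chained version searches three short segments instead of one long one. Chaining greedy segments is not guaranteed to preserve optimality in general; it happens to here because each subgoal can only be satisfied at a single cell.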