A robot deployed in a home over long stretches of time faces a true lifelong learning problem. As it seeks to provide assistance to its users, the robot should leverage any accumulated experience to improve its own knowledge and proficiency. We formalize this setting with a novel formulation of lifelong learning for task and motion planning (TAMP), which endows our learner with the compositionality of TAMP systems. Exploiting the modularity of TAMP, we develop a mixture of generative models that produces candidate continuous parameters for a planner. Whereas most existing lifelong learning approaches determine a priori how data is shared across various models, our approach learns shared and non-shared models and determines which to use online during planning based on auxiliary tasks that serve as a proxy for each model's understanding of a state. Our method exhibits substantial improvements (over time and compared to baselines) in planning success on 2D and BEHAVIOR domains.
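To make the online selection step concrete, the following is a minimal sketch and not the method's actual implementation. It assumes each model in the mixture exposes a sampler for one continuous parameter and a scalar auxiliary-task error on the current state, and that the planner simply queries the model with the lowest error. The names `CandidateModel`, `select_model`, and `propose_parameters`, the Gaussian samplers, and the error functions are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]


@dataclass
class CandidateModel:
    """One member of the mixture (hypothetical interface)."""
    name: str
    # Proposes a single continuous parameter for the planner.
    sample: Callable[[State, random.Random], float]
    # Auxiliary-task error on a state; lower is read as better "understanding".
    auxiliary_error: Callable[[State], float]


def select_model(models: List[CandidateModel], state: State) -> CandidateModel:
    # Assumption: pick the model whose auxiliary error is smallest on this state.
    return min(models, key=lambda m: m.auxiliary_error(state))


def propose_parameters(
    models: List[CandidateModel], state: State, n: int, rng: random.Random
) -> Tuple[str, List[float]]:
    # Select once per state, then draw n candidate parameters from that model.
    model = select_model(models, state)
    return model.name, [model.sample(state, rng) for _ in range(n)]


# Toy stand-ins: a shared model with a constant auxiliary error, and a
# specialized model that is only reliable near states with state[0] close to 2.0.
shared = CandidateModel(
    name="shared",
    sample=lambda s, rng: rng.gauss(0.0, 1.0),
    auxiliary_error=lambda s: 0.5,
)
specialized = CandidateModel(
    name="specialized",
    sample=lambda s, rng: rng.gauss(s[0], 0.1),
    auxiliary_error=lambda s: abs(s[0] - 2.0),
)
models = [shared, specialized]

if __name__ == "__main__":
    for state in ([2.1], [5.0]):
        chosen, params = propose_parameters(models, state, 3, random.Random(0))
        print(state, chosen, params)
```

In this toy setup the specialized model is chosen for states near its region of competence and the shared model is used elsewhere. A learned system would replace the hand-written error functions with errors measured on the auxiliary tasks themselves.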