A robot deployed in a home over long stretches of time faces a true lifelong learning problem. As it seeks to provide assistance to its users, the robot should leverage any accumulated experience to improve its own knowledge to become a more proficient assistant. We formalize this setting with a novel lifelong learning problem formulation in the context of learning for task and motion planning (TAMP). Exploiting the modularity of TAMP systems, we develop a generative mixture model that produces candidate continuous parameters for a planner. Whereas most existing lifelong learning approaches determine a priori how data is shared across task models, our approach learns shared and non-shared models and determines which to use online during planning based on auxiliary tasks that serve as a proxy for each model's understanding of a state. Our method exhibits substantial improvements in planning success on simulated 2D domains and on several problems from the BEHAVIOR benchmark.
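To make the online selection idea concrete, here is a minimal sketch, not the paper's implementation. It assumes each candidate sampler (a shared model or a task-specific one) exposes an auxiliary-task loss on the current state, and that lower loss signals better understanding of that state. The `Sampler` class, the softmax weighting rule, the `temperature` parameter, and the toy loss functions are all hypothetical choices made for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]


@dataclass
class Sampler:
    """Hypothetical wrapper around one learned model (shared or task-specific)."""
    name: str
    # Proposes candidate continuous parameters for the planner.
    sample: Callable[[State, random.Random], List[float]]
    # Auxiliary-task loss on the state; assumed to be a proxy for how well
    # this model understands the state (lower is better).
    aux_loss: Callable[[State], float]


def mixture_weights(samplers: List[Sampler], state: State,
                    temperature: float = 1.0) -> List[float]:
    """Softmax over negative auxiliary losses (an assumed weighting rule)."""
    scores = [-s.aux_loss(state) / temperature for s in samplers]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


def propose(samplers: List[Sampler], state: State, rng: random.Random,
            temperature: float = 1.0) -> Tuple[str, List[float]]:
    """Pick one sampler according to the mixture weights, then sample from it."""
    weights = mixture_weights(samplers, state, temperature)
    chosen = rng.choices(samplers, weights=weights, k=1)[0]
    return chosen.name, chosen.sample(state, rng)


# Toy models: the losses below are made-up stand-ins for learned auxiliary tasks.
shared = Sampler(
    name="shared",
    sample=lambda s, r: [r.gauss(0.0, 1.0)],
    aux_loss=lambda s: sum(x * x for x in s),
)
specific = Sampler(
    name="specific",
    sample=lambda s, r: [r.gauss(s[0], 0.1)],
    aux_loss=lambda s: sum((x - 1.0) ** 2 for x in s),
)

state = [1.0, 1.0]
weights = mixture_weights([shared, specific], state)
name, params = propose([shared, specific], state, random.Random(0))
```

In this toy setup the task-specific model has zero auxiliary loss on `state`, so it receives the larger mixture weight and is selected more often. A planner would call `propose` repeatedly to obtain candidate parameters until one yields a feasible plan.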