To efficiently learn a dynamics model for a task in a new environment, one can adapt a model learned in a similar source environment. However, existing adaptation methods can fail when the target dataset contains transitions whose dynamics differ substantially from those of the source environment. For example, the source dynamics could describe a rope manipulated in free space, whereas the target dynamics could involve collisions with, and deformation against, obstacles. Our key insight is to improve data efficiency by focusing model adaptation only on the regions where the source and target dynamics are similar. In the rope example, adapting the free-space dynamics requires significantly less data than adapting the free-space dynamics while also learning collision dynamics. We propose a new adaptation method that is effective at adapting to regions of similar dynamics. Additionally, we combine this adaptation method with prior work on planning with unreliable dynamics to obtain a method for data-efficient online adaptation, called FOCUS. We first demonstrate that the proposed adaptation method achieves statistically significantly lower prediction error in regions of similar dynamics on simulated rope manipulation and plant watering tasks. We then show on a bimanual rope manipulation task that FOCUS achieves data-efficient online learning, both in simulation and in the real world.
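The key insight can be illustrated with a toy one-dimensional example. The sketch below is not the FOCUS implementation. It assumes a hypothetical scalar model `s' = s + gain * a`, a hypothetical wall at `WALL`, and a hard error threshold for deciding which target transitions count as "similar" to the source dynamics. All of these names and values are assumptions made for illustration, and the abstract does not say how the proposed method selects or weights transitions.

```python
# Toy 1-D illustration (hypothetical; not the FOCUS implementation).
# State s is a position, action a is a commanded displacement.

WALL = 0.9  # assumed obstacle position in the target environment


def source_model(s, a, gain=1.0):
    """Free-space model learned in the source environment: s' = s + gain * a."""
    return s + gain * a


def target_step(s, a):
    """Assumed true target dynamics: weaker gain in free space, clamped at the wall."""
    return min(s + 0.8 * a, WALL)


def fit_gain(transitions):
    """Closed-form least-squares gain for the model s' = s + gain * a."""
    num = sum(a * (s_next - s) for s, a, s_next in transitions)
    den = sum(a * a for _, a, _ in transitions)
    return num / den


def focused_subset(transitions, threshold):
    """Keep transitions the source model already predicts to within `threshold`."""
    return [
        (s, a, s_next)
        for s, a, s_next in transitions
        if abs(source_model(s, a) - s_next) < threshold
    ]


# Small deterministic target dataset on a grid of states and actions.
states = [0.0, 0.2, 0.4, 0.6, 0.8]
actions = [-0.4, -0.2, 0.2, 0.4]
data = [(s, a, target_step(s, a)) for s in states for a in actions]

naive_gain = fit_gain(data)                          # fit on every transition
focused_gain = fit_gain(focused_subset(data, 0.09))  # fit on similar-dynamics transitions only
```

In this toy setup, three of the twenty transitions hit the wall. Fitting on all transitions pulls the estimated gain down to roughly 0.75, because the single-parameter model cannot represent the collision. Fitting only on the transitions that the source model already predicts well recovers the free-space gain of 0.8. A hard threshold is only one possible way to focus adaptation, chosen here for brevity.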