The problem of generalization in learning from demonstration (LfD) has received considerable attention over the years, particularly within the context of movement primitives, where a number of approaches have emerged. Recently, two important approaches have gained recognition: one leverages via-points to adapt skills locally by modulating demonstrated trajectories, while the other relies on so-called task-parameterized models that encode movements with respect to different coordinate systems and use a product of probabilities for generalization. While the former is well-suited to precise, local modulations, the latter aims at generalizing over large regions of the workspace and often involves multiple objects. Addressing the quality of generalization by leveraging both approaches simultaneously has received little attention. In this work, we propose an interactive imitation learning framework that simultaneously leverages local and global modulations of trajectory distributions. Building on the kernelized movement primitives (KMP) framework, we introduce novel mechanisms for skill modulation from direct human corrective feedback. Our approach particularly exploits the concept of via-points to incrementally and interactively 1) improve the model accuracy locally, 2) add new objects to the task during execution, and 3) extend the skill into regions where demonstrations were not provided. We evaluate our method on a bearing ring-loading task using a torque-controlled 7-DoF DLR SARA robot.
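The "product of probabilities" underlying task-parameterized generalization fuses the Gaussian predictions obtained in each object-attached frame into a single distribution via precision-weighted averaging. The following is a minimal sketch of that Gaussian product, not the paper's implementation; the function name and variables are illustrative assumptions.

```python
import numpy as np

def gaussian_product(mus, sigmas):
    """Fuse K Gaussians N(mu_k, Sigma_k) into one Gaussian.

    The product of Gaussian densities is (up to normalization) another
    Gaussian with precision equal to the sum of the individual precisions
    and mean equal to the precision-weighted average of the means.
    Illustrative sketch; `mus`/`sigmas` are hypothetical names.
    """
    # Sum of precisions (inverse covariances) of all frames
    precision = sum(np.linalg.inv(S) for S in sigmas)
    sigma = np.linalg.inv(precision)
    # Precision-weighted combination of the frame-wise means
    mu = sigma @ sum(np.linalg.inv(S) @ m for m, S in zip(mus, sigmas))
    return mu, sigma

# Example: two unit-variance 1-D Gaussians centered at 0 and 2
# fuse to a Gaussian centered at 1 with variance 0.5.
mu, sigma = gaussian_product(
    [np.array([0.0]), np.array([2.0])],
    [np.eye(1), np.eye(1)],
)
```

A frame whose prediction has low covariance (high confidence) dominates the fused mean in its region of the workspace, which is what lets task-parameterized models generalize across object configurations.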