Conditional Neural Expert Processes for Learning from Demonstration

From arXiv. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. Submitted to Robotics and Automation Letters on February 13, 2024.

Learning from Demonstration (LfD) is a widely used technique for skill acquisition in robotics. However, demonstrations of the same skill may exhibit significant variance, or a learning system may attempt to acquire several distinct ways of performing the same skill simultaneously, making it challenging to encode these motions into movement primitives. To address these challenges, we propose an LfD framework, Conditional Neural Expert Processes (CNEP), that learns to assign demonstrations from different modes to distinct expert networks, using the information in the latent space to match experts with the encoded representations. CNEP does not require supervision on which mode a trajectory belongs to. Experiments on artificially generated datasets demonstrate the efficacy of CNEP. We also compare CNEP with another LfD framework, Conditional Neural Movement Primitives (CNMP), on a range of tasks, including experiments on a real robot. The results show improved modeling of movement primitives: the synthesized trajectories more accurately reflect those demonstrated by experts, particularly when the model inputs include points where different trajectories intersect. In addition, CNEP offers improved interpretability and faster convergence by promoting expert specialization. Finally, we show that, given novel start and destination points, the CNEP model accomplishes obstacle avoidance tasks with a real manipulator, whereas the CNMP model produces trajectories that collide with the obstacle.
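
The routing idea in the abstract (encode the observed points, then let the latent representation pick one expert network) can be illustrated with a small untrained forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the layer sizes, the mean aggregation of encoded points, the softmax gate with hard arg-max selection, and every name (`init_mlp`, `mlp`, `predict`, `LATENT`, `N_EXPERTS`) are hypothetical, and the abstract does not specify the training losses, which are omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a small MLP; sizes = [in, hidden..., out]."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass with ReLU on every layer except the last."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Assumed sizes, chosen only for illustration.
LATENT, N_EXPERTS = 8, 2
encoder = init_mlp([2, 16, LATENT])       # input: one (t, y) observation
gate = init_mlp([LATENT, N_EXPERTS])      # latent -> one logit per expert
experts = [init_mlp([LATENT + 1, 16, 1]) for _ in range(N_EXPERTS)]

def predict(observations, t_query):
    """observations: (n, 2) array of (t, y); t_query: (m,) array of times."""
    # Encode each observed point, then average so the order does not matter.
    latent = mlp(encoder, observations).mean(axis=0)
    # The gate turns the latent vector into a distribution over experts.
    probs = softmax(mlp(gate, latent))
    k = int(np.argmax(probs))
    # Only the selected expert decodes the trajectory at the query times.
    inputs = np.column_stack([np.tile(latent, (len(t_query), 1)), t_query])
    return mlp(experts[k], inputs)[:, 0], k, probs

obs = np.array([[0.0, 0.1], [1.0, 0.9]])  # e.g. start and end conditioning points
trajectory, expert_id, gate_probs = predict(obs, np.linspace(0.0, 1.0, 5))
```

With random weights the output trajectory is meaningless; the point is the data flow. In a trained model of this shape, demonstrations from different modes would ideally yield latent vectors that the gate sends to different experts, without any mode labels being supplied.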
