This paper presents a novel Learning from Demonstration (LfD) method that uses neural fields to learn new skills efficiently and accurately. It does so by using a shared embedding to learn both scene and motion representations in a generative way. Our method smoothly maps each expert demonstration to a scene-motion embedding and learns to model them without requiring hand-crafted task parameters or large datasets. It achieves data efficiency by enforcing that scene and motion generation are smooth with respect to changes in the embedding space. At inference time, our method retrieves scene-motion embeddings using test-time optimization and generates precise motion trajectories for novel scenes. The proposed method is versatile: it can employ images, 3D shapes, and any other scene representation that can be modeled using neural fields. Additionally, it can generate both end-effector position trajectories and joint-angle trajectories. We evaluate our method on tasks that require accurate motion trajectory generation, where the underlying task parametrization is based on object positions and geometric scene changes. Experimental results demonstrate that the proposed method outperforms the baseline approaches and generalizes to novel scenes. Furthermore, in real-world experiments, we show that our method can successfully model multi-valued trajectories, is robust to distractor objects introduced at inference time, and can generate 6D motions.
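The test-time optimization step mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's architecture: it assumes two fixed linear decoders as stand-ins for the trained scene and motion neural fields, and every name and dimension here (`W_scene`, `W_motion`, `retrieve_embedding`, `EMBED_DIM`, and so on) is hypothetical. The point is only the mechanism: hold the decoders fixed, optimize a shared embedding until the decoded scene matches the observed one, then decode the motion from that same embedding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
EMBED_DIM, SCENE_DIM, MOTION_DIM = 2, 16, 30

# Assumption: fixed linear maps stand in for the trained scene and motion
# decoders (neural fields in the paper). Both read the same embedding z.
W_scene = rng.normal(size=(SCENE_DIM, EMBED_DIM))
W_motion = rng.normal(size=(MOTION_DIM, EMBED_DIM))


def decode_scene(z):
    """Generate a scene observation from the shared embedding."""
    return W_scene @ z


def decode_motion(z):
    """Generate a flattened motion trajectory from the shared embedding."""
    return W_motion @ z


def retrieve_embedding(scene_obs, steps=2000):
    """Test-time optimization: gradient descent on z with decoders frozen.

    Minimizes 0.5 * ||decode_scene(z) - scene_obs||^2.
    """
    # Step size 1/L, where L is the largest eigenvalue of W_scene^T W_scene.
    lr = 1.0 / np.linalg.norm(W_scene, 2) ** 2
    z = np.zeros(EMBED_DIM)
    for _ in range(steps):
        residual = decode_scene(z) - scene_obs
        z = z - lr * (W_scene.T @ residual)
    return z


# A "novel scene" produced by an embedding the optimizer never sees.
z_true = rng.normal(size=EMBED_DIM)
scene_obs = decode_scene(z_true)

z_hat = retrieve_embedding(scene_obs)
trajectory = decode_motion(z_hat)
```

In this linear toy the scene alone determines the embedding, so the retrieved `z_hat` matches `z_true` and the decoded trajectory matches the one the true embedding would produce. With nonlinear neural-field decoders the same loop would typically rely on automatic differentiation rather than the closed-form gradient used here.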