Imitation learning is a promising paradigm for training robot agents; however, standard approaches typically require substantial data acquisition, through numerous demonstrations or random exploration, to ensure reliable performance. Although exploration reduces human effort, it lacks safety guarantees and often causes collisions, particularly in clearance-limited tasks (e.g., peg-in-hole), thereby necessitating manual environment resets and imposing additional human burden. This study proposes Self-Augmented Robot Trajectory (SART), a framework that enables policy learning from a single human demonstration while safely expanding the dataset through autonomous augmentation. SART consists of two stages: (1) one-time human teaching, in which a single demonstration is provided and precision boundaries, represented as spheres around key waypoints, are annotated, followed by one environment reset; and (2) robot self-augmentation, in which the robot generates diverse, collision-free trajectories within these boundaries and reconnects to the original demonstration. This design improves data collection efficiency by minimizing human effort while ensuring safety. Extensive evaluations in simulation and real-world manipulation tasks show that SART achieves substantially higher success rates than policies trained solely on human-collected demonstrations. Video results are available at https://sites.google.com/view/sart-il .
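The self-augmentation stage can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sample_in_sphere`, `augment_trajectory`), the use of positions only (no orientation or gripper state), the uniform sampling, and the straight-line reconnection are all assumptions made for illustration. The sketch further assumes that the annotated sphere lies entirely in free space, so any straight segment between two points inside it is collision-free because a ball is convex.

```python
import numpy as np


def sample_in_sphere(center, radius, rng):
    """Draw one point uniformly from the ball of `radius` around `center`."""
    direction = rng.normal(size=center.shape)
    direction /= np.linalg.norm(direction)
    # Scaling by u**(1/d) makes the sample uniform in volume, not in radius.
    scale = radius * rng.uniform() ** (1.0 / center.size)
    return center + scale * direction


def augment_trajectory(demo, key_idx, radius, rng, n_steps=10):
    """Build one augmented trajectory from a single demonstration.

    demo     : (T, 3) array of demonstrated positions (hypothetical format)
    key_idx  : index of the annotated key waypoint
    radius   : radius of the precision sphere around that waypoint
    n_steps  : number of interpolated points before reconnecting
    """
    start = sample_in_sphere(demo[key_idx], radius, rng)
    # Straight segment from the sampled point toward the key waypoint.
    # Both endpoints lie in the ball, so the whole segment does too.
    segment = np.linspace(start, demo[key_idx], n_steps, endpoint=False)
    # Reconnect: follow the original demonstration from the key waypoint on.
    return np.vstack([segment, demo[key_idx:]])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = np.linspace([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], 5)
    traj = augment_trajectory(demo, key_idx=2, radius=0.05, rng=rng, n_steps=4)
    print(traj.shape)
```

Calling `augment_trajectory` repeatedly with the same demonstration yields different approach segments that all end on the demonstrated path, which is the sense in which a single demonstration is expanded into a larger dataset. A real system would also need a collision check for the motion that brings the robot to each sampled start point, which this sketch omits.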