Learning from demonstration has proven useful for teaching robots complex skills with high sample efficiency. However, teaching long-horizon tasks that involve multiple skills is challenging: deviations tend to accumulate, distributional shift becomes more pronounced, and human teachers become fatigued over time, all of which increase the likelihood of failure. To address these challenges, we introduce $(ST)^2$, a sequential method for learning long-horizon manipulation tasks that allows users to control the teaching flow by specifying key points, enabling structured and incremental demonstrations. Using this framework, we study how users respond to two teaching paradigms: (i) a traditional monolithic approach, in which users demonstrate the entire task trajectory at once, and (ii) a sequential approach, in which the task is segmented and demonstrated step by step. We conducted an extensive user study with $16$ participants on a restocking task in a realistic retail store environment, evaluating user preferences and the effectiveness of both methods. User-level analysis showed that the sequential approach performed better for 10 users and the monolithic approach for 5 users, with one tie. Our subjective results indicate that some teachers prefer sequential teaching, as it allows them to teach complicated tasks iteratively, while others prefer teaching in one go due to its simplicity.