For robots to be useful outside labs and specialized factories, we need a way to teach them new useful behaviors quickly. Current approaches either lack the generality to onboard new tasks without task-specific engineering, or lack the data efficiency to do so quickly enough for practical use. In this work, we explore dense tracking as a representational vehicle for faster and more general learning from demonstration. Our approach uses Track-Any-Point (TAP) models to isolate the relevant motion in a demonstration and to parameterize a low-level controller that reproduces this motion across changes in the scene configuration. We show that this results in robust robot policies that can solve complex object-arrangement tasks such as shape-matching and stacking, and even full path-following tasks such as applying glue and sticking objects together, all from demonstrations that can be collected in minutes.