Humans demonstrate an impressive ability to acquire and generalize manipulation "tricks." Even from a single demonstration, such as using soup ladles to reach for distant objects, we can apply this skill to new scenarios involving different object positions, sizes, and categories (e.g., forks and hammers). Additionally, we can flexibly combine various skills to devise long-term plans. In this paper, we present a framework that enables machines to acquire such manipulation skills, referred to as "mechanisms," through a single demonstration and self-play. Our key insight lies in interpreting each demonstration as a sequence of changes in robot-object and object-object contact modes, which provides a scaffold for learning detailed samplers for continuous parameters. These learned mechanisms and samplers can be seamlessly integrated into standard task and motion planners, enabling their compositional use.
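To make the contact-mode view concrete, the following is a minimal sketch of how a single demonstration might be segmented into a sequence of contact-mode changes. It is not the paper's implementation. The encoding of a contact mode as a set of body pairs, the function names `segment_by_contact_mode` and `mode_changes`, and the toy `demo_frames` trace are all assumptions introduced here for illustration.

```python
from itertools import groupby
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical encoding: a contact mode is the set of body pairs in contact.
Contact = Tuple[str, str]
ContactMode = FrozenSet[Contact]


def segment_by_contact_mode(
    frames: Iterable[Iterable[Contact]],
) -> List[Tuple[ContactMode, int]]:
    """Collapse per-frame contact sets into (mode, duration_in_frames) segments."""
    # Sort each pair so ("ladle", "gripper") and ("gripper", "ladle") compare equal.
    modes = [frozenset(tuple(sorted(pair)) for pair in frame) for frame in frames]
    # groupby merges consecutive frames that share the same contact mode.
    return [(mode, sum(1 for _ in run)) for mode, run in groupby(modes)]


def mode_changes(segments: List[Tuple[ContactMode, int]]) -> List[dict]:
    """List the contacts made and broken at each transition between segments."""
    changes = []
    for (prev, _), (curr, _) in zip(segments, segments[1:]):
        changes.append({"made": curr - prev, "broken": prev - curr})
    return changes


# Toy trace (assumed, not real data): grasp a ladle, touch a cube with it, release the cube.
demo_frames = [
    [],
    [("gripper", "ladle")],
    [("gripper", "ladle")],
    [("gripper", "ladle"), ("ladle", "cube")],
    [("gripper", "ladle"), ("ladle", "cube")],
    [("gripper", "ladle")],
]

segments = segment_by_contact_mode(demo_frames)
transitions = mode_changes(segments)
```

In this sketch, each segment is a discrete step of the scaffold, and the continuous parameters within a segment (for example grasp poses or push directions) are what a learned sampler would fill in.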