Imitation Learning (IL) is a sample-efficient paradigm for robot learning using expert demonstrations. However, policies learned through IL suffer from state distribution shift at test time, because compounding errors in action prediction lead to previously unseen states. Choosing an action representation for the policy that minimizes this distribution shift is critical in imitation learning. Prior work proposes using temporal action abstractions to reduce compounding errors, but these often sacrifice policy dexterity or require domain-specific knowledge. To address these trade-offs, we introduce HYDRA, a method that leverages a hybrid action space with two levels of action abstraction: sparse high-level waypoints and dense low-level actions. HYDRA dynamically switches between action abstractions at test time to enable both coarse and fine-grained control of a robot. In addition, HYDRA employs action relabeling to increase the consistency of actions in the dataset, further reducing distribution shift. HYDRA outperforms prior imitation learning methods by 30-40% on seven challenging simulation and real-world environments, including long-horizon real-world tasks such as making coffee and toasting bread. Videos are available on our website: https://tinyurl.com/3mc6793z
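To make the idea of switching between action abstractions at test time concrete, the following is a minimal illustrative sketch, not the HYDRA implementation. It assumes a policy that, at each decision point, returns a waypoint, a dense action, and a mode flag; the names `HybridOutput`, `step_toward`, `rollout`, and `toy_policy`, the straight-line waypoint controller, and the point-mass "robot" are all hypothetical stand-ins introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Vec = Tuple[float, ...]


@dataclass
class HybridOutput:
    """Hypothetical policy output for a hybrid action space."""
    waypoint: Vec        # sparse high-level target position
    dense_action: Vec    # dense low-level displacement for one step
    use_waypoint: bool   # predicted mode: True = waypoint, False = dense


def distance(a: Vec, b: Vec) -> float:
    """Euclidean distance between two positions."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def step_toward(pos: Vec, target: Vec, max_step: float) -> Vec:
    """Assumed waypoint controller: move straight toward the target,
    travelling at most max_step per call."""
    dist = distance(pos, target)
    if dist <= max_step:
        return tuple(target)
    scale = max_step / dist
    return tuple(p + (t - p) * scale for p, t in zip(pos, target))


def rollout(
    policy: Callable[[Vec], Optional[HybridOutput]],
    start: Vec,
    max_steps: int = 50,
    max_step: float = 0.5,
    tol: float = 1e-6,
) -> List[Tuple[str, Vec]]:
    """Run the policy, switching between waypoint and dense modes."""
    pos = tuple(start)
    target: Optional[Vec] = None  # active waypoint, if any
    trace: List[Tuple[str, Vec]] = []
    for _ in range(max_steps):
        if target is None:
            out = policy(pos)  # query the policy only between waypoints
            if out is None:    # toy convention: None means "task done"
                break
            if not out.use_waypoint:
                # Dense mode: apply the low-level action directly.
                pos = tuple(p + a for p, a in zip(pos, out.dense_action))
                trace.append(("dense", pos))
                continue
            target = out.waypoint
        # Waypoint mode: the controller servos until the target is reached.
        pos = step_toward(pos, target, max_step)
        trace.append(("waypoint", pos))
        if distance(pos, target) <= tol:
            target = None
    return trace


def toy_policy(pos: Vec) -> Optional[HybridOutput]:
    """Toy stand-in for a learned policy: coarse reach, then fine motion."""
    x = pos[0]
    if x < 1.0:
        return HybridOutput((1.0, 0.0), (0.0, 0.0), use_waypoint=True)
    if x < 1.5:
        return HybridOutput(pos, (0.25, 0.0), use_waypoint=False)
    return None


if __name__ == "__main__":
    for mode, position in rollout(toy_policy, (0.0, 0.0)):
        print(mode, position)
```

In this toy rollout the coarse reach is covered by a single waypoint prediction, so the policy is queried only once for that segment, while the final approach uses per-step dense actions. This mirrors, under the stated assumptions, the intuition that fewer policy queries during free-space motion leave fewer opportunities for errors to compound.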