CasIL: Cognizing and Imitating Skills via a Dual Cognition-Action Architecture

Enabling robots to effectively imitate expert skills in long-horizon tasks such as locomotion and manipulation is a long-standing challenge. Existing imitation learning (IL) approaches for robots still suffer from sub-optimal performance on complex tasks. In this paper, we consider how this challenge can be addressed using human cognitive priors. Heuristically, we extend the usual notion of action to a dual Cognition (high-level)-Action (low-level) architecture by introducing intuitive human cognitive priors, and propose a novel skill IL framework based on human-robot interaction, called Cognition-Action-based Skill Imitation Learning (CasIL), which lets a robotic agent cognize and imitate critical skills from raw visual demonstrations. CasIL imitates both cognition and action: high-level skill cognition explicitly guides low-level primitive actions, which makes the overall skill IL process more robust and reliable. We evaluate our method on the MuJoCo and RLBench benchmarks, as well as on obstacle avoidance and point-goal navigation tasks for quadrupedal robot locomotion. Experimental results show that CasIL consistently achieves competitive and robust skill imitation compared to other methods across a variety of long-horizon robotic tasks.
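As a rough illustration of the control flow the abstract describes, where a high-level skill label conditions the low-level primitive action, the following minimal sketch may help. It is not the CasIL implementation: the observation layout, the skill names (`approach`, `avoid`), the 1.0 distance threshold, and the hand-written rules are all hypothetical stand-ins for what would be learned components.

```python
from typing import Callable, Dict, Tuple

# Hypothetical observation: (distance_to_goal, distance_to_obstacle).
Observation = Tuple[float, float]


def cognize(obs: Observation) -> str:
    """High level: map an observation to a discrete skill label.

    A hand-written rule stands in for a learned skill-cognition module.
    """
    _, obstacle_dist = obs
    return "avoid" if obstacle_dist < 1.0 else "approach"


# Low level: one primitive-action rule per skill (hypothetical stand-ins
# for a learned, skill-conditioned action policy).
LOW_LEVEL: Dict[str, Callable[[Observation], float]] = {
    "approach": lambda obs: min(1.0, obs[0]),  # step toward the goal, capped at 1.0
    "avoid": lambda obs: -0.5,                 # back away from the obstacle
}


def act(obs: Observation) -> Tuple[str, float]:
    """Select a skill first, then let that skill determine the action."""
    skill = cognize(obs)
    return skill, LOW_LEVEL[skill](obs)


if __name__ == "__main__":
    for obs in [(3.0, 5.0), (3.0, 0.4), (0.3, 2.0)]:
        print(obs, act(obs))
```

The point of the split is that the low-level rule never sees the raw decision problem on its own: it always acts under a skill chosen one level up, which is the guidance relationship the abstract attributes to the dual architecture.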

