Scalable and Efficient Continual Learning from Demonstration via a Hypernetwork-generated Stable Dynamics Model

Learning from demonstration (LfD) provides an efficient way to train robots. The learned motions should be convergent and stable, but to be truly effective in the real world, LfD-capable robots should also be able to remember multiple motion skills. Existing stable-LfD approaches cannot retain multiple skills. Although recent work on continual-LfD has shown that hypernetwork-generated neural ordinary differential equation solvers (NODE) can learn multiple LfD tasks sequentially, this approach lacks stability guarantees. We propose an approach for stable continual-LfD in which a hypernetwork generates two networks: a trajectory-learning dynamics model and a trajectory-stabilizing Lyapunov function. Introducing stability yields convergent trajectories and, more importantly, greatly improves continual learning performance, especially in the size-efficient chunked hypernetworks. With our approach, a single hypernetwork learns stable trajectories of the robot's end-effector position and orientation simultaneously, and does so continually for a sequence of real-world LfD tasks without retraining on past demonstrations. We also propose stochastic hypernetwork regularization with a single randomly sampled regularization term, which reduces the cumulative training time cost for $N$ tasks from $O(N^2)$ to $O(N)$ without any loss in performance on real-world tasks. We empirically evaluate our approach on the popular LASA dataset, on high-dimensional extensions of LASA (up to 32 dimensions) to assess scalability, and on a novel extended robotic task dataset (RoboTasks9) to assess real-world performance. Across trajectory error, stability, and continual learning metrics, our approach compares favorably with the baselines. Our open-source code and datasets are available at https://github.com/sayantanauddy/clfd-snode.
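
The drop from $O(N^2)$ to $O(N)$ can be illustrated by counting regularization terms. The sketch below is not taken from the authors' repository. It assumes the usual hypernetwork continual-learning setup, in which each training iteration on a new task adds one penalty term per previously learned task, and it assumes a fixed number of iterations per task. The function names and the uniform sampling rule are hypothetical choices made for illustration only.

```python
import random
from typing import Optional


def full_regularization_terms(num_tasks: int) -> int:
    """Terms evaluated per iteration, summed over all tasks, when every
    previously learned task contributes one regularization term.

    Assumption: the task with 0-based index k has k predecessors,
    so the total is 0 + 1 + ... + (N - 1) = N (N - 1) / 2.
    """
    return sum(k for k in range(num_tasks))


def stochastic_regularization_terms(num_tasks: int) -> int:
    """Same count when at most one previous task is sampled per iteration.

    The first task has nothing to regularize against, so the total is N - 1.
    """
    return sum(min(k, 1) for k in range(num_tasks))


def sample_regularization_task(current_task: int, rng: random.Random) -> Optional[int]:
    """Pick the single previous task to regularize against.

    Assumption: uniform sampling over previous task indices; the abstract
    only states that one term is randomly sampled.
    """
    if current_task == 0:
        return None  # no previous task exists yet
    return rng.randrange(current_task)


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, full_regularization_terms(n), stochastic_regularization_terms(n))
```

Under these assumptions the full scheme grows quadratically with the number of tasks, while the sampled scheme grows linearly, which matches the scaling stated in the abstract.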

