Scalable and Efficient Continual Learning from Demonstration via Hypernetwork-generated Stable Dynamics Model

Learning from demonstration (LfD) provides an efficient way to train robots. The learned motions should be convergent and stable, but to be truly effective in the real world, LfD-capable robots should also be able to remember multiple motion skills. Multi-skill retention is a capability missing from existing stable-LfD approaches. On the other hand, recent work on continual-LfD has shown that hypernetwork-generated neural ordinary differential equation solvers can learn multiple LfD tasks sequentially, but this approach lacks stability guarantees. We propose an approach for stable continual-LfD in which a hypernetwork generates two networks: a trajectory-learning dynamics model and a trajectory-stabilizing Lyapunov function. The introduction of stability not only generates stable trajectories but also greatly improves continual learning performance, especially in size-efficient chunked hypernetworks. With our approach, we can continually train a single model to predict the position and orientation trajectories of the robot's end-effector simultaneously for multiple real-world tasks without retraining on past demonstrations. We also propose stochastic regularization with a single randomly sampled regularization term in hypernetworks, which reduces the cumulative training time cost for $N$ tasks from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$ without any loss in performance in real-world tasks. We empirically evaluate our approach on the popular LASA dataset, on high-dimensional extensions of LASA (up to 32 dimensions) to assess scalability, and on a novel extended robotic task dataset (RoboTasks9) to assess real-world performance. On trajectory error, stability, and continual learning metrics, our approach performs favorably compared to the baselines. Code and datasets will be shared after submission.
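
The cost argument behind the stochastic regularization can be illustrated with a small counting sketch. It assumes, as a simplification not spelled out in the abstract, that full regularization evaluates one regularization term per previously learned task at every training iteration, while the stochastic variant evaluates a single term for one randomly sampled past task. The function name `count_regularization_terms` and its parameters are hypothetical and chosen only for illustration; the sketch counts term evaluations and does not train any network.

```python
import random


def count_regularization_terms(num_tasks, iters_per_task, stochastic, seed=0):
    """Count regularization-term evaluations over sequential training.

    Assumption (for illustration only): every past task contributes one
    regularization term per iteration in the full scheme, and exactly one
    randomly sampled past task contributes in the stochastic scheme.
    """
    rng = random.Random(seed)
    total = 0
    for task in range(num_tasks):
        past_tasks = list(range(task))  # tasks learned before the current one
        if not past_tasks:
            continue  # the first task has nothing to protect
        for _ in range(iters_per_task):
            if stochastic:
                chosen = [rng.choice(past_tasks)]  # single sampled term
            else:
                chosen = past_tasks  # one term per past task
            total += len(chosen)
    return total


if __name__ == "__main__":
    for n in (5, 10, 20):
        full = count_regularization_terms(n, iters_per_task=100, stochastic=False)
        single = count_regularization_terms(n, iters_per_task=100, stochastic=True)
        print(f"N={n}: full={full}, stochastic={single}")
```

Under this assumption the full scheme evaluates `iters_per_task * N * (N - 1) / 2` terms in total, which grows quadratically in $N$, while the stochastic scheme evaluates `iters_per_task * (N - 1)` terms, which grows linearly.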

