Increasing Depth of Neural Networks for Life-long Learning

Purpose: We propose a novel continual learning method based on increasing the depth of neural networks. This work explores whether extending network depth is beneficial in a life-long learning setting.

Methods: New layers are added on top of existing ones, which enables forward transfer of knowledge and adaptation of previously learned representations. To select the best location in the network for new nodes with trainable parameters, we determine which previous tasks are most similar to the new one. The result is a tree-like model in which each node is a set of neural network parameters dedicated to a specific task. The proposed method is inspired by the Progressive Neural Network (PNN) concept and therefore benefits from dynamic changes in network structure. However, PNN allocates a large amount of memory for the whole network structure during learning. The proposed method alleviates this by adding only part of a network for each new task and reusing a subset of previously trained weights. At the same time, it retains the benefits of PNN, such as no forgetting guaranteed by design, without needing a memory buffer.

Results: Experiments on Split CIFAR and Split Tiny ImageNet show that the proposed algorithm is on par with other continual learning methods. In a more challenging setup, in which each task is a separate computer vision dataset, our method outperforms Experience Replay.

Conclusion: The proposed method is compatible with commonly used computer vision architectures and does not require a custom network structure. Because adaptation to a changing data distribution is achieved by expanding the architecture, no rehearsal buffer is needed. For this reason, the method could be used in sensitive applications where data privacy must be considered.
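The tree-like structure described under Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the names `TaskNode` and `TaskTree`, the toy dense layers, and the user-supplied `similarity` score are all assumptions made for illustration, since the abstract does not state how task similarity is computed or how nodes are trained. The sketch only shows the structural idea: a new task's node is attached on top of the most similar earlier task, and earlier nodes are left untouched, so outputs for earlier tasks cannot change.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Vector = List[float]


@dataclass
class TaskNode:
    """One tree node: the parameters added for a single task (hypothetical layout)."""
    task_id: str
    weights: List[Vector]                 # toy dense layer, shape (out, in)
    parent: Optional["TaskNode"] = None   # node this one is stacked on top of

    def apply(self, x: Vector) -> Vector:
        # Dense layer followed by ReLU.
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.weights]


class TaskTree:
    """Tree of task-specific nodes; only the newest node would be trained."""

    def __init__(self) -> None:
        self.nodes: Dict[str, TaskNode] = {}

    def add_task(self, task_id: str, weights: List[Vector],
                 similarity: Callable[[str], float]) -> TaskNode:
        # Assumption: `similarity` scores each earlier task against the new one.
        # The new node is attached on top of the highest-scoring earlier task.
        parent = None
        if self.nodes:
            parent = self.nodes[max(self.nodes, key=similarity)]
        node = TaskNode(task_id, weights, parent)
        self.nodes[task_id] = node
        return node

    def path(self, task_id: str) -> List[TaskNode]:
        # Collect nodes from the root down to the node of the requested task.
        chain: List[TaskNode] = []
        node: Optional[TaskNode] = self.nodes[task_id]
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain[::-1]

    def forward(self, task_id: str, x: Vector) -> Vector:
        # Inference for a task uses only the nodes on its root-to-node path.
        for node in self.path(task_id):
            x = node.apply(x)
        return x


tree = TaskTree()
tree.add_task("A", [[1.0, 0.0], [0.0, 1.0]], similarity=lambda t: 0.0)
out_a_before = tree.forward("A", [1.0, -2.0])

tree.add_task("B", [[2.0, 0.0], [0.0, 2.0]], similarity={"A": 0.9}.get)
tree.add_task("C", [[1.0, 1.0]], similarity={"A": 0.2, "B": 0.7}.get)

print([n.task_id for n in tree.path("C")])           # nodes used for task C
print(tree.forward("A", [1.0, -2.0]) == out_a_before)  # task A is unaffected
```

In a real network each node would hold one or more layers of a standard architecture, and nodes of earlier tasks would be kept frozen while the new node is trained. Because a task's prediction depends only on its own path, no rehearsal buffer is needed to protect earlier tasks.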

