Modular approaches, which use a different composition of modules for each problem and avoid forgetting by design, have been shown to be a promising direction in continual learning (CL). However, searching through the large, discrete space of possible module compositions is a challenge because evaluating a composition's performance requires a round of neural network training. To address this challenge, we develop a modular CL framework, called PICLE, that accelerates search by using a probabilistic model to cheaply compute the fitness of each composition. The model combines prior knowledge about good module compositions with dataset-specific information. Its use is complemented by splitting up the search space into subsets, such as perceptual and latent subsets. We show that PICLE is the first modular CL algorithm to achieve different types of transfer while scaling to large search spaces. We evaluate it on two benchmark suites designed to capture different desiderata of CL techniques. On these benchmarks, PICLE offers significantly better performance than state-of-the-art CL baselines.
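The core idea of the abstract, scoring candidate module compositions with a cheap probabilistic estimate rather than training each one, can be illustrated with a small sketch. This is not PICLE's actual model: the prior (a bonus for each reused module), the data-specific term (a lookup of per-module scores), and all function and parameter names below are hypothetical stand-ins chosen only to show the shape of the search loop.

```python
import itertools


def enumerate_compositions(library, n_layers):
    """List every composition: per layer, reuse a library module or add a new one (None)."""
    choices = [library.get(layer, []) + [None] for layer in range(n_layers)]
    return list(itertools.product(*choices))


def log_prior(composition, reuse_bonus=0.5):
    # Hypothetical prior: compositions that reuse more pre-trained modules score higher.
    return reuse_bonus * sum(module is not None for module in composition)


def log_likelihood(composition, data_scores):
    # Hypothetical data-specific term: sum of cheap per-module compatibility scores
    # assumed to have been computed on the new problem's data without training.
    return sum(data_scores.get(m, 0.0) for m in composition if m is not None)


def rank_compositions(library, n_layers, data_scores, top_k=3):
    """Return the top_k compositions by prior + data-specific score (no training involved)."""
    scored = [
        (log_prior(c) + log_likelihood(c, data_scores), c)
        for c in enumerate_compositions(library, n_layers)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [composition for _, composition in scored[:top_k]]


# Toy library: two reusable modules for layer 0, one for layer 1.
library = {0: ["a0", "b0"], 1: ["a1"]}
data_scores = {"a0": 1.0, "b0": -2.0, "a1": 0.5}
best = rank_compositions(library, n_layers=2, data_scores=data_scores)
```

Under this sketch, only the few compositions returned by `rank_compositions` would go on to a full round of neural network training, which is where the saving over exhaustive evaluation would come from.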