Continual learning, the ability to acquire knowledge from new data while retaining previously learned information, is a fundamental challenge in machine learning. Various approaches have been proposed to address it, including memory replay, knowledge distillation, model regularization, and dynamic network expansion. Thus far, dynamic network expansion methods have achieved state-of-the-art performance, but at the cost of significant computational overhead. This overhead stems from the need for additional model buffers, which makes these methods less feasible in resource-constrained settings, particularly in the medical domain. To overcome this challenge, we propose Dynamic Model Merging (DynaMMo), a method that merges multiple networks at different stages of model training to achieve better computational efficiency. Specifically, we employ lightweight learnable modules for each task and combine them into a unified model to minimize computational overhead. DynaMMo achieves this with only a small loss in performance, offering a cost-effective solution for continual learning in medical applications. We evaluate DynaMMo on three publicly available datasets, demonstrating its effectiveness compared to existing approaches. DynaMMo offers a roughly 10-fold reduction in GFLOPs with a small drop of 2.76 in average accuracy compared to state-of-the-art dynamic-based approaches. The code will be made available upon acceptance at https://github.com/BioMedIA-MBZUAI/DynaMMo.
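To illustrate why merging per-task modules into a unified model can reduce inference cost, the following is a minimal sketch under a strong assumption: each task-specific module is a linear branch running in parallel with a shared linear layer, so the branches can be folded into a single weight matrix. The abstract does not specify the module design or the merging rule, so the names `shared`, `adapters`, and `merge` are hypothetical and this should not be read as the DynaMMo implementation.

```python
# Illustrative sketch only. Assumption: every task-specific module is a
# linear branch parallel to a shared linear layer. This is NOT taken from
# the DynaMMo code; all names here are hypothetical.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def forward_unmerged(shared, adapters, x):
    """Cost grows with the number of tasks: one matvec per branch."""
    y = matvec(shared, x)
    for A in adapters:
        y = [yi + ai for yi, ai in zip(y, matvec(A, x))]
    return y


def merge(shared, adapters):
    """Fold all parallel linear branches into a single weight matrix."""
    merged = [row[:] for row in shared]  # copy so `shared` is not mutated
    for A in adapters:
        merged = [[m + a for m, a in zip(rm, ra)]
                  for rm, ra in zip(merged, A)]
    return merged


shared = [[1.0, 0.0], [0.0, 1.0]]
adapters = [
    [[0.5, 0.0], [0.0, 0.0]],    # hypothetical module for task 1
    [[0.0, 0.25], [0.0, 0.5]],   # hypothetical module for task 2
]
x = [2.0, 4.0]

merged = merge(shared, adapters)
# After merging, inference needs a single matvec regardless of task count.
print(forward_unmerged(shared, adapters, x))  # [4.0, 6.0]
print(matvec(merged, x))                      # [4.0, 6.0]
```

Under this linearity assumption the merged model reproduces the unmerged output exactly while its per-input cost no longer grows with the number of tasks. Modules with nonlinearities between branches would not merge exactly in this way.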