With the widespread success of deep learning technologies, many trained deep neural network (DNN) models are now publicly available. However, directly reusing these public DNN models for new tasks often fails because their functionality or performance does not match the new task. Inspired by the notion of modularization and composition in software reuse, we investigate the possibility of improving the reusability of DNN models in a more fine-grained manner. Specifically, we propose two modularization approaches named CNNSplitter and GradSplitter, which can decompose a trained convolutional neural network (CNN) model for $N$-class classification into $N$ small reusable modules. Each module recognizes one of the $N$ classes and contains a part of the convolution kernels of the trained CNN model. The resulting modules can then be reused to patch existing CNN models or build new CNN models through composition. The main difference between CNNSplitter and GradSplitter lies in their search methods: the former relies on a genetic algorithm to explore the search space, while the latter uses a gradient-based search method. Our experiments with three representative CNNs on three widely used public datasets demonstrate the effectiveness of the proposed approaches. Compared with CNNSplitter, GradSplitter incurs less accuracy loss, produces much smaller modules (19.88% fewer kernels), and achieves better results on patching weak models. In particular, experiments on GradSplitter show that (1) by patching weak models, the average improvement in terms of precision, recall, and F1-score is 17.13%, 4.95%, and 11.47%, respectively, and (2) for a new task, compared with the models trained from scratch, reusing modules achieves similar accuracy (the average loss of accuracy is only 2.46%) without a costly training process. Our approaches provide a viable solution to the rapid development and improvement of CNN models.
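As an illustration of the idea that each module "contains a part of the convolution kernels" and that modules can be combined through composition, the following is a minimal sketch only. It assumes a module can be represented as a binary mask over the kernels of the trained model and that composition is the union of those masks; neither assumption is confirmed by the abstract, and the names `module_size`, `compose`, and `retention`, as well as the mask values, are hypothetical. It does not implement the genetic or gradient-based search used by CNNSplitter and GradSplitter.

```python
from typing import Dict, List

# Assumption: a module is a binary mask over the trained model's kernels.
Mask = List[int]  # 1 = kernel kept in the module, 0 = kernel dropped


def module_size(mask: Mask) -> int:
    """Number of kernels a module retains."""
    return sum(mask)


def compose(masks: List[Mask]) -> Mask:
    """Union of module masks: keep a kernel if any selected module keeps it."""
    if not masks:
        raise ValueError("need at least one module")
    length = len(masks[0])
    if any(len(m) != length for m in masks):
        raise ValueError("all masks must cover the same kernels")
    return [int(any(bits)) for bits in zip(*masks)]


def retention(mask: Mask) -> float:
    """Fraction of the original kernels that the mask retains."""
    return module_size(mask) / len(mask)


# Hypothetical 3-class model with 8 kernels; the mask values are made up.
modules: Dict[int, Mask] = {
    0: [1, 1, 0, 0, 1, 0, 0, 0],
    1: [0, 1, 1, 0, 0, 0, 1, 0],
    2: [0, 0, 0, 1, 1, 0, 0, 1],
}

# Reuse the modules for classes 0 and 2 to build a 2-class model.
sub_model = compose([modules[0], modules[2]])
print(sub_model, retention(sub_model))
```

In this toy setting the composed model keeps only the kernels that at least one selected module needs, which is one way to picture why reusing modules can avoid retraining a full model for a new task.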