The performance of convolutional neural networks (CNNs) depends heavily on their architectures, and the transfer learning performance of a CNN depends strongly on which of its layers are trainable. Selecting the most effective layers to update for a given target dataset often requires expert knowledge of CNN architecture that many practitioners do not possess. General users prefer to use an available architecture (e.g., GoogleNet, ResNet, EfficientNet) developed by domain experts. As the number of layers keeps growing, handpicking the update layers is becoming increasingly difficult and cumbersome. In this paper, we therefore explore the application of a genetic algorithm to mitigate this problem. The convolutional layers of popular pretrained networks are often grouped into modules that constitute their building blocks. We devise a genetic algorithm that selects the blocks of layers whose parameters are updated. By experimenting with EfficientNetB0 pretrained on ImageNet and using Food-101, CIFAR-100, and MangoLeafBD as target datasets, we show that our algorithm yields similar or better accuracy than the baseline and requires less training and evaluation time because fewer parameters are learned. We also devise a metric called block importance to measure the efficacy of each block as an update block, and we analyze the importance of the blocks selected by our algorithm.
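To make the idea of selecting update blocks with a genetic algorithm concrete, the following is a minimal sketch and not the implementation used in this paper. It assumes that each candidate is a binary mask with one bit per block (1 means the block is trainable), and that the operators are tournament selection, one-point crossover, bit-flip mutation, and elitism. The names `NUM_BLOCKS`, `toy_fitness`, and `select_blocks`, as well as all hyperparameter values, are hypothetical. In a real setting the fitness function would fine-tune the network with the masked blocks trainable and return a validation score; here a toy function stands in for that step so the sketch runs on its own.

```python
import random

NUM_BLOCKS = 7  # assumption: number of selectable blocks; depends on the network


def toy_fitness(mask):
    """Hypothetical stand-in for 'fine-tune with these blocks trainable and
    return validation accuracy'. Later blocks add more gain, and every
    selected block incurs a fixed cost."""
    gain = sum(bit * (i + 1) for i, bit in enumerate(mask))
    cost = 2.5 * sum(mask)
    return gain - cost


def select_blocks(fitness, num_blocks=NUM_BLOCKS, pop_size=20,
                  generations=30, mutation_rate=0.1, seed=0):
    """Search for a binary block mask that maximizes `fitness`."""
    rng = random.Random(seed)
    # Random initial population of block masks.
    population = [[rng.randint(0, 1) for _ in range(num_blocks)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            # Tournament selection: the fitter of two random masks is a parent.
            parents = []
            for _ in range(2):
                a, b = rng.sample(population, 2)
                parents.append(a if fitness(a) >= fitness(b) else b)
            # One-point crossover between the two parents.
            cut = rng.randint(1, num_blocks - 1)
            child = parents[0][:cut] + parents[1][cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


if __name__ == "__main__":
    mask = select_blocks(toy_fitness)
    print("selected block mask:", mask, "fitness:", toy_fitness(mask))
```

With a framework such as Keras or PyTorch, the returned mask would be applied by marking the layers of each selected block as trainable and freezing the rest before fine-tuning. Because every fitness evaluation then involves a training run, population size and generation count would need to be kept small in practice.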