Knowledge Distillation (KD) is a powerful technique for transferring knowledge between neural network models, in which a pre-trained teacher model guides the training of a target student model. However, a suitable teacher model is not always available. To address this challenge, Self-Knowledge Distillation (SKD) constructs the teacher from the student model itself. Existing SKD methods either add Auxiliary Classifiers (AC) to intermediate layers of the model, or use historical models and models fed with different input data from the same class. However, these methods are computationally expensive and capture only time-wise and class-wise features of the data. In this paper, we propose a lightweight SKD framework that utilizes multi-source information to construct a more informative teacher. Specifically, we introduce a Distillation with Reverse Guidance (DRG) method that draws on the different levels of information extracted by the model, including the edge, shape, and detail of the input data. Additionally, we design a Distillation with Shape-wise Regularization (DSR) method that enforces a consistent shape of the ranked model output across all data. We validate DRG, DSR, and their combination through comprehensive experiments on various datasets and models. Our results demonstrate the superiority of the proposed methods over baselines (by up to 2.87%) and over state-of-the-art SKD methods (by up to 1.15%), while remaining computationally efficient and robust. The code is available at https://github.com/xucong-parsifal/LightSKD.
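For readers unfamiliar with the teacher-student objective mentioned above, the following is a minimal sketch of the generic temperature-softened KD loss. It is not the DRG or DSR method proposed here. The function names `softmax` and `kd_loss`, the default temperature of 4.0, and the T² scaling are illustrative assumptions rather than details taken from this paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic KD objective (assumed form); the temperature value is a
    hypothetical default, not one reported in the paper.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Usage: the loss vanishes when the student matches the teacher
# and is positive otherwise.
same = kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
diff = kd_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

In self-distillation, the teacher logits in such a loss would come from the model itself rather than from a separately pre-trained network.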