In recent years, deep neural networks have achieved remarkable accuracy in computer vision tasks. Because inference time is a crucial factor, particularly in dense prediction tasks such as semantic segmentation, knowledge distillation has emerged as a successful technique for improving the accuracy of lightweight student networks. Existing methods, however, often neglect the information in channels and among different classes. To overcome these limitations, this paper proposes a novel knowledge distillation method called Inter-Class Similarity Distillation (ICSD). The proposed method transfers high-order relations from the teacher network to the student network by independently computing intra-class distributions for each class from the network outputs. Inter-class similarity matrices for distillation are then calculated using the KL divergence between the distributions of each pair of classes. To further improve the effectiveness of the proposed method, an Adaptive Loss Weighting (ALW) training strategy is proposed. Unlike existing methods, the ALW strategy gradually reduces the influence of the teacher network towards the end of the training process to account for errors in the teacher's predictions. Extensive experiments on two well-known semantic segmentation datasets, Cityscapes and Pascal VOC 2012, validate the effectiveness of the proposed method in terms of mIoU and pixel accuracy. The proposed method outperforms most existing knowledge distillation methods, as demonstrated by both quantitative and qualitative evaluations. Code is available at: https://github.com/AmirMansurian/AICSD
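The pipeline described in the abstract (per-class intra-class distributions, pairwise KL divergences forming an inter-class similarity matrix, and a teacher weight that shrinks over training) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several details are assumptions made only for illustration: the intra-class distribution is taken to be a softmax over the spatial positions of each class channel, the student and teacher matrices are compared with a mean squared error, and the ALW schedule is a hypothetical linear decay. The function names (`intra_class_distributions`, `inter_class_similarity`, `icsd_loss`, `adaptive_weight`) are likewise invented here and are not taken from the linked repository.

```python
import numpy as np


def intra_class_distributions(logits):
    """Turn (C, H, W) logits into (C, H*W) rows that each sum to 1.

    Assumption: each class channel is normalized with a softmax over
    its spatial positions, independently of the other classes.
    """
    num_classes = logits.shape[0]
    flat = logits.reshape(num_classes, -1)
    flat = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(flat)
    return exp / exp.sum(axis=1, keepdims=True)


def inter_class_similarity(dist, eps=1e-12):
    """Return a (C, C) matrix whose entry [i, j] is KL(dist[i] || dist[j])."""
    log_d = np.log(dist + eps)
    # KL(p_i || p_j) = sum_n p_i[n] * log p_i[n] - sum_n p_i[n] * log p_j[n]
    self_term = (dist * log_d).sum(axis=1)   # shape (C,)
    cross_term = dist @ log_d.T              # shape (C, C)
    return self_term[:, None] - cross_term


def icsd_loss(student_logits, teacher_logits):
    """Assumed comparison: mean squared error between the two matrices."""
    s = inter_class_similarity(intra_class_distributions(student_logits))
    t = inter_class_similarity(intra_class_distributions(teacher_logits))
    return float(np.mean((s - t) ** 2))


def adaptive_weight(epoch, total_epochs):
    """Hypothetical linear schedule: 1.0 at epoch 0, 0.0 at the last epoch."""
    return 1.0 - epoch / max(total_epochs - 1, 1)


# Usage with random stand-in logits (4 classes on an 8x8 output map).
rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(4, 8, 8))
student_logits = rng.normal(size=(4, 8, 8))

distillation = icsd_loss(student_logits, teacher_logits)
for epoch in (0, 5, 9):
    weight = adaptive_weight(epoch, total_epochs=10)
    # In training, this weighted term would be added to the usual
    # segmentation loss (e.g. cross-entropy) of the student.
    print(epoch, weight * distillation)
```

With this linear schedule the distillation term contributes fully at the start and vanishes at the final epoch, so late training is driven by the ground-truth labels alone; any other decreasing schedule would fit the description in the abstract equally well.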