Recent CNN-based single image super-resolution (SISR) networks rely on numerous parameters and high computational cost to achieve better performance, which limits their applicability to resource-constrained devices such as mobile phones. Knowledge distillation (KD), which transfers useful knowledge from a teacher network to a student network, is being studied as one way to make networks more efficient. Recent KD methods for SISR use feature distillation (FD), which minimizes the Euclidean distance between the feature maps of the teacher and student networks, but they do not sufficiently consider how to deliver the teacher's knowledge effectively and meaningfully so as to improve student performance under a given network capacity constraint. In this paper, we propose a feature-domain adaptive contrastive distillation (FACD) method for efficiently training lightweight student SISR networks. We show the limitations of existing FD methods that use a Euclidean distance loss, and propose a feature-domain contrastive loss that lets the student network learn richer information from the teacher's representation in the feature domain. In addition, we propose an adaptive distillation scheme that selectively applies distillation depending on the conditions of the training patches. Experimental results show that student EDSR and RCAN networks trained with the proposed FACD scheme improve not only PSNR across all benchmark datasets and scales, but also subjective image quality, compared with conventional FD approaches.
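To make the contrast between the two distillation losses concrete, the following is a minimal sketch and not the paper's implementation. It compares a conventional Euclidean FD loss with an InfoNCE-style contrastive loss computed on flattened feature vectors, and adds a simple gate for adaptive distillation. The abstract does not specify the loss formula, the choice of negatives, or the patch condition, so the cosine similarity, the temperature `tau`, the use of teacher features from other patches as negatives, and the gating rule (distill only when the teacher's reconstruction error is lower than the student's) are all assumptions made for illustration. All function names are hypothetical.

```python
import math


def l2_fd_loss(student_feat, teacher_feat):
    """Conventional FD: mean squared Euclidean distance between two feature vectors."""
    diffs = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(diffs) / len(diffs)


def cosine(a, b):
    """Cosine similarity between two flattened feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def contrastive_fd_loss(student_feat, teacher_pos, teacher_negs, tau=0.1):
    """InfoNCE-style loss (assumed form, not taken from the paper).

    The positive is the teacher feature of the same patch; the negatives are
    teacher features of other patches.
    """
    pos = math.exp(cosine(student_feat, teacher_pos) / tau)
    neg = sum(math.exp(cosine(student_feat, n) / tau) for n in teacher_negs)
    return -math.log(pos / (pos + neg))


def adaptive_distill_loss(recon_loss, distill_loss, student_err, teacher_err, weight=1.0):
    """Hypothetical gate: add the distillation term only when the teacher
    reconstructs this patch better than the student does."""
    if teacher_err < student_err:
        return recon_loss + weight * distill_loss
    return recon_loss


# Usage: one student feature, its matching teacher feature, and one negative.
student = [1.0, 0.0]
loss_kd = contrastive_fd_loss(student, [1.0, 0.0], [[0.0, 1.0]])
total = adaptive_distill_loss(recon_loss=1.0, distill_loss=loss_kd,
                              student_err=0.2, teacher_err=0.1)
```

Unlike the Euclidean loss, which only measures how far the student feature is from its own teacher feature, the contrastive form also depends on how the student feature relates to the teacher features of other patches. This is one plausible reading of "richer information" in the abstract.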