Existing knowledge distillation methods generally use a teacher-student approach, in which the student network learns solely from a well-trained teacher. However, this approach overlooks the inherent difference in learning ability between the teacher and student networks, which gives rise to the capacity-gap problem. To address this limitation, we propose a novel method called SLKD.