Data-free knowledge distillation (DFKD) aims to obtain a lightweight student model without access to the original training data. Existing works generally synthesize data from the pre-trained teacher model to replace the original training data for student learning. To train the student more effectively, the synthetic data should be customized to the student's current learning ability. However, existing DFKD methods ignore this, which negatively affects student training. To address this issue, we propose Customizing Synthetic Data for Data-Free Student Learning (CSD), which achieves adaptive data synthesis by using a self-supervised augmented auxiliary task to estimate the student's learning ability. Specifically, data synthesis is dynamically adjusted to enlarge the cross entropy between the labels and the predictions from the self-supervised augmented task, thus generating hard samples for the student model. Experiments on various datasets and teacher-student models show the effectiveness of the proposed method. Code is available at: https://github.com/luoshiya/CSD
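The adaptive synthesis objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the auxiliary task, the base synthesis loss, or how the terms are combined, so the names `hardness_term`, `generator_loss`, `base_loss`, and `weight` are hypothetical, and subtracting a weighted cross-entropy term is only one plausible way to "enlarge" it during data synthesis.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross entropy between a one-hot label and the softmax prediction."""
    return -math.log(softmax(logits)[label])


def hardness_term(aux_logits_batch, aux_labels):
    """Mean cross entropy of the student's auxiliary-task predictions.

    A larger value means the student finds the synthetic batch harder.
    (Hypothetical helper: the auxiliary task itself is assumed given.)
    """
    losses = [cross_entropy(l, y) for l, y in zip(aux_logits_batch, aux_labels)]
    return sum(losses) / len(losses)


def generator_loss(base_loss, aux_logits_batch, aux_labels, weight=1.0):
    """Assumed synthesis objective: minimizing it enlarges the hardness term.

    base_loss stands in for whatever other synthesis losses are used;
    weight is an assumed trade-off coefficient.
    """
    return base_loss - weight * hardness_term(aux_logits_batch, aux_labels)
```

Under these assumptions, a synthetic batch on which the student's auxiliary predictions disagree with the auxiliary labels yields a lower `generator_loss` than a batch the student already handles well, so a synthesis step that minimizes this loss moves toward harder samples for the current student.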