Data-free knowledge distillation is a challenging model compression task for scenarios in which the original dataset is not available. Previous methods incur substantial extra computational cost to update one or more generators, and their naive imitation learning leads to low distillation efficiency. Based on these observations, we first propose an efficient unlabeled sample selection method to replace computationally expensive generators and focus on improving the training efficiency of the selected samples. Then, a class-dropping mechanism is designed to suppress the label noise caused by data domain shift. Finally, we propose a distillation method that incorporates explicit features and implicit structured relations to improve distillation performance. Experimental results show that our method converges quickly and achieves higher accuracy than other state-of-the-art methods.
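The three components named above (sample selection, class dropping, and a distillation loss with an explicit and a relational term) can be illustrated with a minimal NumPy sketch. The abstract does not specify the selection criterion, the class-dropping rule, or the exact loss terms, so everything here is an assumption made for illustration only: samples are selected by teacher confidence, class dropping keeps the `keep` most probable teacher classes, the explicit term is a KL divergence over the kept classes, and the relational term compares batch-level cosine-similarity matrices. All function names and parameters are hypothetical.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax with temperature, stabilized by subtracting the row max."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def select_confident(teacher_logits, k):
    """Indices of the k samples with the highest teacher confidence (assumed criterion)."""
    confidence = softmax(teacher_logits).max(axis=1)
    return np.argsort(-confidence)[:k]


def class_drop_mask(teacher_probs, keep):
    """Boolean mask that keeps the `keep` most probable classes per sample (assumed rule)."""
    top = np.argsort(-teacher_probs, axis=1)[:, :keep]
    mask = np.zeros_like(teacher_probs, dtype=bool)
    np.put_along_axis(mask, top, True, axis=1)
    return mask


def masked_kd_loss(teacher_logits, student_logits, keep, temperature=4.0):
    """KL(teacher || student), renormalized over the kept classes only."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    mask = class_drop_mask(p, keep)
    p = np.where(mask, p, 0.0)
    q = np.where(mask, q, 0.0)
    p = p / p.sum(axis=1, keepdims=True)
    q = q / q.sum(axis=1, keepdims=True)
    eps = 1e-12  # avoids log(0) on dropped classes
    kl = np.where(mask, p * (np.log(p + eps) - np.log(q + eps)), 0.0)
    return kl.sum(axis=1).mean()


def relation_loss(teacher_feats, student_feats):
    """Mean squared difference between batch-level cosine-similarity matrices."""
    def cosine_matrix(x):
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)
        return x @ x.T

    diff = cosine_matrix(teacher_feats) - cosine_matrix(student_feats)
    return np.mean(diff ** 2)


# Usage on random stand-in data (not real model outputs).
rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(8, 10))
student_logits = rng.normal(size=(8, 10))
teacher_feats = rng.normal(size=(8, 64))
student_feats = rng.normal(size=(8, 32))  # feature widths may differ

idx = select_confident(teacher_logits, k=4)
total = (
    masked_kd_loss(teacher_logits[idx], student_logits[idx], keep=3)
    + relation_loss(teacher_feats[idx], student_feats[idx])
)
print(float(total))
```

Because the relational term compares sample-to-sample similarity matrices rather than raw features, the teacher and student feature widths need not match in this sketch; the equal weighting of the two terms is likewise an arbitrary choice.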