Knowledge distillation (KD) is a key mechanism for transferring expertise from complex teacher networks to efficient student models. However, in decentralized or secure AI ecosystems, privacy regulations and proprietary interests often restrict access to the teacher's interface and original datasets. These constraints define a challenging black-box, data-free KD scenario in which the teacher exposes only top-1 predictions and no training data are available. While recent approaches utilize synthetic data, they still suffer from limited data diversity and limited distillation signals. We propose Diverse Image Priors Knowledge Distillation (DIP-KD), a framework that addresses these challenges through a three-phase collaborative pipeline: (1) Synthesis, which generates image priors that capture diverse visual patterns and semantics; (2) Contrast, which uses contrastive learning to make the synthetic samples more distinct from one another; and (3) Distillation, which introduces a novel primer student that enables soft-probability KD. Our evaluation across 12 benchmarks shows that DIP-KD achieves state-of-the-art performance, with ablations confirming that data diversity is critical for knowledge acquisition in restricted AI environments.
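To make the black-box setting concrete, the following minimal sketch contrasts the two kinds of supervision the abstract refers to: a hard-label loss, which is all a top-1-only teacher can supply directly, and a soft-probability KD loss, which needs a full probability vector from some intermediate model. It is an illustration under stated assumptions, not the DIP-KD implementation: the names `black_box_teacher`, `hard_label_loss`, and `soft_label_loss` are hypothetical, the teacher is simulated by an argmax over made-up logits, and the use of a temperature-scaled KL divergence is a common KD choice that the abstract does not confirm.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def black_box_teacher(hidden_logits):
    """Hypothetical stand-in for a black-box teacher.

    Only the index of the top-1 class is returned; the logits and
    probabilities stay hidden from the caller.
    """
    return max(range(len(hidden_logits)), key=lambda i: hidden_logits[i])


def hard_label_loss(student_logits, top1_label):
    """Cross-entropy of the student against a single top-1 label."""
    probs = softmax(student_logits)
    return -math.log(probs[top1_label])


def soft_label_loss(student_logits, soft_targets, temperature=1.0):
    """KL(soft_targets || student) on temperature-scaled student outputs.

    Assumption: soft_targets is a probability vector produced by some
    intermediate model; how such a model is obtained is not shown here.
    """
    probs = softmax(student_logits, temperature)
    return sum(
        t * math.log(t / p) for t, p in zip(soft_targets, probs) if t > 0
    )


if __name__ == "__main__":
    student_logits = [0.3, 1.2, -0.5]      # made-up student outputs
    label = black_box_teacher([0.1, 2.0, -1.0])
    soft_targets = softmax([0.2, 1.5, -0.4], temperature=2.0)
    print("hard-label loss:", hard_label_loss(student_logits, label))
    print("soft-label loss:", soft_label_loss(student_logits, soft_targets, 2.0))
```

The hard-label loss sees only which class won, while the soft-label loss also uses the relative probabilities of the losing classes, which is the extra signal that soft-probability KD is meant to recover.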