Dataset distillation (DD) entails creating a refined, compact distilled dataset from a large-scale dataset to facilitate efficient training. A significant challenge in DD is the dependency between the distilled dataset and the neural network (NN) architecture used. Training an NN architecture other than the one used for distillation on the distilled dataset often results in diminished performance. This paper introduces MetaDD, designed to enhance the generalizability of DD across various NN architectures. Specifically, MetaDD partitions distilled data into meta features (i.e., the data's common characteristics that remain consistent across different NN architectures) and heterogeneous features (i.e., features unique to each NN architecture). Then, MetaDD employs an architecture-invariant loss function for multi-architecture feature alignment, which increases meta features and reduces heterogeneous features in distilled data. As a low-memory-consumption component, MetaDD can be seamlessly integrated into any DD methodology. Experimental results demonstrate that MetaDD significantly improves performance across various DD methods. On Tiny-Imagenet distilled with Sre2L (50 IPC), MetaDD achieves cross-architecture NN accuracy of up to 30.1\%, surpassing the second-best method (GLaD) by 1.7\%.
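The meta/heterogeneous decomposition above can be illustrated with a toy sketch. This is not the paper's implementation: treating the cross-architecture mean feature as the meta feature and penalizing the squared per-architecture residual is an assumption made here for illustration; the feature tensors are assumed to be pre-projected to a shared dimension.

```python
import numpy as np

def alignment_loss(features):
    """Toy architecture-invariant alignment loss (illustrative assumption).

    features: list of per-architecture feature arrays, each of shape
    (batch, dim), already projected to a common feature dimension.
    """
    stacked = np.stack(features)                 # (num_archs, batch, dim)
    meta = stacked.mean(axis=0, keepdims=True)   # shared (meta) component
    hetero = stacked - meta                      # architecture-specific residual
    # Minimizing this drives features from different architectures together,
    # shrinking the heterogeneous component relative to the meta component.
    return float((hetero ** 2).mean())
```

Under this sketch, identical features across architectures yield zero loss, and any architecture-specific deviation increases it, which is the qualitative behavior the abstract describes.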