Dataset distillation (DD) aims to minimize the time and memory required to train deep neural networks on large datasets by creating a smaller synthetic dataset that achieves performance comparable to the full real dataset. However, current dataset distillation methods often produce synthetic datasets that are excessively difficult for networks to learn from, because a substantial amount of information from the original data is compressed through metrics that measure feature similarity, e.g., distribution matching (DM). In this work, we introduce conditional mutual information (CMI) to assess the class-aware complexity of a dataset and propose a novel method based on minimizing CMI. Specifically, we minimize the distillation loss while simultaneously constraining the class-aware complexity of the synthetic dataset by minimizing its empirical CMI, computed in the feature space of pre-trained networks. Through a thorough set of experiments, we show that our method serves as a general regularization technique for existing DD methods and improves both their performance and training efficiency.
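As a rough illustration of how a CMI regularizer of this kind might be computed (a minimal sketch, not the paper's implementation; the estimator, the function name `empirical_cmi`, and the weighting hyperparameter `lam` are assumptions), one common empirical estimate treats CMI as the average KL divergence between each sample's class-posterior and the centroid of the posteriors for its class:

```python
import torch
import torch.nn.functional as F

def empirical_cmi(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Empirical estimate of I(X; Y_hat | Y) from a batch of logits.

    For each class c, the centroid q_c is the mean softmax output over samples
    of class c; the estimate is the average KL(p(x) || q_c) over all samples.
    (A standard empirical CMI estimator; details may differ from the paper.)
    """
    probs = F.softmax(logits, dim=1)          # per-sample class posteriors p(y_hat | x)
    cmi = logits.new_zeros(())
    n = labels.numel()
    for c in labels.unique():
        p_c = probs[labels == c]              # posteriors of samples with label c
        q_c = p_c.mean(dim=0, keepdim=True)   # class-conditional centroid
        # KL(p || q_c), summed over the samples of this class
        kl = (p_c * (p_c.clamp_min(1e-12).log() - q_c.clamp_min(1e-12).log())).sum()
        cmi = cmi + kl
    return cmi / n

# Hypothetical usage inside a distillation step: add the CMI term as a
# regularizer on the synthetic data, weighted by a hyperparameter lam, e.g.
#   loss = dd_loss(syn_images, syn_labels) \
#          + lam * empirical_cmi(model(syn_images), syn_labels)
```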