Given a real-world dataset, data condensation (DC) aims to synthesize a significantly smaller dataset that retains enough of the original dataset's knowledge to train models to high performance. Recent works propose to enhance DC with data parameterization, which condenses data into parameterized data containers rather than into pixel space. The intuition behind data parameterization is to encode the features that images share, avoiding additional storage costs. In this paper, we recognize that images share common features hierarchically, owing to the inherent hierarchical structure of the classification system, and that current data parameterization methods overlook this. To better align DC with this hierarchical nature and encourage more efficient information sharing inside data containers, we propose a novel data parameterization architecture, Hierarchical Memory Network (HMN). HMN stores condensed data in a three-tier structure that represents dataset-level, class-level, and instance-level features. Another helpful property of the hierarchical architecture is that HMN keeps the generated images largely independent of one another even though they share information. This independence enables instance-level pruning of HMN, which removes redundant information and further improves performance. We evaluate HMN on four public datasets (SVHN, CIFAR10, CIFAR100, and Tiny-ImageNet) and compare it with eight DC baselines. The evaluation results show that our proposed method outperforms all baselines, even when trained with a batch-based loss that consumes less GPU memory.
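The three-tier idea described above can be illustrated with a small toy container. This is only a sketch under stated assumptions, not the paper's implementation: the class name `HierarchicalMemory`, its methods, and the element-wise sum used to combine tiers are all hypothetical, and a real HMN would presumably use learned memories and a decoder network rather than plain lists.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Vector = List[float]


@dataclass
class HierarchicalMemory:
    """Toy three-tier container (hypothetical names): one dataset-level
    vector, one vector per class, and one vector per stored instance."""
    dataset_mem: Vector
    class_mem: Dict[int, Vector]
    instance_mem: Dict[int, List[Vector]] = field(default_factory=dict)

    def decode(self, label: int, index: int) -> Vector:
        # Assumption: tiers are combined by an element-wise sum. This stands
        # in for whatever learned decoder a real model would use.
        inst = self.instance_mem[label][index]
        cls = self.class_mem[label]
        return [d + c + i for d, c, i in zip(self.dataset_mem, cls, inst)]

    def prune(self, label: int, index: int) -> None:
        # Removing one instance-level memory does not touch the shared tiers,
        # so every other decoded sample stays exactly the same.
        del self.instance_mem[label][index]

    def num_stored_values(self) -> int:
        # Shared tiers are counted once, however many instances reuse them.
        n = len(self.dataset_mem)
        n += sum(len(v) for v in self.class_mem.values())
        n += sum(len(v) for insts in self.instance_mem.values() for v in insts)
        return n


# Small example: two classes, three stored instances in total.
example = HierarchicalMemory(
    dataset_mem=[1.0, 2.0],
    class_mem={0: [0.5, 0.25], 1: [0.25, 0.5]},
    instance_mem={0: [[0.125, 0.0], [0.0, 0.125]], 1: [[0.5, 0.5]]},
)
```

In this toy version, calling `example.prune(0, 0)` shrinks the container while leaving `example.decode(1, 0)` unchanged, which mirrors the independence property that makes instance-level pruning possible.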