Training state-of-the-art (SOTA) deep models often requires extensive data, resulting in substantial training and storage costs. To address these challenges, dataset condensation learns a small synthetic set that preserves the essential information of the original large-scale dataset. Optimization-oriented methods are currently the dominant approach in dataset condensation and achieve SOTA results. However, their bi-level optimization process hinders practical application to realistic, larger datasets. To improve condensation efficiency, previous works proposed Distribution Matching (DM) as an alternative, which significantly reduces the condensation cost. Nonetheless, current DM-based methods still fall short of optimization-oriented methods because they align only the first moment of the distributions. In this paper, we present a novel DM-based method named M3D for dataset condensation by Minimizing the Maximum Mean Discrepancy between feature representations of the synthetic and real images. By embedding their distributions in a reproducing kernel Hilbert space, we align all orders of moments of the real and synthetic image distributions, resulting in a condensed set that generalizes better. Notably, our method even surpasses the SOTA optimization-oriented method IDC on the high-resolution ImageNet dataset. Extensive analysis verifies the effectiveness of the proposed method.
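The contrast between first-moment alignment and kernel-based alignment can be illustrated with a minimal sketch. This is not the M3D implementation: the Gaussian kernel, the `bandwidth` parameter, the biased estimator, and the function names `mmd_squared` and `mean_matching_loss` are all assumptions made for illustration, and the inputs stand in for feature representations that would come from some embedding network.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Pairwise Gaussian kernel between rows of x (n, d) and y (m, d).

    The Gaussian kernel and its bandwidth are illustrative assumptions.
    """
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd_squared(real_feats, syn_feats, bandwidth=1.0):
    """Biased estimate of the squared MMD between two feature sets."""
    k_rr = gaussian_kernel(real_feats, real_feats, bandwidth).mean()
    k_ss = gaussian_kernel(syn_feats, syn_feats, bandwidth).mean()
    k_rs = gaussian_kernel(real_feats, syn_feats, bandwidth).mean()
    return k_rr + k_ss - 2.0 * k_rs

def mean_matching_loss(real_feats, syn_feats):
    """First-moment alignment only: squared distance between feature means."""
    diff = real_feats.mean(axis=0) - syn_feats.mean(axis=0)
    return float(diff @ diff)

# Toy features with identical means but different spread.
real = np.array([[-1.0], [1.0]])
syn = np.array([[-2.0], [2.0]])
print("mean-matching loss:", mean_matching_loss(real, syn))
print("squared MMD:", mmd_squared(real, syn))
```

In this toy case the two sets share the same mean, so the first-moment loss cannot tell them apart, while the kernel-based discrepancy remains positive because it is sensitive to the difference in spread.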