Hierarchical Features Matter: A Deep Exploration of GAN Priors for Improved Dataset Distillation

Dataset distillation is an emerging dataset reduction method that condenses large-scale datasets while maintaining task accuracy. Current methods integrate parameterization techniques to boost synthetic dataset performance by shifting the optimization space from the pixel domain to another, more informative feature domain. However, they restrict distillation to a single fixed optimization space, neglecting the diverse guidance available across different informative latent spaces. To overcome this limitation, we propose a novel parameterization method, dubbed Hierarchical Generative Latent Distillation (H-GLaD), that systematically explores the hierarchical layers within generative adversarial networks (GANs). This allows us to progress from the initial latent space to the final pixel space. In addition, we introduce a novel class-relevant feature distance metric to alleviate the computational burden associated with synthetic dataset evaluation, bridging the gap between synthetic and original datasets. Experimental results demonstrate that the proposed H-GLaD achieves a significant improvement in both same-architecture and cross-architecture performance with equivalent time consumption.
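The idea of moving the optimization variable layer by layer from the initial latent space toward pixel space can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the "generator" is a stack of small random tanh layers rather than a pretrained GAN, a squared-error loss against a fixed target stands in for a real distillation objective, the step-halving rule is a convenience of this sketch, and the class-relevant feature distance metric is not modeled at all.

```python
import math
import random

rng = random.Random(0)

def make_layer(n_out, n_in):
    # Fixed random weights standing in for one pretrained generator layer (assumption).
    return [[rng.uniform(-1, 1) / math.sqrt(n_in) for _ in range(n_in)]
            for _ in range(n_out)]

def forward(layers, z):
    # Push z through the given layers, keeping every activation for backprop.
    acts = [z]
    for W in layers:
        acts.append([math.tanh(sum(w * a for w, a in zip(row, acts[-1])))
                     for row in W])
    return acts

def loss_and_grad(layers, z, target):
    # Squared error of the generated output and its gradient w.r.t. z.
    acts = forward(layers, z)
    out = acts[-1]
    loss = sum((o - t) ** 2 for o, t in zip(out, target))
    grad = [2.0 * (o - t) for o, t in zip(out, target)]
    for W, a_in, a_out in zip(reversed(layers), reversed(acts[:-1]),
                              reversed(acts[1:])):
        delta = [g * (1.0 - y * y) for g, y in zip(grad, a_out)]  # through tanh
        grad = [sum(W[i][j] * delta[i] for i in range(len(W)))
                for j in range(len(a_in))]
    return loss, grad

def optimize(layers, z, target, steps=50, lr=0.1):
    # Gradient descent that only accepts a step if it lowers the loss;
    # otherwise the step size is halved (a choice made for this sketch).
    loss, grad = loss_and_grad(layers, z, target)
    history = [loss]
    for _ in range(steps):
        cand = [zi - lr * gi for zi, gi in zip(z, grad)]
        cand_loss, cand_grad = loss_and_grad(layers, cand, target)
        if cand_loss < loss:
            z, loss, grad = cand, cand_loss, cand_grad
        else:
            lr *= 0.5
        history.append(loss)
    return z, history

# Latent space -> two intermediate feature spaces -> "pixel" space.
dims = [4, 8, 8, 16]
layers = [make_layer(dims[i + 1], dims[i]) for i in range(len(dims) - 1)]
target = [rng.uniform(-1, 1) for _ in range(dims[-1])]

z = [rng.gauss(0, 1) for _ in range(dims[0])]
stage_losses = []  # (loss at stage start, loss at stage end) per space
for k in range(len(layers) + 1):
    # Stage k optimizes the input of layer k; the last stage is pixel space.
    z, history = optimize(layers[k:], z, target)
    stage_losses.append((history[0], history[-1]))
    if k < len(layers):
        # Re-express the current solution one layer closer to pixel space.
        z = forward(layers[k:k + 1], z)[-1]

for k, (start, end) in enumerate(stage_losses):
    print(f"stage {k}: loss {start:.4f} -> {end:.4f}")
```

Because each stage starts from the output of the previous one, the loss at the start of a stage equals the loss at the end of the stage before it, so later spaces refine the earlier result rather than restarting from scratch.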