Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory - 专知论文

会员服务 ·

0

Performer · 缩放 · 数据集 · 蒸馏 · SOFT ·

2023 年 5 月 31 日

Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory

翻译：扩展数据集蒸馏至恒定内存下的ImageNet-1K

Justin Cui,Ruochen Wang,Si Si,Cho-Jui Hsieh

from arxiv, ICLR 2023 submission link: https://openreview.net/forum?id=dN70O8pmW8

Dataset distillation methods aim to compress a large dataset into a small set of synthetic samples, such that when being trained on, competitive performances can be achieved compared to regular training on the entire dataset. Among recently proposed methods, Matching Training Trajectories (MTT) achieves state-of-the-art performance on CIFAR-10/100, while having difficulty scaling to ImageNet-1k dataset due to the large memory requirement when performing unrolled gradient computation through back-propagation. Surprisingly, we show that there exists a procedure to exactly calculate the gradient of the trajectory matching loss with constant GPU memory requirement (irrelevant to the number of unrolled steps). With this finding, the proposed memory-efficient trajectory matching method can easily scale to ImageNet-1K with 6x memory reduction while introducing only around 2% runtime overhead than original MTT. Further, we find that assigning soft labels for synthetic images is crucial for the performance when scaling to larger number of categories (e.g., 1,000) and propose a novel soft label version of trajectory matching that facilities better aligning of model training trajectories on large datasets. The proposed algorithm not only surpasses previous SOTA on ImageNet-1K under extremely low IPCs (Images Per Class), but also for the first time enables us to scale up to 50 IPCs on ImageNet-1K. Our method (TESLA) achieves 27.9% testing accuracy, a remarkable +18.2% margin over prior arts.

翻译：数据集蒸馏方法旨在将大规模数据集压缩为少量合成样本，使得在这些样本上训练后，能够获得与完整数据集常规训练相媲美的性能。在近期提出的方法中，匹配训练轨迹（MTT）在CIFAR-10/100上达到了最先进水平，但由于通过反向传播进行展开梯度计算时所需的大内存，难以扩展至ImageNet-1K数据集。令人惊讶的是，我们发现存在一种方法，能够在恒定GPU内存需求（与展开步数无关）下精确计算轨迹匹配损失的梯度。基于这一发现，所提出的内存高效轨迹匹配方法可轻松扩展到ImageNet-1K，实现6倍内存缩减，同时仅引入约2%的运行时开销（相较于原始MTT）。此外，我们发现为合成图像分配软标签对于扩展到更大类别数量（如1000类）时的性能至关重要，并提出了一种新颖的软标签版本轨迹匹配方法，该方法能更好地在大型数据集上对齐模型训练轨迹。所提出的算法不仅在极低IPC（每类图像数）条件下超越了ImageNet-1K上的先前SOTA，而且首次使我们能够将IPC扩展到50。我们的方法（TESLA）达到了27.9%的测试准确率，较先前技术实现了+18.2%的显著提升。

0

相关内容

Performer

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

46+阅读 · 2020年10月31日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

专知

19+阅读 · 2018年5月31日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

【推荐】深度学习目标检测概览

【推荐】深度学习目标检测概览

机器学习研究会

10+阅读 · 2017年9月1日

Sestrin2/AMPK信号通路调控新生鼠缺氧缺血脑损伤细胞自噬的新机制

国家自然科学基金

0+阅读 · 2015年12月31日

函数数据变换模型及降维方法的研究

国家自然科学基金

1+阅读 · 2015年12月31日

等离子体激励器表面电荷沉积机理与激励器优化设计

国家自然科学基金

0+阅读 · 2014年12月31日

高密度三维封装TSV电迁移可靠性机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

肾小球内皮细胞表达的swiprosin-1在早期糖尿病肾病发病中的作用及机制研究

国家自然科学基金

1+阅读 · 2014年12月31日

Notch3在非酒精性脂肪性肝病SREBPs信号通路中作用机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

氯化胆碱-尿素离子液体中电沉积高耐蚀Ni/Ni-Zn纳米多层膜的研究

国家自然科学基金

0+阅读 · 2013年12月31日

(Mn,Co)3O4阴极的晶面择优取向效应及氧还原反应动力学研究

国家自然科学基金

0+阅读 · 2013年12月31日

面向三维高密度封装的Cu6Sn5单晶UBM的定向合成原理与可靠性研究

国家自然科学基金

0+阅读 · 2013年12月31日

近日节律与代谢疾病

国家自然科学基金

0+阅读 · 2011年12月31日

Reverse Knowledge Distillation: Training a Large Model using a Small One for Retinal Image Matching on Limited Data

Arxiv

0+阅读 · 2023年7月21日

Near-Linear Time Projection onto the $\ell_{1,\infty}$ Ball; Application to Sparse Autoencoders

Arxiv

0+阅读 · 2023年7月19日

Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers

Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers

Arxiv

0+阅读 · 2023年7月19日

Distilling Large Vision-Language Model with Out-of-Distribution Generalizability

Arxiv

0+阅读 · 2023年7月19日

GraphMoco:a Graph Momentum Contrast Model that Using Multimodel Structure Information for Large-scale Binary Function Representation Learning

GraphMoco:a Graph Momentum Contrast Model that Using Multimodel Structure Information for Large-scale Binary Function Representation Learning

Arxiv

0+阅读 · 2023年7月18日

MetAug: Contrastive Learning via Meta Feature Augmentation

Arxiv

10+阅读 · 2022年3月10日

Spatially Consistent Representation Learning

Arxiv

14+阅读 · 2021年3月10日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Convolutional 2D Knowledge Graph Embeddings

Arxiv

29+阅读 · 2018年4月6日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | VOTP：用视频基础模型与最优传输，让离线偏好强化学习只需少量反馈

ICML 2026 | VOTP：用视频基础模型与最优传输，让离线偏好强化学习只需少量反馈

专知会员服务

1+阅读 · 今天13:56

多模态代码智能综述：从视觉输入到可执行代码系统

多模态代码智能综述：从视觉输入到可执行代码系统

专知会员服务

1+阅读 · 今天13:54

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

专知会员服务

3+阅读 · 今天8:18

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

专知会员服务

3+阅读 · 今天7:39

《通用大语言模型：无人机指挥与控制接口》最新40页

《通用大语言模型：无人机指挥与控制接口》最新40页

专知会员服务

9+阅读 · 今天7:33

《通过小型无人机系统将情报能力“作战化”》

《通过小型无人机系统将情报能力“作战化”》

专知会员服务

3+阅读 · 今天7:28

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

专知会员服务

6+阅读 · 今天7:14

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

专知会员服务

18+阅读 · 6月15日

消耗优势：美军的“精确规模化”概念

消耗优势：美军的“精确规模化”概念

专知会员服务

7+阅读 · 6月15日

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

专知会员服务

8+阅读 · 6月15日

《网络空间兵棋推演：挑战、局限性与混合路径》报告

《网络空间兵棋推演：挑战、局限性与混合路径》报告

专知会员服务

8+阅读 · 6月15日

《离线语言支持系统：面向空战战术决策》

《离线语言支持系统：面向空战战术决策》

专知会员服务

8+阅读 · 6月15日

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

专知会员服务

7+阅读 · 6月15日

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

专知会员服务

6+阅读 · 6月14日

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

专知会员服务

6+阅读 · 6月14日

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

46+阅读 · 2020年10月31日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

多模态代码智能综述：从视觉输入到可执行代码系统

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

ICML 2026 | VOTP：用视频基础模型与最优传输，让离线偏好强化学习只需少量反馈

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

相关资讯

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

专知

19+阅读 · 2018年5月31日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

【推荐】深度学习目标检测概览

【推荐】深度学习目标检测概览

机器学习研究会

10+阅读 · 2017年9月1日

相关论文

Reverse Knowledge Distillation: Training a Large Model using a Small One for Retinal Image Matching on Limited Data

Arxiv

0+阅读 · 2023年7月21日

Near-Linear Time Projection onto the $\ell_{1,\infty}$ Ball; Application to Sparse Autoencoders

Arxiv

0+阅读 · 2023年7月19日

Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers

Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers

Arxiv

0+阅读 · 2023年7月19日

Distilling Large Vision-Language Model with Out-of-Distribution Generalizability

Arxiv

0+阅读 · 2023年7月19日

GraphMoco:a Graph Momentum Contrast Model that Using Multimodel Structure Information for Large-scale Binary Function Representation Learning

GraphMoco:a Graph Momentum Contrast Model that Using Multimodel Structure Information for Large-scale Binary Function Representation Learning

Arxiv

0+阅读 · 2023年7月18日

MetAug: Contrastive Learning via Meta Feature Augmentation

Arxiv

10+阅读 · 2022年3月10日

Spatially Consistent Representation Learning

Arxiv

14+阅读 · 2021年3月10日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Convolutional 2D Knowledge Graph Embeddings

Arxiv

29+阅读 · 2018年4月6日

相关基金

Sestrin2/AMPK信号通路调控新生鼠缺氧缺血脑损伤细胞自噬的新机制

国家自然科学基金

0+阅读 · 2015年12月31日

函数数据变换模型及降维方法的研究

国家自然科学基金

1+阅读 · 2015年12月31日

等离子体激励器表面电荷沉积机理与激励器优化设计

国家自然科学基金

0+阅读 · 2014年12月31日

高密度三维封装TSV电迁移可靠性机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

肾小球内皮细胞表达的swiprosin-1在早期糖尿病肾病发病中的作用及机制研究

国家自然科学基金

1+阅读 · 2014年12月31日

Notch3在非酒精性脂肪性肝病SREBPs信号通路中作用机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

氯化胆碱-尿素离子液体中电沉积高耐蚀Ni/Ni-Zn纳米多层膜的研究

国家自然科学基金

0+阅读 · 2013年12月31日

(Mn,Co)3O4阴极的晶面择优取向效应及氧还原反应动力学研究

国家自然科学基金

0+阅读 · 2013年12月31日

面向三维高密度封装的Cu6Sn5单晶UBM的定向合成原理与可靠性研究

国家自然科学基金

0+阅读 · 2013年12月31日

近日节律与代谢疾病

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员