Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation - 专知论文

会员服务 ·

0

训练数据 · 数据集 · Vision · 原点 · MoDELS ·

2023 年 5 月 25 日

Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation

翻译：多样化你的视觉数据集：基于自动扩散增强的方法

Lisa Dunlap,Alyssa Umino,Han Zhang,Jiezhi Yang,Joseph E. Gonzalez,Trevor Darrell

Many fine-grained classification tasks, like rare animal identification, have limited training data and consequently classifiers trained on these datasets often fail to generalize to variations in the domain like changes in weather or location. As such, we explore how natural language descriptions of the domains seen in training data can be used with large vision models trained on diverse pretraining datasets to generate useful variations of the training data. We introduce ALIA (Automated Language-guided Image Augmentation), a method which utilizes large vision and language models to automatically generate natural language descriptions of a dataset's domains and augment the training data via language-guided image editing. To maintain data integrity, a model trained on the original dataset filters out minimal image edits and those which corrupt class-relevant information. The resulting dataset is visually consistent with the original training data and offers significantly enhanced diversity. On fine-grained and cluttered datasets for classification and detection, ALIA surpasses traditional data augmentation and text-to-image generated data by up to 15\%, often even outperforming equivalent additions of real data. Code is avilable at https://github.com/lisadunlap/ALIA.

翻译：许多细粒度分类任务（例如稀有动物识别）的训练数据有限，因此基于这些数据集训练的分类器往往难以泛化到领域变化（如天气或地点改变）。为此，我们探索如何利用训练数据中可见领域的自然语言描述，结合在多样化预训练数据集上训练的大规模视觉模型，生成训练数据的有用变体。我们提出ALIA（自动语言引导图像增强）方法，该方法利用大规模视觉与语言模型自动生成数据集领域的自然语言描述，并通过语言引导的图像编辑增强训练数据。为确保数据完整性，基于原始数据集训练的模型会过滤掉微小图像编辑以及破坏类别相关信息的编辑。生成的图像在视觉上与原始训练数据一致，同时显著提升多样性。在细粒度与杂乱数据集的分类和检测任务中，ALIA 相比传统数据增强和文本生成图像方法提升高达15%，甚至常优于等量真实数据的添加。代码已开源在 https://github.com/lisadunlap/ALIA。

0

相关内容

训练数据

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

128+阅读 · 2022年4月21日

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

图挖掘与多关系学习，亚马逊与CMU-WWW2021教程，附161页ppt

专知会员服务

37+阅读 · 2021年4月20日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

粮食进口来源布局对中国粮食进口波动的影响研究——基于地域季节差异视角

国家自然科学基金

0+阅读 · 2015年12月31日

空气阻尼和残余应力对平面悬空薄膜振动特性的影响研究

国家自然科学基金

0+阅读 · 2015年12月31日

亚麻产量相关性状的全基因组关联分析

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

集成于微流控的生物纳米微粒尺度检测

国家自然科学基金

0+阅读 · 2012年12月31日

TGF-β/Smads信号通路在干细胞移植治疗化疗损伤性卵巢早衰中的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

铁电材料对FePt薄膜垂直磁各向异性调控机理的研究

国家自然科学基金

0+阅读 · 2012年12月31日

Klotho蛋白在缺血再灌注急性肾损伤中的抗氧化应激机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

拟南芥CSN5B与VTC1互作的重要氨基酸对维生素C生物合成的影响

国家自然科学基金

0+阅读 · 2011年12月31日

基于分布式水文模型的流域尺度土壤湿度遥感数据同化研究

国家自然科学基金

0+阅读 · 2009年12月31日

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

Arxiv

0+阅读 · 2023年7月13日

Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models

Arxiv

0+阅读 · 2023年7月13日

Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks

Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks

Arxiv

0+阅读 · 2023年7月13日

Improving 2D Human Pose Estimation across Unseen Camera Views with Synthetic Data

Arxiv

0+阅读 · 2023年7月13日

Sem-CS: Semantic CLIPStyler for Text-Based Image Style Transfer

Arxiv

0+阅读 · 2023年7月12日

Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering

Arxiv

0+阅读 · 2023年7月11日

Exploring Image Augmentations for Siamese Representation Learning with Chest X-Rays

Arxiv

0+阅读 · 2023年7月10日

Semi-supervised Medical Image Segmentation through Dual-task Consistency

Arxiv

14+阅读 · 2020年9月9日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

专知会员服务

0+阅读 · 13分钟前

综述 | 世界动作模型：少做梦，多行动

综述 | 世界动作模型：少做梦，多行动

专知会员服务

0+阅读 · 15分钟前

美以伊冲突：无人机与人工智能的运用

美以伊冲突：无人机与人工智能的运用

专知会员服务

1+阅读 · 27分钟前

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

专知会员服务

1+阅读 · 38分钟前

《特种部队在透明战场中的生存力》最新报告

《特种部队在透明战场中的生存力》最新报告

专知会员服务

1+阅读 · 47分钟前

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

专知会员服务

1+阅读 · 51分钟前

《人工智能生成的零日漏洞：对未来作战的影响》

《人工智能生成的零日漏洞：对未来作战的影响》

专知会员服务

1+阅读 · 55分钟前

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

专知会员服务

1+阅读 · 59分钟前

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

5+阅读 · 6月22日

综述 | 3D场景图：开放挑战与未来方向

综述 | 3D场景图：开放挑战与未来方向

专知会员服务

8+阅读 · 6月22日

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

专知会员服务

6+阅读 · 6月22日

21世纪的无人机战争

21世纪的无人机战争

专知会员服务

4+阅读 · 6月22日

《伊朗与以色列-美国热战及其对数字技术的影响》

《伊朗与以色列-美国热战及其对数字技术的影响》

专知会员服务

5+阅读 · 6月22日

《量子技术的军事任务技术适配与利用》

《量子技术的军事任务技术适配与利用》

专知会员服务

5+阅读 · 6月22日

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

专知会员服务

8+阅读 · 6月22日

相关VIP内容

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

128+阅读 · 2022年4月21日

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

图挖掘与多关系学习，亚马逊与CMU-WWW2021教程，附161页ppt

专知会员服务

37+阅读 · 2021年4月20日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

综述 | 世界动作模型：少做梦，多行动

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

美以伊冲突：无人机与人工智能的运用

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

Arxiv

0+阅读 · 2023年7月13日

Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models

Arxiv

0+阅读 · 2023年7月13日

Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks

Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks

Arxiv

0+阅读 · 2023年7月13日

Improving 2D Human Pose Estimation across Unseen Camera Views with Synthetic Data

Arxiv

0+阅读 · 2023年7月13日

Sem-CS: Semantic CLIPStyler for Text-Based Image Style Transfer

Arxiv

0+阅读 · 2023年7月12日

Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering

Arxiv

0+阅读 · 2023年7月11日

Exploring Image Augmentations for Siamese Representation Learning with Chest X-Rays

Arxiv

0+阅读 · 2023年7月10日

Semi-supervised Medical Image Segmentation through Dual-task Consistency

Arxiv

14+阅读 · 2020年9月9日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

相关基金

粮食进口来源布局对中国粮食进口波动的影响研究——基于地域季节差异视角

国家自然科学基金

0+阅读 · 2015年12月31日

空气阻尼和残余应力对平面悬空薄膜振动特性的影响研究

国家自然科学基金

0+阅读 · 2015年12月31日

亚麻产量相关性状的全基因组关联分析

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

集成于微流控的生物纳米微粒尺度检测

国家自然科学基金

0+阅读 · 2012年12月31日

TGF-β/Smads信号通路在干细胞移植治疗化疗损伤性卵巢早衰中的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

铁电材料对FePt薄膜垂直磁各向异性调控机理的研究

国家自然科学基金

0+阅读 · 2012年12月31日

Klotho蛋白在缺血再灌注急性肾损伤中的抗氧化应激机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

拟南芥CSN5B与VTC1互作的重要氨基酸对维生素C生物合成的影响

国家自然科学基金

0+阅读 · 2011年12月31日

基于分布式水文模型的流域尺度土壤湿度遥感数据同化研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员