L-CAD: Language-based Colorization with Any-level Descriptions - 专知论文

会员服务 ·

0

Color · MoDELS · Extensibility · Guidance · Performer ·

2023 年 5 月 26 日

L-CAD: Language-based Colorization with Any-level Descriptions

翻译：L-CAD：基于任意层级描述的语言驱动图像着色

Zheng Chang,Shuchen Weng,Peixuan Zhang,Yu Li,Si Li,Boxin Shi

Language-based colorization produces plausible and visually pleasing colors under the guidance of user-friendly natural language descriptions. Previous methods implicitly assume that users provide comprehensive color descriptions for most of the objects in the image, which leads to suboptimal performance. In this paper, we propose a unified model to perform language-based colorization with any-level descriptions. We leverage the pretrained cross-modality generative model for its robust language understanding and rich color priors to handle the inherent ambiguity of any-level descriptions. We further design modules to align with input conditions to preserve local spatial structures and prevent the ghosting effect. With the proposed novel sampling strategy, our model achieves instance-aware colorization in diverse and complex scenarios. Extensive experimental results demonstrate our advantages of effectively handling any-level descriptions and outperforming both language-based and automatic colorization methods. The code and pretrained models are available at: https://github.com/changzheng123/L-CAD.

翻译：基于语言的图像着色技术能在用户友好的自然语言描述引导下生成合理且视觉愉悦的色彩。现有方法隐含假设用户会为图像中大多数物体提供完整的颜色描述，导致性能欠佳。本文提出一种统一模型，支持基于任意层级描述的语言驱动着色。我们利用预训练的跨模态生成模型，凭借其强大的语言理解能力和丰富的颜色先验，以处理任意层级描述固有的歧义性。进一步设计模块与输入条件对齐，以保持局部空间结构并避免鬼影效应。通过提出的新型采样策略，我们的模型能在多样复杂场景中实现实例感知着色。大量实验结果表明，本方法能有效处理任意层级描述，在语言驱动和自动着色方法中均表现优异。代码与预训练模型已开源：https://github.com/changzheng123/L-CAD。

0

相关内容

Color

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

46+阅读 · 2020年10月31日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

专知会员服务

65+阅读 · 2020年5月12日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

「Github」多模态机器学习文章阅读列表

「Github」多模态机器学习文章阅读列表

专知

124+阅读 · 2019年8月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

考虑硅通孔热应力的静态时序分析

国家自然科学基金

0+阅读 · 2015年12月31日

VIP中间神经元失抑制效应在MCD痫性放电自限性受损中的作用机制

国家自然科学基金

0+阅读 · 2014年12月31日

空化热流体动力特性与机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

mirPS细胞与内皮祖细胞移植共同促进脑缺血损伤修复

国家自然科学基金

0+阅读 · 2013年12月31日

控释VEGF/NT-3脊髓脱细胞支架在SCI模型中的血管化及神经再生研究

国家自然科学基金

0+阅读 · 2013年12月31日

Cdk5介导的Drp1磷酸化对线粒体的调控及其在阿尔茨海默病中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

MEMRI追踪骨髓源神经干细胞移植后与宿主脑组织功能重建实验研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

Bootstrapping Vision-Language Learning with Decoupled Language Pre-training

Arxiv

0+阅读 · 2023年7月13日

AutoHint: Automatic Prompt Optimization with Hint Generation

Arxiv

0+阅读 · 2023年7月13日

T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation

Arxiv

0+阅读 · 2023年7月12日

Line Art Colorization of Fakemon using Generative Adversarial Neural Networks

Arxiv

0+阅读 · 2023年7月11日

Learning Imbalanced Data with Vision Transformers

Arxiv

11+阅读 · 2023年3月8日

Natural Language Descriptions of Deep Visual Features

Arxiv

12+阅读 · 2022年1月26日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

A Decade Survey of Content Based Image Retrieval using Deep Learning

Arxiv

23+阅读 · 2020年11月23日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

专知会员服务

2+阅读 · 6月23日

综述 | 世界动作模型：少做梦，多行动

综述 | 世界动作模型：少做梦，多行动

专知会员服务

2+阅读 · 6月23日

美以伊冲突：无人机与人工智能的运用

美以伊冲突：无人机与人工智能的运用

专知会员服务

4+阅读 · 6月23日

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

专知会员服务

3+阅读 · 6月23日

《特种部队在透明战场中的生存力》最新报告

《特种部队在透明战场中的生存力》最新报告

专知会员服务

3+阅读 · 6月23日

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

专知会员服务

3+阅读 · 6月23日

《人工智能生成的零日漏洞：对未来作战的影响》

《人工智能生成的零日漏洞：对未来作战的影响》

专知会员服务

3+阅读 · 6月23日

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

专知会员服务

2+阅读 · 6月23日

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

5+阅读 · 6月22日

综述 | 3D场景图：开放挑战与未来方向

综述 | 3D场景图：开放挑战与未来方向

专知会员服务

8+阅读 · 6月22日

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

专知会员服务

7+阅读 · 6月22日

21世纪的无人机战争

21世纪的无人机战争

专知会员服务

4+阅读 · 6月22日

《伊朗与以色列-美国热战及其对数字技术的影响》

《伊朗与以色列-美国热战及其对数字技术的影响》

专知会员服务

5+阅读 · 6月22日

《量子技术的军事任务技术适配与利用》

《量子技术的军事任务技术适配与利用》

专知会员服务

5+阅读 · 6月22日

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

专知会员服务

8+阅读 · 6月22日

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

46+阅读 · 2020年10月31日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

专知会员服务

65+阅读 · 2020年5月12日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

综述 | 世界动作模型：少做梦，多行动

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

美以伊冲突：无人机与人工智能的运用

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

「Github」多模态机器学习文章阅读列表

「Github」多模态机器学习文章阅读列表

专知

124+阅读 · 2019年8月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

相关论文

Bootstrapping Vision-Language Learning with Decoupled Language Pre-training

Arxiv

0+阅读 · 2023年7月13日

AutoHint: Automatic Prompt Optimization with Hint Generation

Arxiv

0+阅读 · 2023年7月13日

T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation

Arxiv

0+阅读 · 2023年7月12日

Line Art Colorization of Fakemon using Generative Adversarial Neural Networks

Arxiv

0+阅读 · 2023年7月11日

Learning Imbalanced Data with Vision Transformers

Arxiv

11+阅读 · 2023年3月8日

Natural Language Descriptions of Deep Visual Features

Arxiv

12+阅读 · 2022年1月26日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

A Decade Survey of Content Based Image Retrieval using Deep Learning

Arxiv

23+阅读 · 2020年11月23日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

相关基金

考虑硅通孔热应力的静态时序分析

国家自然科学基金

0+阅读 · 2015年12月31日

VIP中间神经元失抑制效应在MCD痫性放电自限性受损中的作用机制

国家自然科学基金

0+阅读 · 2014年12月31日

空化热流体动力特性与机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

mirPS细胞与内皮祖细胞移植共同促进脑缺血损伤修复

国家自然科学基金

0+阅读 · 2013年12月31日

控释VEGF/NT-3脊髓脱细胞支架在SCI模型中的血管化及神经再生研究

国家自然科学基金

0+阅读 · 2013年12月31日

Cdk5介导的Drp1磷酸化对线粒体的调控及其在阿尔茨海默病中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

MEMRI追踪骨髓源神经干细胞移植后与宿主脑组织功能重建实验研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员