Painterly Image Harmonization using Diffusion Model

Painterly image harmonization aims to insert photographic objects into paintings and obtain artistically coherent composite images. Previous methods for this task mainly rely on inference optimization or generative adversarial networks, but they are either very time-consuming or struggle with fine control of the foreground objects (e.g., texture and content details). To address these issues, we propose a novel Painterly Harmonization stable Diffusion model (PHDiffusion), which includes a lightweight adaptive encoder and a Dual Encoder Fusion (DEF) module. Specifically, the adaptive encoder and the DEF module first stylize foreground features within each encoder. Then, the stylized foreground features from both encoders are combined to guide the harmonization process. During training, besides the noise loss of the diffusion model, we additionally employ a content loss and two style losses, i.e., an AdaIN style loss and a contrastive style loss, to balance style migration against content preservation. Compared with state-of-the-art models from related fields, our PHDiffusion stylizes the foreground more thoroughly while retaining finer content details. Our code and model are available at https://github.com/bcmi/PHDiffusion-Painterly-Image-Harmonization.
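
To make the AdaIN style loss mentioned in the abstract more concrete, the following is a minimal sketch of how such a loss is commonly defined: it compares the channel-wise mean and standard deviation of two feature maps. This is not the authors' implementation. The function names (`channel_stats`, `adain`, `adain_style_loss`), the `eps` value, and the representation of a feature map as a list of flat per-channel lists are all assumptions made for illustration; the abstract does not say which features the loss is computed on, and a real model would apply it to tensors of deep network features.

```python
import math

def channel_stats(channel, eps=1e-5):
    """Return (mean, std) of one feature channel given as a flat list of floats.

    eps is an assumed small constant that keeps the std away from zero.
    """
    n = len(channel)
    mean = sum(channel) / n
    var = sum((x - mean) ** 2 for x in channel) / n
    return mean, math.sqrt(var + eps)

def adain(content, style):
    """Re-normalize each content channel to the matching style channel's mean and std."""
    out = []
    for c_ch, s_ch in zip(content, style):
        c_mean, c_std = channel_stats(c_ch)
        s_mean, s_std = channel_stats(s_ch)
        out.append([(x - c_mean) / c_std * s_std + s_mean for x in c_ch])
    return out

def adain_style_loss(output, style):
    """Sum over channels of the squared differences in mean and in std."""
    loss = 0.0
    for o_ch, s_ch in zip(output, style):
        o_mean, o_std = channel_stats(o_ch)
        s_mean, s_std = channel_stats(s_ch)
        loss += (o_mean - s_mean) ** 2 + (o_std - s_std) ** 2
    return loss
```

Under these assumptions, features that have been passed through `adain` against a style feature map yield a loss close to zero with respect to that same style map, while unstylized features yield a larger loss. That gap is the signal a training objective of this kind would push down.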

