Despite significant advances in network-based image harmonization, a domain gap remains between typical training pairs and the real-world composites encountered at inference time. Most existing methods are trained to reverse global edits applied to segmented image regions, and such edits fail to capture the lighting inconsistencies between foreground and background that occur in composited images. In this work, we introduce a self-supervised illumination harmonization approach formulated in the intrinsic image domain. First, we estimate a simple global lighting model from mid-level vision representations to generate a rough shading for the foreground region. A network then refines this inferred shading to produce a harmonious re-shading that aligns with the background scene. To match the color appearance of the foreground and background, we draw on ideas from prior harmonization approaches and perform parameterized image edits in the albedo domain. To validate the effectiveness of our approach, we present results on challenging real-world composites and conduct a user study to objectively measure the improvement in realism over state-of-the-art harmonization methods.
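As a rough illustration of the first step described above, the sketch below fits a global lighting model to the background and uses it to shade the foreground. It is not the authors' implementation. The abstract does not specify the form of the lighting model, so the sketch assumes a single directional light plus a constant ambient term under a Lambertian model, takes per-pixel surface normals as the mid-level representation, and assumes a background shading layer is available from an intrinsic decomposition. The function names `fit_global_lighting` and `render_rough_shading` are hypothetical.

```python
import numpy as np


def fit_global_lighting(normals, shading):
    """Least-squares fit of shading ~= normals @ light + ambient.

    Assumed model (not taken from the paper): one directional light
    plus a constant ambient term.
    normals: (N, 3) array of unit surface normals for background pixels.
    shading: (N,) array of grayscale shading values at the same pixels.
    Returns (light, ambient): a (3,) vector and a scalar.
    """
    # Append a column of ones so the ambient term is solved jointly.
    design = np.hstack([normals, np.ones((normals.shape[0], 1))])
    coeffs, *_ = np.linalg.lstsq(design, shading, rcond=None)
    return coeffs[:3], coeffs[3]


def render_rough_shading(normals, light, ambient):
    """Shade foreground normals with the fitted model.

    The result is clipped at zero because shading cannot be negative.
    """
    return np.clip(normals @ light + ambient, 0.0, None)
```

In the pipeline the abstract describes, a rough shading of this kind would then be passed to the refinement network. Under the intrinsic image model, multiplying the final shading by the foreground albedo gives the re-lit foreground.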