Despite significant advances in network-based image harmonization, a domain gap remains between the training pairs typically used and the real-world composites encountered at inference. Most existing methods are trained to reverse global edits applied to segmented image regions, and such edits do not accurately capture the lighting inconsistencies between foreground and background found in composited images. In this work, we introduce a self-supervised illumination harmonization approach formulated in the intrinsic image domain. First, we estimate a simple global lighting model from mid-level vision representations to generate a rough shading for the foreground region. A network then refines this inferred shading to generate a harmonious re-shading that aligns with the background scene. To match the color appearance of the foreground and background, we draw on prior harmonization approaches and perform parameterized image edits in the albedo domain. To validate our approach, we present results on challenging real-world composites and conduct a user study that measures the gain in realism over state-of-the-art harmonization methods.
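As an illustration of the first stage, the sketch below fits a global lighting model to background surface normals and shading, then uses it to produce a rough foreground shading and recompose the image as albedo times shading. It is a minimal sketch under stated assumptions, not the implementation described here: the lighting model is assumed to be a single directional light plus a constant ambient term, fitted by linear least squares, and the function names (`fit_global_lighting`, `rough_shading`, `recompose`) and the synthetic data are hypothetical. The refinement network and the albedo edits are not shown.

```python
import numpy as np

def fit_global_lighting(normals, shading):
    """Least-squares fit of shading ~ normals @ light + ambient.

    Assumption: a directional-plus-ambient model stands in for the
    'simple global lighting model'; the abstract does not specify it.
    normals: (N, 3) unit normals, shading: (N,) grayscale shading.
    """
    design = np.hstack([normals, np.ones((normals.shape[0], 1))])
    params, *_ = np.linalg.lstsq(design, shading, rcond=None)
    return params[:3], params[3]

def rough_shading(normals, light, ambient):
    """Render a coarse shading for the given normals, clipped at zero."""
    return np.clip(normals @ light + ambient, 0.0, None)

def recompose(albedo, shading):
    """Intrinsic recomposition: image = albedo * shading (per pixel)."""
    return albedo * shading[:, None]

# Synthetic background: random unit normals lit by a known light.
rng = np.random.default_rng(0)
bg_normals = rng.normal(size=(500, 3))
bg_normals /= np.linalg.norm(bg_normals, axis=1, keepdims=True)
true_light = np.array([0.3, 0.5, 0.4])
true_ambient = 0.8
bg_shading = bg_normals @ true_light + true_ambient

# Fit the lighting on the background, then shade one foreground pixel.
light, ambient = fit_global_lighting(bg_normals, bg_shading)
fg_normals = np.array([[0.0, 0.0, 1.0]])
fg_albedo = np.array([[0.5, 0.4, 0.3]])
fg_shading = rough_shading(fg_normals, light, ambient)
fg_image = recompose(fg_albedo, fg_shading)
```

In practice the normals and shading would come from estimated mid-level representations rather than synthetic data, and the rough shading would serve only as input to the refinement network.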