Image harmonization resolves the visual inconsistency in composite images by adaptively adjusting foreground pixels, using the background as a reference. Existing methods rely on local color transformation or region matching between foreground and background; they neglect the powerful proximity prior and treat the foreground and the background each as a single whole during harmonization. As a result, their performance remains limited across varied foreground objects and scenes. To address this issue, we propose a novel Global-aware Kernel Network (GKNet) that harmonizes local regions while comprehensively considering long-distance background references. Specifically, GKNet consists of two branches, \ie, harmony kernel prediction and harmony kernel modulation. The former includes a Long-distance Reference Extractor (LRE) to obtain long-distance context and Kernel Prediction Blocks (KPB) to predict multi-level harmony kernels by fusing global information with local features. To this end, a novel Selective Correlation Fusion (SCF) module is proposed to better select relevant long-distance background references for local harmonization. The latter employs the predicted kernels to harmonize foreground regions with both local and global awareness. Extensive experiments demonstrate the superiority of our method over state-of-the-art methods, \eg, it achieves 39.53 dB PSNR, surpassing the best counterpart by +0.78 dB $\uparrow$, and decreases fMSE/MSE by 11.5\%$\downarrow$/6.7\%$\downarrow$ compared with the SoTA method. Code will be available at \href{https://github.com/XintianShen/GKNet}{here}.
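
The kernel modulation step described above, where predicted kernels are applied to harmonize only the foreground, can be illustrated with a minimal NumPy sketch. This is not the GKNet implementation: the function name `apply_pixel_kernels`, the per-pixel kernel layout `(H, W, k, k)`, the odd kernel size, and the edge padding are all assumptions made for illustration, and the kernels here are supplied by hand rather than predicted by a network.

```python
import numpy as np


def apply_pixel_kernels(image, kernels, mask):
    """Filter each pixel with its own k x k kernel and keep the result only in the foreground.

    image:   (H, W, C) float array, the composite image
    kernels: (H, W, k, k) float array, one kernel per pixel (hypothetical layout, k odd)
    mask:    (H, W) float array, 1 for foreground, 0 for background
    """
    h, w, _ = image.shape
    k = kernels.shape[-1]
    pad = k // 2
    # Edge-pad so that border pixels also have a full k x k neighborhood.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    filtered = np.zeros_like(image)
    for dy in range(k):
        for dx in range(k):
            # Neighbor at offset (dy - pad, dx - pad), weighted by each pixel's own kernel entry.
            shifted = padded[dy:dy + h, dx:dx + w, :]
            filtered += kernels[:, :, dy, dx, None] * shifted
    m = mask[:, :, None]
    # Background pixels pass through unchanged; only the foreground is modified.
    return m * filtered + (1.0 - m) * image


# Usage: identity kernels (weight 1 at the center) leave the image untouched.
rng = np.random.default_rng(0)
img = rng.random((4, 5, 3))
identity = np.zeros((4, 5, 3, 3))
identity[:, :, 1, 1] = 1.0
fg_mask = np.zeros((4, 5))
fg_mask[1:3, 1:4] = 1.0
out = apply_pixel_kernels(img, identity, fg_mask)
```

Because the kernel varies per pixel, the same routine can express spatially varying adjustments inside the foreground region, while the mask guarantees that background pixels are never altered.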