Fully supervised shadow removal methods achieve the best restoration quality on public datasets but still leave shadow remnants. One reason is the lack of large-scale paired shadow and shadow-free images. Unsupervised methods alleviate this issue, but their restoration quality is much lower than that of fully supervised methods. In this work, we find that pretraining shadow removal networks on an image inpainting dataset reduces shadow remnants significantly: a naive encoder-decoder network achieves restoration quality competitive with state-of-the-art methods using only 10% of the shadow and shadow-free image pairs. By analyzing networks with and without inpainting pretraining via the information stored in the weights (IIW), we find that inpainting pretraining improves restoration quality in non-shadow regions and significantly enhances the networks' generalization ability. Additionally, shadow removal fine-tuning enables networks to fill in the details of shadow regions. Inspired by these observations, we formulate shadow removal as an adaptive fusion task that takes advantage of both shadow removal and image inpainting. Specifically, we develop an adaptive fusion network consisting of two encoders, an adaptive fusion block, and a decoder. The two encoders extract features from the shadow image and the shadow-masked image, respectively. The adaptive fusion block combines these features adaptively. Finally, the decoder converts the adaptively fused features into the desired shadow-free result. Extensive experiments show that our method, empowered with inpainting, outperforms all state-of-the-art methods.
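The data flow of the adaptive fusion network described above (two encoders, an adaptive fusion block, a decoder) can be illustrated with a toy sketch in plain Python. Everything here is an assumption made for illustration: the abstract does not specify the layers, the gating rule, or how the shadow-masked image is built. The function names (`mask_shadow`, `encode`, `adaptive_fuse`, `decode`, `remove_shadow_sketch`), the identity encoder and decoder, and the mask-driven sigmoid gate are hypothetical stand-ins for learned components, not the authors' implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mask_shadow(image, mask):
    """Blank out shadow pixels (mask value 1) to get a shadow-masked image (assumed construction)."""
    return [[0.0 if m else p for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def encode(image):
    """Stand-in encoder: identity features. A real encoder would be learned."""
    return [list(row) for row in image]


def adaptive_fuse(feat_shadow, feat_masked, mask, gate_strength=2.0):
    """Per-pixel convex blend of the two feature maps.

    Assumed gate: favour shadow-image features inside the shadow and
    masked-image features outside. A real block would learn this weight.
    """
    fused = []
    for fs_row, fm_row, mask_row in zip(feat_shadow, feat_masked, mask):
        fused_row = []
        for fs, fm, m in zip(fs_row, fm_row, mask_row):
            w = sigmoid(gate_strength if m else -gate_strength)  # w lies in (0, 1)
            fused_row.append(w * fs + (1.0 - w) * fm)
        fused.append(fused_row)
    return fused


def decode(features):
    """Stand-in decoder: identity mapping back to pixel space."""
    return [list(row) for row in features]


def remove_shadow_sketch(image, mask):
    """Run both branches, fuse their features, and decode the result."""
    feat_shadow = encode(image)                     # branch 1: shadow image
    feat_masked = encode(mask_shadow(image, mask))  # branch 2: shadow-masked image
    return decode(adaptive_fuse(feat_shadow, feat_masked, mask))


# Tiny 2x2 grayscale example: the right column is in shadow.
image = [[0.8, 0.2], [0.6, 0.1]]
mask = [[0, 1], [0, 1]]
result = remove_shadow_sketch(image, mask)
```

In this toy version, non-shadow pixels pass through unchanged because both branches agree there, while shadow pixels are a weighted mix of the two branches. It does not brighten shadows; it only shows how two feature streams can be combined with a per-pixel weight before decoding.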