Learning Physical-Spatio-Temporal Features for Video Shadow Removal

Shadow removal in a single image has received increasing attention in recent years. However, removing shadows in dynamic scenes remains largely under-explored. In this paper, we propose the first data-driven video shadow removal model, termed PSTNet, which exploits three essential characteristics of video shadows: physical properties, spatial relations, and temporal coherence. Specifically, a dedicated physical branch performs local illumination estimation, which is better suited to scenes with complex lighting and textures, and its physical features are then enhanced via a mask-guided attention strategy. We then develop a progressive aggregation module that enhances the spatial and temporal characteristics of the feature maps and effectively integrates the three kinds of features. Furthermore, to address the lack of paired shadow video datasets, we synthesize a dataset (SVSRD-85) with the aid of the popular game GTAV by toggling its shadow renderer on and off. Experiments against 9 state-of-the-art models, including image shadow removers and image/video restoration methods, show that our method improves on the best of them in terms of RMSE in the shadow area by 14.7. In addition, we develop a lightweight model adaptation strategy to make our model, which is trained on synthetic data, effective in real-world scenes. A visual comparison on the public SBU-TimeLapse dataset verifies the generalization ability of our model in real scenes.
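The shadow-area RMSE reported above can be illustrated with a minimal sketch. The abstract does not specify the evaluation protocol, so everything here is an assumption: the function name `masked_rmse`, the nested-list video layout, and the choice to pool all masked pixels across frames. Shadow removal work often measures this error in the LAB color space; the sketch is color-space agnostic and simply compares whatever channel values it is given.

```python
import math

def masked_rmse(pred, target, mask):
    """Root-mean-square error over the pixels selected by `mask`.

    pred, target: videos as nested lists indexed [frame][row][col],
                  where each pixel is a tuple of channel values.
    mask:         nested lists indexed [frame][row][col] of booleans
                  (True marks a shadow pixel).
    Hypothetical helper for illustration, not the paper's evaluation code.
    """
    sq_sum = 0.0
    count = 0
    for p_frame, t_frame, m_frame in zip(pred, target, mask):
        for p_row, t_row, m_row in zip(p_frame, t_frame, m_frame):
            for p_px, t_px, selected in zip(p_row, t_row, m_row):
                if not selected:
                    continue
                # Accumulate squared error over every channel of this pixel.
                for p, t in zip(p_px, t_px):
                    sq_sum += (p - t) ** 2
                    count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    return math.sqrt(sq_sum / count)

# Toy example: one 2x2 frame whose left column is in shadow.
target = [[[(10, 10, 10), (10, 10, 10)], [(10, 10, 10), (10, 10, 10)]]]
pred = [[[(13, 13, 13), (10, 10, 10)], [(14, 14, 14), (10, 10, 10)]]]
shadow = [[[True, False], [True, False]]]
non_shadow = [[[not m for m in row] for row in frame] for frame in shadow]

shadow_rmse = masked_rmse(pred, target, shadow)
non_shadow_rmse = masked_rmse(pred, target, non_shadow)
```

Reporting the error separately for the shadow and non-shadow regions shows whether a method recovers the shadowed pixels without disturbing the rest of the frame.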

