Deep Optics for Video Snapshot Compressive Imaging

Video snapshot compressive imaging (SCI) aims to capture a sequence of video frames in a single shot of a 2D detector. It relies on two components: optical modulation patterns (also known as masks) and a computational reconstruction algorithm. Advanced deep learning algorithms and mature hardware are bringing video SCI into practical applications. Yet two obstacles remain: i) low dynamic range caused by high temporal multiplexing, and ii) the degradation of existing deep learning algorithms on real systems. To address these challenges, this paper presents a deep optics framework that jointly optimizes the masks and a reconstruction network. Specifically, we first propose a new type of structural mask to realize motion-aware and full-dynamic-range measurement. Exploiting this motion awareness in the measurement domain, we develop an efficient network for video SCI reconstruction, dubbed Res2former, which uses a Transformer to capture long-term temporal dependencies. Moreover, the sensor response is introduced into the forward model of video SCI so that end-to-end training stays close to the real system. Finally, we implement the learned structural masks on a digital micro-mirror device. Experimental results on synthetic and real data validate the effectiveness of the proposed framework. We believe this is a milestone for real-world video SCI. The source code and data are available at https://github.com/pwangcs/DeepOpticsSCI.
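To make the forward model mentioned above concrete, here is a minimal NumPy sketch of the standard video SCI measurement: each frame is modulated element-wise by its mask and the results are summed into one 2D snapshot. The sensor response used here (saturation at a full-scale value of 1.0 followed by uniform quantization) is an assumption chosen for illustration, not the response model from the paper, and the function name `sci_forward` and the parameter `bit_depth` are hypothetical.

```python
import numpy as np


def sci_forward(frames, masks, bit_depth=8):
    """Simulate one snapshot measurement from T frames and T masks.

    frames: array of shape (T, H, W) with intensities in [0, 1]
    masks:  array of shape (T, H, W) with transmission values in [0, 1]
    Returns an (H, W) measurement in [0, 1].
    """
    frames = np.asarray(frames, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must have the same shape")

    # Temporal multiplexing: modulate each frame by its mask, then sum over time.
    measurement = np.sum(frames * masks, axis=0)

    # Assumed sensor response (illustrative only): saturate at full scale 1.0,
    # then quantize uniformly to 2**bit_depth - 1 levels.
    levels = 2 ** bit_depth - 1
    clipped = np.clip(measurement, 0.0, 1.0)
    return np.round(clipped * levels) / levels


# Example: 8 random frames compressed into a single 4x4 snapshot.
rng = np.random.default_rng(0)
T, H, W = 8, 4, 4
frames = rng.random((T, H, W))
binary_masks = (rng.random((T, H, W)) > 0.5).astype(np.float64)
snapshot = sci_forward(frames, binary_masks)
```

With random binary masks, several frames typically contribute to each pixel, so the summed signal often exceeds the assumed full-scale value and is clipped. This illustrates how heavy temporal multiplexing can squeeze the usable dynamic range, which is the first of the two obstacles the abstract names.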

