DemMamba: Alignment-free Raw Video Demoireing with Frequency-assisted Spatio-Temporal Mamba

Moire patterns, which arise from interference between two similar repetitive patterns, frequently appear when images or videos are captured from screens. These patterns vary in color, shape, and location across video frames, which makes it difficult to extract information from adjacent frames and to preserve temporal consistency during restoration. Existing deep learning methods often depend on carefully designed alignment modules, such as optical flow estimation, deformable convolution, and cross-frame self-attention layers, which incur high computational costs. Recent studies indicate that using raw data as input can significantly improve video demoireing, because raw data provides pristine degradation information and more detailed content. However, previous works have not delivered raw video demoireing methods that are both efficient and effective, maintaining temporal consistency while preventing degradation of color and spatial details. This paper introduces DemMamba, a novel alignment-free raw video demoireing network built on frequency-assisted spatio-temporal Mamba. It features sequentially arranged Spatial Mamba Blocks (SMB) and Temporal Mamba Blocks (TMB) to effectively model the inter-frame and intra-frame relationships in raw videos affected by moire patterns. An Adaptive Frequency Block (AFB) within the SMB facilitates demoireing in the frequency domain, while a Channel Attention Block (CAB) in the TMB enhances temporal information interaction by leveraging inter-channel relationships among features. Extensive experiments demonstrate that DemMamba surpasses state-of-the-art methods by 1.3 dB in PSNR and also delivers satisfactory visual quality.
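To put the reported 1.3 dB PSNR margin in perspective, the following is a minimal sketch of the standard PSNR definition, PSNR = 10 log10(peak^2 / MSE). It is not the evaluation code used for the experiments. The synthetic clip, its shape, the noise level, and the helper names `psnr` and `mse_ratio_for_gain` are assumptions introduced purely for illustration. At a fixed peak value, a gain of g dB corresponds to scaling the mean squared error by 10^(-g/10), so 1.3 dB amounts to roughly a 26% reduction in MSE.

```python
import numpy as np


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, peak]."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs: PSNR is unbounded
    return 10.0 * np.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """MSE(new) / MSE(old) implied by a PSNR gain in dB at a fixed peak."""
    return 10.0 ** (-gain_db / 10.0)


# Hypothetical clip (frames, height, width, channels) with values in [0, 1].
rng = np.random.default_rng(0)
clean = rng.random((4, 64, 64, 3))
noise = rng.normal(0.0, 0.05, clean.shape)

# A baseline restoration, and one whose error is scaled so that its MSE
# shrinks by exactly the ratio implied by a 1.3 dB gain.
baseline = clean + noise
improved = clean + noise * np.sqrt(mse_ratio_for_gain(1.3))

gain = psnr(clean, improved) - psnr(clean, baseline)
print(f"MSE ratio for 1.3 dB: {mse_ratio_for_gain(1.3):.4f}")
print(f"PSNR gain: {gain:.2f} dB")
```

Because PSNR is logarithmic in the error, the same 1.3 dB margin implies the same relative MSE reduction regardless of the baseline's absolute PSNR.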
