Atmospheric turbulence severely degrades video quality by introducing geometric warping, blur, and temporal flicker, which compromise both visual clarity and temporal consistency. Current state-of-the-art methods rely on transformer and 3D architectures and require multi-frame input, but their high computational cost and memory usage limit real-time deployment, especially in resource-constrained scenarios. In this work, we propose ReMATF, a lightweight recurrent framework that restores videos using only two frames at a time while preserving spatial detail and temporal stability. ReMATF combines a multi-scale encoder-decoder with temporal warping and a motion-adaptive temporal fusion module that performs per-pixel fusion between the warped previous output and the current prediction, enhancing coherence without enlarging the temporal window. This design reduces flicker, sharpens details, and remains efficient. Experiments on synthetic and real turbulence datasets show consistent improvements in PSNR/SSIM and perceptual quality (LPIPS), along with substantially faster inference than multi-frame transformer baselines, making ReMATF suitable for turbulence mitigation in resource-constrained scenarios.
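
To make the idea of per-pixel fusion between the warped previous output and the current prediction concrete, the following is a minimal sketch under stated assumptions, not the module used in ReMATF. The abstract does not specify how the fusion weights are computed, so the hand-crafted rule here (an exponential of the per-pixel absolute difference) is purely illustrative, and the names `fuse_frames`, `tau`, and `max_history` are hypothetical. Frames are plain nested lists of grayscale values in [0, 1] to keep the example dependency-free; a practical implementation would presumably operate on tensors.

```python
import math


def fuse_frames(warped_prev, current, tau=0.1, max_history=0.5):
    """Blend the warped previous output with the current prediction, per pixel.

    Illustrative rule only (an assumption, not the learned module from the
    paper): the weight given to the previous output decays with the absolute
    difference between the two inputs. Pixels that agree lean on history,
    which damps flicker; pixels that disagree (likely motion or a warping
    error) fall back to the current prediction, which limits ghosting.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    if not 0.0 <= max_history <= 1.0:
        raise ValueError("max_history must lie in [0, 1]")

    fused = []
    for prev_row, cur_row in zip(warped_prev, current):
        row = []
        for p, c in zip(prev_row, cur_row):
            # Weight on the previous output, in (0, max_history].
            w = max_history * math.exp(-abs(c - p) / tau)
            row.append(w * p + (1.0 - w) * c)
        fused.append(row)
    return fused


if __name__ == "__main__":
    # Three pixels: identical, slightly noisy, and strongly changed.
    warped_prev = [[0.50, 0.50, 0.00]]
    current = [[0.50, 0.52, 1.00]]
    print(fuse_frames(warped_prev, current))
```

In this sketch a pixel whose two inputs agree is returned unchanged, a slightly noisy pixel is pulled toward its previous value, and a pixel that changed strongly is taken almost entirely from the current prediction. Because only the previous output and the current prediction are blended, the temporal window does not grow.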