UWFormer: Underwater Image Enhancement via a Semi-Supervised Multi-Scale Transformer

Underwater images often suffer from poor quality, color imbalance, and low contrast because of the complex interaction of light, water, and objects. Although previous underwater enhancement techniques have made significant contributions, two problems remain: (i) current deep learning methods depend on Convolutional Neural Networks (CNNs), which lack multi-scale enhancement and have limited global receptive fields; (ii) paired real-world underwater datasets are scarce, and training on synthetic image pairs risks overfitting. To address these issues, this paper presents UWFormer, a multi-scale Transformer-based network that enhances images at multiple frequencies via semi-supervised learning. Within it, we propose a Nonlinear Frequency-aware Attention mechanism and a Multi-Scale Fusion Feed-forward Network for low-frequency enhancement. We also introduce a semi-supervised training strategy tailored to underwater images, with a Subaqueous Perceptual Loss function for generating reliable pseudo labels. Experiments on full-reference and non-reference underwater benchmarks demonstrate that our method outperforms state-of-the-art methods both quantitatively and in visual quality.
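As a point of reference for the full-reference evaluation mentioned above, the sketch below computes PSNR between a reference image and an enhanced image. The abstract does not name the metrics it uses, so PSNR is an assumption here, chosen only because it is a common full-reference measure. The function name `psnr` and the flat pixel-list input are illustrative, not taken from the paper.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two flat pixel sequences.

    Assumption: PSNR is used only as an example of a full-reference
    metric; the paper's abstract does not state which metrics it reports.
    """
    if len(reference) != len(enhanced) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and have the same length")

    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the enhanced image is closer to the reference. Non-reference benchmarks have no ground-truth image, so a measure of this kind does not apply to them.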

