Joint Perceptual Learning for Enhancement and Object Detection in Underwater Scenarios

Degraded underwater images pose a major challenge for existing algorithms that detect objects of interest. Recently, researchers have attempted to adopt attention mechanisms or composite connections to improve the feature representation of detectors. However, this solution does not eliminate the impact of degradation on image content such as color and texture, and thus achieves only minimal improvements. Another feasible solution for underwater object detection is to develop sophisticated deep architectures that enhance image quality or features. Nevertheless, the visually appealing output of these enhancement modules does not necessarily yield high accuracy for deep detectors. More recently, some multi-task learning methods have jointly learned underwater detection and image enhancement, achieving promising improvements. Typically, however, these methods rely on huge architectures and expensive computation, which makes inference inefficient. Underwater object detection and image enhancement are two interrelated tasks, and leveraging information from both can benefit each of them. Based on these observations, we propose a bilevel optimization formulation for jointly learning underwater object detection and image enhancement, and then unroll it into a dual perception network (DPNet) for the two tasks. DPNet, with one shared module and two task subnets, learns from the two different tasks, seeking a shared representation. The shared representation provides more structural details for image enhancement and rich content information for object detection. Finally, we derive a cooperative training strategy to optimize the parameters of DPNet. Extensive experiments on real-world and synthetic underwater datasets demonstrate that our method produces visually favorable images and achieves higher detection accuracy.
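
As a rough illustration of the "one shared module, two task subnets" layout, the toy sketch below trains a single shared parameter together with two task heads by alternating between an enhancement loss and a detection loss. Everything in it is an assumption made for illustration: the scalar "network", the squared-error losses, the alternating schedule, and the names `forward`, `train`, `total_loss`, `DATA`, and `INIT` are hypothetical and are not taken from DPNet, its bilevel formulation, or its cooperative training strategy.

```python
import random

# Toy data (hypothetical): each sample is (input, clean target, detection target).
DATA = [(x, 2.0 * x, 0.5 * x) for x in (0.25, 0.5, 0.75, 1.0)]
INIT = {"shared": 0.5, "enh": 0.5, "det": 0.5}


def forward(params, x):
    """Compute the shared feature once and feed it to both task heads."""
    h = params["shared"] * x        # shared representation
    enhanced = params["enh"] * h    # enhancement head output
    score = params["det"] * h       # detection head output
    return h, enhanced, score


def total_loss(params, data):
    """Mean of the summed squared errors of both tasks."""
    loss = 0.0
    for x, clean, label in data:
        _, enhanced, score = forward(params, x)
        loss += (enhanced - clean) ** 2 + (score - label) ** 2
    return loss / len(data)


def train(data, steps=4000, lr=0.05, seed=0):
    """Alternate task updates; the shared parameter is updated by both tasks."""
    rng = random.Random(seed)
    params = dict(INIT)
    for step in range(steps):
        x, clean, label = rng.choice(data)
        h, enhanced, score = forward(params, x)
        # Even steps use the enhancement loss, odd steps the detection loss.
        head, err = ("enh", enhanced - clean) if step % 2 == 0 else ("det", score - label)
        grad_head = 2.0 * err * h                    # d(loss)/d(head weight)
        grad_shared = 2.0 * err * params[head] * x   # d(loss)/d(shared weight)
        params[head] -= lr * grad_head
        params["shared"] -= lr * grad_shared
    return params
```

Calling `train(DATA)` and comparing `total_loss` before and after shows the shared parameter settling on a value that serves both heads, which is the intuition behind seeking a shared representation; a real system would replace the scalars with convolutional modules and task-specific losses.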
