Underwater monocular depth estimation is a foundation for tasks such as 3D reconstruction of underwater scenes. However, because of the influence of light and the water medium, underwater images are formed through a distinctive imaging process, which makes it challenging to estimate depth accurately from a single image. Existing methods do not account for the unique characteristics of underwater environments, which leads to inadequate estimation results and limited generalization performance. Furthermore, underwater depth estimation requires extracting and fusing both local and global features, which existing methods have not fully explored. In this paper, we present UMono, an end-to-end learning framework for underwater monocular depth estimation that incorporates the characteristics of the underwater image formation model into the network architecture and effectively utilizes both local and global features of underwater images. Experimental results demonstrate that the proposed method is effective for underwater monocular depth estimation and outperforms existing methods in both quantitative and qualitative analyses.
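To illustrate why the underwater imaging process is tied to depth, the following minimal sketch evaluates the widely used simplified underwater image formation model for a single pixel, I_c = J_c * t_c + B_c * (1 - t_c) with t_c = exp(-beta_c * d). This is an assumption made for illustration and not necessarily the exact formulation used in UMono. The function names, attenuation coefficients, and background light values are hypothetical.

```python
import math

def transmission(distance_m, beta):
    """Per-channel transmission t_c = exp(-beta_c * d) for a camera-to-scene distance d."""
    return tuple(math.exp(-b * distance_m) for b in beta)

def underwater_pixel(scene_rgb, distance_m, beta, backlight_rgb):
    """Simplified formation model: I_c = J_c * t_c + B_c * (1 - t_c)."""
    t = transmission(distance_m, beta)
    return tuple(
        j * tc + b * (1.0 - tc)
        for j, tc, b in zip(scene_rgb, t, backlight_rgb)
    )

# Hypothetical values, chosen only for illustration (RGB order, range 0..1).
BETA = (0.40, 0.10, 0.05)        # assumed attenuation: red fades fastest
BACKLIGHT = (0.05, 0.35, 0.45)   # assumed bluish-green background light
SCENE = (0.8, 0.6, 0.4)          # unattenuated scene radiance J

near = underwater_pixel(SCENE, 1.0, BETA, BACKLIGHT)
far = underwater_pixel(SCENE, 10.0, BETA, BACKLIGHT)
print("near:", near)
print("far: ", far)
```

Under these assumptions, the observed color drifts from the scene radiance toward the background light as distance grows, and the red channel fades fastest. Depth therefore leaves a color-dependent cue in the image, which is one motivation for building formation-model characteristics into a depth network.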