Self-supervised multi-frame methods have recently achieved promising results in depth estimation. However, these methods often suffer from mismatch problems caused by moving objects, which break the static-scene assumption. Additionally, photometric errors can be unfair when computed in high-frequency or low-texture regions of an image. To address these issues, existing approaches rely on additional black-box networks that provide semantic priors to separate moving objects, and they improve the model only at the loss level. We instead propose FlowDepth, in which a Dynamic Motion Flow Module (DMFM) decouples the optical flow through a mechanism-based approach and warps the dynamic regions, thereby solving the mismatch problem. To counter the unfairness of photometric errors in high-frequency and low-texture regions, we apply Depth-Cue-Aware Blur (DCABlur) at the input level and a Cost-Volume sparsity loss at the loss level, respectively. Experimental results on the KITTI and Cityscapes datasets show that our method outperforms state-of-the-art methods.