Unlike in synthetic settings, real-world depth super-resolution (DSR) is challenging because of the structural distortion and edge noise caused by natural degradation in real-world low-resolution (LR) depth maps. These defects produce significant structural inconsistency between the depth map and the RGB guidance, which can confuse the RGB-structure guidance and thereby degrade DSR quality. In this paper, we propose a novel structure flow-guided DSR framework, in which a cross-modality flow map is learned to guide the transfer of RGB-structure information for precise depth upsampling. Specifically, our framework consists of a cross-modality flow-guided upsampling network (CFUNet) and a flow-enhanced pyramid edge attention network (PEANet). CFUNet contains a trilateral self-attention module that combines geometric and semantic correlations for reliable cross-modality flow learning. The learned flow maps are then combined with a grid-sampling mechanism to predict a coarse high-resolution (HR) depth map. PEANet integrates the learned flow map as edge attention into a pyramid network, hierarchically learning edge-focused guidance features for depth edge refinement. Extensive experiments on real and synthetic DSR datasets verify that our approach achieves excellent performance compared with state-of-the-art methods.
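To make the flow-plus-grid-sampling step concrete, the following is a minimal sketch of how a per-pixel flow map could steer the resampling of an LR depth map. It is not the CFUNet implementation. The function names `bilinear_sample` and `flow_guided_upsample`, the convention that flow offsets are expressed in LR pixel units, the half-pixel alignment, and the border clamping are all assumptions made for illustration, and the flow here is written by hand rather than learned. A framework implementation would more likely use a batched operator such as PyTorch's `torch.nn.functional.grid_sample`.

```python
import math


def bilinear_sample(depth, x, y):
    """Bilinearly sample a 2D list `depth` at continuous (x, y).

    Coordinates are clamped to the image border (an assumed padding choice).
    """
    h, w = len(depth), len(depth[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = depth[y0][x0] * (1 - wx) + depth[y0][x1] * wx
    bottom = depth[y1][x0] * (1 - wx) + depth[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def flow_guided_upsample(lr_depth, flow, scale):
    """Upsample `lr_depth` by `scale`, shifting each sampling point by a flow.

    `flow[i][j]` is a hypothetical (dx, dy) offset, in LR pixel units, for the
    HR pixel at row i, column j. With zero flow this reduces to plain bilinear
    upsampling.
    """
    h, w = len(lr_depth), len(lr_depth[0])
    hr_depth = []
    for i in range(h * scale):
        row = []
        for j in range(w * scale):
            # Map the HR pixel centre onto the LR grid (half-pixel alignment).
            base_x = (j + 0.5) / scale - 0.5
            base_y = (i + 0.5) / scale - 0.5
            dx, dy = flow[i][j]
            row.append(bilinear_sample(lr_depth, base_x + dx, base_y + dy))
        hr_depth.append(row)
    return hr_depth


# A 2x2 LR depth map with a vertical depth edge between its two columns.
lr = [[1.0, 3.0], [1.0, 3.0]]

# Zero flow: plain bilinear upsampling blurs the edge.
zero_flow = [[(0.0, 0.0)] * 4 for _ in range(4)]
blurred = flow_guided_upsample(lr, zero_flow, 2)

# Hand-written flow that pushes samples away from the edge, keeping it sharp.
edge_flow = [[(0.0, 0.0), (-0.25, 0.0), (0.25, 0.0), (0.0, 0.0)] for _ in range(4)]
sharpened = flow_guided_upsample(lr, edge_flow, 2)

print(blurred[0])    # [1.0, 1.5, 2.5, 3.0]
print(sharpened[0])  # [1.0, 1.0, 3.0, 3.0]
```

In this toy case the zero-flow result smears the depth edge across two HR pixels, while the offset sampling points recover a sharp step. This is the behaviour that a flow map learned from RGB structure is intended to provide.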