We propose a novel multi-stage depth super-resolution network that progressively reconstructs high-resolution depth maps from explicit and implicit high-frequency features. The former are extracted by an efficient transformer that processes both local and global contexts, while the latter are obtained by projecting color images into the frequency domain. Both are combined with depth features through a fusion strategy within a multi-stage, multi-scale framework. Experiments on the main benchmarks, including NYUv2, Middlebury, DIML and RGBDD, show that our approach outperforms existing methods by a large margin (~20% on NYUv2 and DIML against the contemporary work DADA at 16x upsampling), establishing a new state of the art in guided depth super-resolution.
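To make the idea of projecting a color image into the frequency domain concrete, the following is a minimal sketch of isolating the high-frequency content of a guide image with a 2D FFT. It is an illustration under stated assumptions, not the network's actual implicit feature extractor: the function name `high_frequency_component`, the hard radial mask, and the `cutoff` value are all hypothetical choices made for this example, and the image is assumed to be already converted to a single grayscale channel.

```python
import numpy as np


def high_frequency_component(gray, cutoff=0.1):
    """Return the high-pass filtered version of a 2D image.

    Hypothetical helper for illustration only. `cutoff` is a radial
    frequency in cycles per pixel; every Fourier coefficient at or
    below it (including the DC term) is set to zero.
    """
    gray = np.asarray(gray, dtype=np.float64)
    height, width = gray.shape

    # Project the image into the frequency domain.
    spectrum = np.fft.fft2(gray)

    # Radial frequency of every coefficient, in cycles per pixel.
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Keep only coefficients above the cutoff (a hard high-pass mask).
    mask = radius > cutoff

    # Back to the spatial domain; the mask is symmetric, so the
    # imaginary part is only floating-point residue.
    return np.fft.ifft2(spectrum * mask).real
```

Applied to a guide image, the output retains edges and fine texture while smooth regions are suppressed, which is the kind of cue a guided depth super-resolution model can use to sharpen depth discontinuities. A learned model would typically replace the fixed mask with trainable operations on the spectrum.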