We introduce a novel bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). It comprises two essential components: the localization module (LM) and the reconstruction module (RM), the latter built on our proposed bilateral reference (BiRef). The LM aids object localization using global semantic information. Within the RM, BiRef guides reconstruction: hierarchical patches of the image provide the source reference, and gradient maps serve as the target reference. The two modules together generate the final predicted maps. We also introduce auxiliary gradient supervision to sharpen the focus on regions with finer details. Furthermore, we outline practical training strategies tailored to DIS that improve both map quality and the training process. To validate the general applicability of our approach, we conduct extensive experiments on four tasks, which show that BiRefNet outperforms task-specific state-of-the-art methods across all benchmarks. Our code is available at https://github.com/ZhengPeng7/BiRefNet.
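To make the two references concrete, the following is a minimal sketch, not the BiRefNet implementation. The abstract does not specify how gradient maps are computed or how patches are cut, so it assumes a finite-difference gradient magnitude for the target reference and an even grid split for the source reference. The function names `gradient_map` and `split_patches` are hypothetical, and images are plain nested lists to keep the example dependency-free.

```python
import math


def gradient_map(img):
    """Gradient magnitude of a 2D grayscale image given as a list of rows.

    Assumption: unnormalised finite differences with indices clamped at the
    borders; the abstract does not state which gradient operator is used.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Horizontal and vertical differences between clamped neighbours.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = math.hypot(gx, gy)
    return out


def split_patches(img, n):
    """Split a 2D image into an n x n grid of equal patches, in row-major order.

    Assumption: one level of a patch hierarchy; calling this with several
    values of n would give coarser and finer levels.
    """
    h, w = len(img), len(img[0])
    if h % n or w % n:
        raise ValueError("image height and width must be divisible by n")
    ph, pw = h // n, w // n
    return [
        [row[c * pw:(c + 1) * pw] for row in img[r * ph:(r + 1) * ph]]
        for r in range(n)
        for c in range(n)
    ]


# A 4x4 image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
grad = gradient_map(image)        # responds only around the edge
patches = split_patches(image, 2)  # four 2x2 patches
```

In this toy image the gradient map is non-zero only in the two columns adjacent to the edge, which illustrates why a gradient-based target can draw attention to fine boundaries.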