We introduce a novel bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). It comprises two essential components: the localization module (LM) and the reconstruction module (RM), the latter built on our proposed bilateral reference (BiRef). The LM aids object localization using global semantic information. Within the RM, we use BiRef for the reconstruction process, where hierarchical patches of images provide the source reference and gradient maps serve as the target reference. These components collaborate to generate the final predicted maps. We also introduce auxiliary gradient supervision to enhance focus on regions with finer details. Furthermore, we outline practical training strategies tailored for DIS that improve both map quality and the training process. To validate the general applicability of our approach, we conduct extensive experiments on four tasks, which show that BiRefNet outperforms task-specific cutting-edge methods across all benchmarks. Our code is available at https://github.com/ZhengPeng7/BiRefNet.
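To make the two kinds of reference concrete, the following is a minimal sketch of how a gradient map and a set of hierarchical image patches might be computed. It is not taken from the BiRefNet code: the function names `gradient_map` and `hierarchical_patches`, the finite-difference gradient, and the power-of-two grid layout are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import numpy as np


def gradient_map(gray):
    """Gradient magnitude of a 2-D grayscale image.

    Assumption: plain finite differences via np.gradient; the actual
    framework may derive its gradient maps differently.
    """
    gy, gx = np.gradient(gray.astype(np.float64))  # derivatives along rows, columns
    return np.hypot(gx, gy)


def hierarchical_patches(image, levels=3):
    """Split an image into non-overlapping grids of 1x1, 2x2, 4x4, ... patches.

    Assumption: a power-of-two grid per level; remainder pixels that do not
    fit the grid are dropped.
    """
    h, w = image.shape[:2]
    pyramid = []
    for level in range(levels):
        n = 2 ** level             # grid is n x n at this level
        ph, pw = h // n, w // n    # patch height and width
        patches = [
            image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
            for r in range(n)
            for c in range(n)
        ]
        pyramid.append(patches)
    return pyramid


# Toy input: a bright square on a dark background.
image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0

grad = gradient_map(image)                       # non-zero only near the square's edges
pyramid = hierarchical_patches(image, levels=3)  # 1, 4, and 16 patches per level
```

In this toy setting the gradient map is non-zero only along the square's boundary, which illustrates why such a map could steer supervision toward fine-detail regions, while the patch pyramid exposes the same image at progressively more local scales.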