In many binary segmentation tasks, most CNN-based methods use a U-shaped encoder-decoder network as their basic structure. They overlook two key problems in how the encoder exchanges information with the decoder: the lack of an interference control mechanism between the two, and the failure to account for the differing contributions of the individual encoder levels. In this work, we propose a simple yet general gated network (GateNet) that addresses both problems at once. With the help of multi-level gate units, valuable context information from the encoder can be selectively transmitted to the decoder. In addition, we design a gated dual-branch structure to build cooperation among features of different levels and improve the discriminative ability of the network. Furthermore, we introduce a "Fold" operation to improve atrous convolution, forming a novel folded atrous convolution that can be flexibly embedded in ASPP or DenseASPP to accurately localize foreground objects of various scales. GateNet generalizes easily to many binary segmentation tasks, including general and specific object segmentation and multi-modal segmentation. Without bells and whistles, our network consistently performs favorably against state-of-the-art methods across 10 metrics on 33 datasets spanning 10 binary segmentation tasks.
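To make the idea of selective transmission concrete, the following is a minimal sketch of one gated skip connection, not the implementation used in GateNet. The abstract does not specify the form of the gate, so everything here is an assumption for illustration: the gate is taken to be a single scalar in (0, 1), computed from the concatenated encoder and decoder features by a 1x1 convolution, a sigmoid, and global average pooling. The names `gate_value` and `gated_skip`, and the parameters `weights` and `bias`, are hypothetical.

```python
import math
from typing import List

# A feature map stored as nested lists: [channels][height][width].
Feature = List[List[List[float]]]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gate_value(encoder: Feature, decoder: Feature,
               weights: List[float], bias: float = 0.0) -> float:
    """Return a scalar gate in (0, 1) for one encoder level.

    Assumed form (not taken from the source): a 1x1 convolution over the
    channel-wise concatenation of encoder and decoder features, followed
    by a sigmoid and global average pooling.
    """
    stacked = encoder + decoder  # channel-wise concatenation
    if len(weights) != len(stacked):
        raise ValueError("need one weight per concatenated channel")
    height, width = len(stacked[0]), len(stacked[0][0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            # 1x1 convolution: weighted sum across channels at one pixel.
            response = bias + sum(w * ch[y][x] for w, ch in zip(weights, stacked))
            total += sigmoid(response)
    return total / (height * width)


def gated_skip(encoder: Feature, decoder: Feature,
               weights: List[float], bias: float = 0.0) -> Feature:
    """Scale the encoder feature by its gate before passing it to the decoder."""
    g = gate_value(encoder, decoder, weights, bias)
    return [[[g * v for v in row] for row in channel] for channel in encoder]
```

Under this assumed form, a gate close to 0 suppresses an encoder level that would mostly contribute interference, while a gate close to 1 passes it through almost unchanged. In a trained network, `weights` and `bias` would be learned rather than set by hand.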