TriangleNet: Edge Prior Augmented Network for Semantic Segmentation through Cross-Task Consistency

This paper addresses semantic segmentation, the computer vision task of precise pixel-wise classification. We investigate jointly training models for semantic edge detection and semantic segmentation, an approach that has shown promise. However, the cross-task consistency that multi-task networks learn implicitly is limited. To address this, we propose a novel "decoupled cross-task consistency loss" that explicitly enhances cross-task consistency. Our semantic segmentation network, TriangleNet, achieves a substantial 2.88% improvement over the baseline in mean Intersection over Union (mIoU) on the Cityscapes test set. Notably, TriangleNet reaches 77.4% mIoU at 46.2 FPS on Cityscapes, demonstrating real-time inference at full resolution. With multi-scale inference, performance improves further to 77.8%. TriangleNet also consistently outperforms the baseline on the FloodNet dataset, demonstrating robust generalization. The proposed method underscores the significance of multi-task learning and explicit cross-task consistency enhancement for advancing semantic segmentation, and highlights the potential of multi-task learning in real-time semantic segmentation.
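For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how mean Intersection over Union is commonly computed from a confusion matrix. It is not taken from the paper: the function name `mean_iou`, the `ignore_index=255` convention, and the choice to average only over classes that appear in the prediction or the label are assumptions made for illustration. Benchmark scores such as those on Cityscapes are typically accumulated over the whole dataset rather than per image.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Illustrative mIoU over one pair of label maps (hypothetical helper).

    Assumes every predicted label lies in [0, num_classes).
    """
    pred = np.asarray(pred, dtype=np.int64).ravel()
    target = np.asarray(target, dtype=np.int64).ravel()

    # Drop pixels whose ground-truth label is marked as "ignore".
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection

    # Average only over classes present in the prediction or the label.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


if __name__ == "__main__":
    pred = np.array([[0, 1], [1, 1]])
    target = np.array([[0, 1], [0, 255]])
    print(mean_iou(pred, target, num_classes=3))
```

In the toy example, classes 0 and 1 each have an intersection of 1 pixel and a union of 2 pixels, while class 2 never appears and is excluded from the average.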

