Modeling Multi-Granularity Context Information Flow for Pavement Crack Detection

Crack detection has become an indispensable, interesting, yet challenging task in the computer vision community. In particular, pavement cracks have a highly complex spatial structure, a low-contrast background, and weak spatial continuity, which makes effective crack detection difficult. In this paper, we address these problems by exploiting the context of the cracks and propose an end-to-end deep learning method that models the context information flow. To localize cracks precisely in an image, it is critical to extract and aggregate multi-granularity context effectively, including the fine-grained local context around the cracks (at the spatial level) and the coarse-grained semantics (at the segment level). Concretely, in a Convolutional Neural Network (CNN), the low-level features extracted by the shallow layers represent local information, while the deep layers extract semantic features. A second main insight of this work is that the semantic context should guide the local context features. Based on these insights, the proposed method first applies dilated convolution as the backbone feature extractor to model local context, and then builds a context guidance module that leverages semantic context to guide local feature extraction at multiple stages. To handle label alignment between stages, we apply the Multiple Instance Learning (MIL) strategy to align the high-level features with the low-level ones in the stage-wise context flow. In addition, we release the Bitumen Pavement Crack (BPC) dataset, which is, to the best of our knowledge, the largest, most complex, and most challenging among public crack datasets. Experimental results on the three crack datasets demonstrate that the proposed method performs well and outperforms the current state-of-the-art methods.
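The two building blocks named above, dilated convolution and MIL-style label alignment, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 1-D setting, and the "a coarse cell is positive if any pixel inside it is a crack" rule are assumptions chosen only to show the general ideas in plain Python.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered along one axis by a kernel with the given dilation."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated cross-correlation, CNN-style."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def mil_downsample_mask(mask, factor):
    """Hypothetical MIL-style alignment of a fine mask to a coarser stage.

    Each factor x factor block of pixels is treated as a bag; the bag is
    labeled positive (1) if at least one pixel in it is a crack pixel.
    """
    rows, cols = len(mask), len(mask[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            bag = [mask[i][j]
                   for i in range(r, min(r + factor, rows))
                   for j in range(c, min(c + factor, cols))]
            coarse_row.append(max(bag))
        coarse.append(coarse_row)
    return coarse


if __name__ == "__main__":
    # A 3-tap kernel with dilation 2 covers 5 samples without extra weights.
    print(effective_kernel_size(3, 2))
    print(dilated_conv1d([1, 2, 3, 4, 5], [1, 0, -1], dilation=2))
    fine_mask = [[0, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]]
    print(mil_downsample_mask(fine_mask, 2))
```

The sketch shows why dilation is attractive for thin, weakly continuous structures: the covered span grows with the dilation rate while the number of weights stays fixed. The max-over-bag rule is one common way to express a multiple-instance label, and the paper's actual alignment between stages may differ.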
