Edge detection, a fundamental task in computer vision, has attracted increasing attention, and the advent of deep learning has significantly advanced the field. However, recent deep learning-based methods generally suffer from two significant issues: 1) reliance on large-scale pre-trained weights, and 2) generation of overly thick edges. We construct a U-shaped encoder-decoder model named CPD-Net that addresses both issues simultaneously. To tackle issue 1), we propose a novel cycle pixel difference convolution (CPDC) that effectively integrates edge prior knowledge into modern convolution operations, thereby eliminating the dependence on large-scale pre-trained weights. For issue 2), we construct a multi-scale information enhancement module (MSEM) and a dual residual connection-based (DRC) decoder to strengthen the model's edge localization ability, producing crisp and clean contour maps. Comprehensive experiments on four standard benchmarks demonstrate that our method achieves competitive performance: BSDS500 (ODS=0.813, AC=0.352), NYUD-V2 (ODS=0.760, AC=0.223), BIPED (ODS=0.898, AC=0.426), and CID (ODS=0.59). Our approach provides a novel perspective for addressing these challenges in edge detection.
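To give a flavor of how a pixel difference convolution can embed an edge prior into a convolution operation, the following is a minimal illustrative sketch. It is not the authors' CPDC (whose exact cyclic formulation is not specified here); it shows only the generic idea that the kernel is applied to local intensity differences rather than raw intensities, so uniform regions produce zero response while edges do not. The function name `pixel_difference_conv` is hypothetical.

```python
import numpy as np

def pixel_difference_conv(img, weights):
    """Hypothetical sketch of a 3x3 pixel-difference convolution.

    The kernel acts on the differences between each neighbor and the
    center pixel, not on raw intensities, so any constant region yields
    exactly zero response (an explicit edge prior).
    """
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            diffs = patch - patch[1, 1]          # neighbor-minus-center differences
            out[i, j] = np.sum(weights * diffs)  # ordinary convolution on the diffs
    return out

# A flat region gives exactly zero response; a step edge does not.
flat = np.ones((5, 5))
step = np.zeros((5, 5)); step[:, 3:] = 1.0
w = np.ones((3, 3))
flat_resp = np.abs(pixel_difference_conv(flat, w)).max()  # → 0.0
step_resp = np.abs(pixel_difference_conv(step, w)).max()  # nonzero at the edge
```

In a learned setting, such difference-based kernels can be folded into standard convolutions at training time, which is one reason this family of operators can reduce reliance on large-scale pre-training.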