Constrained by the encoder-decoder architecture, learning-based edge detectors usually struggle to predict edge maps that are both correct and crisp. Motivated by the recent success of the diffusion probabilistic model (DPM), we find that it is especially well suited to accurate and crisp edge detection, since the denoising process is applied directly at the original image size. We therefore propose the first diffusion model for general edge detection, which we call DiffusionEdge. To avoid expensive computation while retaining final performance, we apply the DPM in latent space and enable the classic cross-entropy loss, which is uncertainty-aware at the pixel level, to directly optimize the latent-space parameters in a distillation manner. We also adopt a decoupled architecture to speed up the denoising process and propose a corresponding adaptive Fourier filter to adjust latent features of specific frequencies. With these designs, DiffusionEdge can be trained stably with limited resources and predicts crisp and accurate edge maps with far fewer augmentation strategies. Extensive experiments on four edge detection benchmarks demonstrate the superiority of DiffusionEdge in both correctness and crispness. On the NYUDv2 dataset, compared with the second-best method, we increase the ODS, OIS (without post-processing) and AC by 30.2%, 28.1% and 65.1%, respectively. Code: https://github.com/GuHuangAI/DiffusionEdge.
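The abstract does not spell out the form of the adaptive Fourier filter, so the following is only a minimal sketch of the general idea of adjusting latent features at specific frequencies. It assumes the filter amounts to multiplying each 2D frequency component of a latent feature map by a gain. The function name `fourier_filter`, the `weights` array (which would be learned in a real model), and the tensor shapes are all hypothetical and are not taken from the DiffusionEdge code.

```python
import numpy as np


def fourier_filter(features, weights):
    """Scale each 2D frequency component of `features` by `weights`.

    Hypothetical helper, not the authors' implementation.
    features: real array of shape (C, H, W), e.g. a latent feature map.
    weights:  real array of shape (C, H, W // 2 + 1), one gain per rFFT bin
              (assumed to be learnable parameters in an actual model).
    """
    # Move to the frequency domain over the two spatial axes.
    spectrum = np.fft.rfft2(features, axes=(-2, -1))
    # Amplify or attenuate each frequency bin independently.
    filtered = spectrum * weights
    # Return to the spatial domain at the original resolution.
    return np.fft.irfft2(filtered, s=features.shape[-2:], axes=(-2, -1))


rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 8, 8))

# All-ones gains leave the features unchanged.
identity = np.ones((4, 8, 5))
unchanged = fourier_filter(latent, identity)

# Keeping only the zero-frequency bin replaces each channel by its mean.
dc_only = np.zeros((4, 8, 5))
dc_only[:, 0, 0] = 1.0
smoothed = fourier_filter(latent, dc_only)
```

In this sketch, gains near 1 pass a frequency through and gains near 0 suppress it, so a set of trained gains could, under these assumptions, emphasize the frequency bands that matter for crisp edges.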