Power lines pose a significant safety threat to unmanned aerial vehicles (UAVs) operating at low altitudes. However, detecting power lines in aerial images is challenging because the foreground (i.e., the power lines) occupies only a small fraction of each image while background information dominates. To address this challenge, we propose DUFormer, a semantic segmentation algorithm designed specifically for power line detection in aerial images. We hypothesize that sufficient feature extraction with a convolutional neural network (CNN), which has a strong inductive bias, benefits the training of an efficient Transformer model. To this end, we propose a heavy token encoder responsible for overlapping feature re-mining and tokenization. The encoder comprises a pyramid CNN feature extraction module and a power line feature enhancement module. After sufficient power line features have been extracted, the features are fused, and a Transformer block then performs global modeling. The final segmentation result is obtained by fusing local and global features in the decode head. Additionally, we show the importance of the joint multi-weight loss function for power line segmentation. Experimental results show that the proposed method achieves state-of-the-art performance in power line segmentation on the publicly available TTPLA dataset.
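To illustrate why loss weighting matters when the foreground is tiny, the following is a minimal sketch of one possible weighted joint loss. It is not the loss used by DUFormer: the abstract does not specify the terms of the joint multi-weight loss, so the choice of a foreground-weighted binary cross-entropy plus a Dice term, the function names, and the values of `pos_weight`, `w_bce`, and `w_dice` are all assumptions made for illustration.

```python
import math


def weighted_bce(probs, labels, pos_weight=10.0, eps=1e-7):
    """Binary cross-entropy with foreground pixels up-weighted by pos_weight (assumed value)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum of both masks + smooth)."""
    inter = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def joint_loss(probs, labels, w_bce=1.0, w_dice=1.0, pos_weight=10.0):
    """Hypothetical joint loss: weighted sum of the two terms above."""
    return w_bce * weighted_bce(probs, labels, pos_weight) + w_dice * dice_loss(probs, labels)


# Toy flattened image: one power line pixel among nine background pixels.
labels = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
all_background = [0.01] * 10                 # predicts "no power line" everywhere
good = [0.01] * 4 + [0.9] + [0.01] * 5       # finds the single foreground pixel

# The prediction that misses the thin foreground is penalized more heavily.
print(joint_loss(all_background, labels) > joint_loss(good, labels))
```

In this toy case a predictor that labels everything as background is 90% accurate per pixel, yet the weighted terms still assign it a much larger loss than the prediction that recovers the line, which is the behavior a loss for sparse foreground classes needs.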