Synthetic aperture radar (SAR) imaging is widely used for 24-hour, all-weather Earth observation. However, SAR target classification remains challenging, especially fine-grained classification of aircraft: aircraft in SAR images show large intra-class diversity and high inter-class similarity, and effective samples are scarce and hard to annotate. To address these issues, this article proposes a novel multi-modal self-supervised network (MS-Net) for fine-grained classification of aircraft. First, to fully exploit multi-modal information, a two-sided path feature extraction network (TSFE-N) is constructed to enhance the image features of the target and to obtain domain-knowledge features from the text modality. Second, a contrastive self-supervised learning (CSSL) framework is employed to learn useful label-independent features from unbalanced data, and a similarity perception loss (SPloss) is proposed to avoid network overfitting. Finally, TSFE-N is used as the encoder of CSSL to obtain the classification results. Extensive experiments show that MS-Net effectively reduces the difficulty of classifying similar types of aircraft. Without labels, the proposed algorithm achieves an accuracy of 88.46% on a 17-class aircraft classification task, which is of pioneering significance for fine-grained classification of aircraft in SAR images.
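The abstract does not specify the contrastive objective used inside CSSL, nor does it define SPloss. As a rough illustration of what a contrastive self-supervised objective looks like, the sketch below implements the standard NT-Xent loss (as used in SimCLR) in plain Python. It is a stand-in under stated assumptions, not the paper's SPloss; the names `l2_normalize` and `nt_xent_loss`, the pairing of two augmented views per sample, and the temperature value are all hypothetical choices made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Standard NT-Xent contrastive loss (an assumed stand-in, not SPloss).

    view_a[i] and view_b[i] are embeddings of two augmented views of
    sample i, e.g. outputs of some encoder. No labels are used.
    """
    n = len(view_a)
    z = [l2_normalize(v) for v in view_a + view_b]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same sample
        # Cosine similarity to every other embedding, scaled by temperature.
        logits = {
            j: sum(p * q for p, q in zip(z[i], z[j])) / temperature
            for j in range(2 * n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits.values())
        log_denominator = m + math.log(
            sum(math.exp(value - m) for value in logits.values())
        )
        total += log_denominator - logits[pos]
    return total / (2 * n)


# Usage: matching views should give a lower loss than mismatched views.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss pulls the two views of each sample together and pushes all other embeddings in the batch apart, which is how a contrastive framework can learn label-independent features. How MS-Net modifies or complements such an objective with SPloss is not stated in the abstract.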