Data collected by different modalities can provide a wealth of complementary information: hyperspectral images (HSI) offer rich spectral-spatial properties, synthetic aperture radar (SAR) provides structural information about the Earth's surface, and light detection and ranging (LiDAR) captures ground elevation. A natural idea is therefore to combine multimodal images for refined and accurate land-cover interpretation. Although many efforts have been made toward multi-source remote sensing image classification, three issues remain: 1) indiscriminate feature representation that does not sufficiently consider modal heterogeneity, 2) excessive features and complex computations associated with modeling long-range dependencies, and 3) overfitting caused by sparsely labeled samples. To overcome these barriers, this paper proposes a transformer-based heterogeneously salient graph representation (THSGR) approach. First, a multimodal heterogeneous graph encoder is presented to encode distinctive non-Euclidean structural features from heterogeneous data. Then, a self-attention-free multi-convolutional modulator is designed for effective and efficient long-range dependency modeling. Finally, a mean forward is proposed to avoid overfitting. With these structures, the proposed model is able to bridge modal gaps and obtain a differentiated graph representation at a competitive time cost, even with a small fraction of training samples. Experiments and analyses on three benchmark datasets, with comparisons against various state-of-the-art (SOTA) methods, demonstrate the effectiveness of the proposed approach.