Feature alignment is the primary means of fusing multimodal data. We propose a feature alignment method that fully fuses multimodal information by alternately shifting and expanding the features of different modalities so that they have a consistent representation in the feature space. The proposed method robustly captures high-level interactions between features of different modalities, significantly improving the performance of multimodal learning. We also show that it outperforms other popular multimodal schemes on multiple tasks. Experimental evaluation on the ETT and MIT-BIH-Arrhythmia datasets shows that the proposed method achieves state-of-the-art performance.