Weak objects are common in images and videos from space applications, yet their limited appearance information makes it hard to learn proper representations of them. Inspired by multi-view learning, we develop simple multi-view attentions and treat their outputs as multi-view features. We also propose a multi-view feature high-order fusion method (MHF) to aggregate more accurate and richer features of weak objects. MHF extends the commonly used low-order feature fusion to higher orders, which enhances the model's capacity to capture relevant and complementary information about weak objects. This is achieved by introducing high-order perception of multi-view features and a recursive, task-contribution gated selection of those features. The new operation is highly flexible and customizable, and it is compatible with a variety of multi-view feature representations. We conduct extensive experiments on two newly constructed space science datasets and an open, large-scale satellite video dataset. Serving as a plug-and-play module, MHF significantly improves a range of vision-transformer and convolution-based detection and segmentation models, achieving state-of-the-art accuracy on both tasks across all three datasets. MHF can serve as a new basic module for visual modeling that effectively represents weak objects through multi-view learning. The code will be available at https://github.com/Kingdroper/MHF.
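The abstract does not specify the fusion equations, so the following is only a minimal sketch of one way the contrast between low-order fusion and a recursive gated high-order fusion could be read. Everything in it is an assumption made for illustration: the function names `low_order_fusion` and `high_order_fusion`, the use of plain feature vectors as views, and in particular the scalar sigmoid gate, which is a hypothetical stand-in for a learned task-contribution gate. It is not the authors' implementation.

```python
import math
from typing import List, Sequence

Vector = List[float]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def low_order_fusion(views: Sequence[Vector]) -> Vector:
    """First-order baseline: element-wise sum of the view features."""
    return [sum(vals) for vals in zip(*views)]


def high_order_fusion(views: Sequence[Vector]) -> Vector:
    """Recursive gated fusion (illustrative assumption, not the paper's method).

    Each step multiplies the running state element-wise with the next view,
    scales that product by a scalar gate, and adds it back to the state.
    After k views the state contains interaction terms up to order k.
    """
    if not views:
        raise ValueError("at least one view is required")
    state = list(views[0])
    for view in views[1:]:
        if len(view) != len(state):
            raise ValueError("all views must have the same dimension")
        # Hypothetical gate: sigmoid of the mean element-wise agreement
        # between the running state and the incoming view. A real model
        # would presumably learn this gate from the task loss.
        score = sum(s * v for s, v in zip(state, view)) / len(state)
        gate = sigmoid(score)
        state = [s + gate * s * v for s, v in zip(state, view)]
    return state


if __name__ == "__main__":
    views = [[0.2, 0.5, 0.1], [0.4, 0.3, 0.9], [0.7, 0.1, 0.6]]
    print("low-order :", low_order_fusion(views))
    print("high-order:", high_order_fusion(views))
```

In this sketch the gate shrinks toward 0 when an incoming view disagrees strongly with the running state, so that view contributes little, which is one simple way to picture "gated selection"; a learned gate conditioned on the task would replace the fixed dot-product score.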