Cross-domain few-shot learning (CDFSL) aims to learn from limited training data in a target domain by leveraging prior knowledge transferred from source domains with abundant training samples. CDFSL faces two challenges: transferring knowledge across dissimilar domains and fine-tuning models with limited training data. To address them, we first extend the analysis of loss landscapes from the parameter space to the representation space, which allows us to interpret the transfer and fine-tuning difficulties of CDFSL models simultaneously. We observe that sharp minima in the loss landscape of the representation space yield representations that are hard to transfer and fine-tune. Moreover, existing flatness-based methods have limited generalization ability because the flatness they achieve is only short-range. To enhance transferability and facilitate fine-tuning, we introduce a simple yet effective approach that flattens the minima in the loss landscape over a long range. The approach treats differently normalized representations as minima in the loss landscape and flattens the high-loss region between them by randomly sampling interpolated representations. We implement this method as a new normalization layer that replaces the original one in both CNNs and ViTs. The layer is simple and lightweight, introducing only a minimal number of additional parameters. Experimental results on 8 datasets demonstrate that our approach outperforms state-of-the-art methods in terms of average accuracy, and it improves on the current best approaches by up to 9% on individual datasets. Our code will be released.
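The interpolation idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released layer. Several details are assumptions made only for illustration: the two "differently normalized" representations are taken to be batch-normalized and instance-normalized features, the interpolation weight is drawn uniformly from [0, 1] per sample, a fixed midpoint is used at evaluation time, and the small set of learnable parameters mentioned in the abstract is omitted. The function names `batch_norm`, `instance_norm`, and `interpolated_norm` are hypothetical.

```python
import numpy as np


def batch_norm(x, eps=1e-5):
    # x has shape (N, C, H, W); statistics are shared across the batch
    # and spatial axes, one mean/variance per channel.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def instance_norm(x, eps=1e-5):
    # Statistics are computed per sample and per channel over spatial axes.
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def interpolated_norm(x, rng=None, training=True, eps=1e-5):
    """Randomly interpolate between two normalized views of x.

    Assumption: the two endpoints are batch norm and instance norm, and
    the weight is sampled uniformly; the abstract specifies neither.
    """
    a = batch_norm(x, eps)
    b = instance_norm(x, eps)
    if training:
        rng = np.random.default_rng() if rng is None else rng
        # One interpolation weight per sample, broadcast over C, H, W.
        lam = rng.uniform(0.0, 1.0, size=(x.shape[0], 1, 1, 1))
    else:
        # Assumed evaluation behaviour: use the midpoint deterministically.
        lam = 0.5
    return lam * a + (1.0 - lam) * b


if __name__ == "__main__":
    features = np.random.default_rng(0).normal(size=(4, 3, 5, 5))
    out = interpolated_norm(features, rng=np.random.default_rng(1))
    print(out.shape)
```

Sampling a fresh weight at every training step exposes the downstream layers to points along the whole segment between the two normalized representations, which is the sense in which the region between them is visited during training in this sketch.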