Foundation models encode rich representations that can be adapted to downstream tasks by fine-tuning. However, fine-tuning a model on one data distribution often degrades performance under distribution shifts. Current approaches to robust fine-tuning use hand-crafted regularization techniques to keep the fine-tuned model close to the pretrained model. Yet it is hard to specify in advance which characteristics of the foundation model to preserve and which to adapt during fine-tuning, as this depends on how the pretraining, fine-tuning, and test data distributions relate to each other. We propose AutoFT, a data-driven approach for robust fine-tuning. Given a task, AutoFT searches for a fine-tuning procedure that enhances out-of-distribution (OOD) generalization. Specifically, AutoFT uses bi-level optimization to search for an objective function and hyperparameters that maximize post-adaptation performance on a small OOD validation set. We evaluate AutoFT on nine natural distribution shifts. Our experiments show that AutoFT significantly improves generalization to OOD inputs, outperforming existing robust fine-tuning methods. Notably, AutoFT achieves a new state of the art on the WILDS iWildCam and FMoW benchmarks, outperforming the previous best methods by $6.0\%$ and $1.5\%$, respectively.
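To make the bi-level structure concrete, the following is a minimal toy sketch, not the AutoFT implementation. It assumes a linear model, a single hand-picked family of objectives (mean squared error plus an L2 penalty toward the pretrained weights), and plain random search as the outer optimizer; the names `fine_tune`, `search`, `lam`, and `lr` are hypothetical and chosen only for illustration. The inner level fine-tunes from the pretrained weights under a candidate objective and learning rate, and the outer level keeps the candidate whose fine-tuned model scores best on a small OOD validation set.

```python
import numpy as np


def fine_tune(w_pre, X, y, lam, lr, steps=200):
    """Inner level: gradient descent from w_pre on
    MSE(w) + lam * ||w - w_pre||^2 (an assumed, illustrative objective)."""
    w = w_pre.copy()
    n = len(y)
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / n + 2.0 * lam * (w - w_pre)
        w -= lr * grad
    return w


def mse(w, X, y):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


def search(w_pre, train, ood_val, n_trials=30, seed=0):
    """Outer level: random search over (lam, lr), scored by the
    post-fine-tuning error on the small OOD validation set."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_trials):
        lam = 10.0 ** rng.uniform(-3, 0)   # regularization strength (log-uniform)
        lr = 10.0 ** rng.uniform(-3, -1)   # learning rate (log-uniform)
        w = fine_tune(w_pre, *train, lam=lam, lr=lr)
        score = mse(w, *ood_val)
        if not np.isfinite(score):
            continue  # skip diverged runs
        if best is None or score < best["ood_mse"]:
            best = {"lam": lam, "lr": lr, "ood_mse": score}
    return best


if __name__ == "__main__":
    # Synthetic data, purely for demonstration.
    rng = np.random.default_rng(0)
    d = 5
    w_star = rng.normal(size=d)
    w_pre = w_star + 0.3 * rng.normal(size=d)       # stand-in for pretrained weights
    X_tr = rng.normal(size=(200, d))                # in-distribution training inputs
    y_tr = X_tr @ w_star + 0.5 * rng.normal(size=200)
    X_ood = 2.0 * rng.normal(size=(40, d)) + 1.0    # shifted validation inputs
    y_ood = X_ood @ w_star + 0.5 * rng.normal(size=40)
    print(search(w_pre, (X_tr, y_tr), (X_ood, y_ood)))
```

In this sketch the search space is two scalars; the abstract describes a search over an objective function and hyperparameters, so a faithful version would need a richer parameterization of the objective and could use a more sample-efficient outer optimizer than random search.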