Fine-tuning foundation models often compromises their robustness to distribution shifts. To remedy this, most robust fine-tuning methods aim to preserve the pre-trained features. However, not all pre-trained features are robust, and these methods are largely indifferent to which ones should be preserved. We propose dual risk minimization (DRM), which combines empirical risk minimization with worst-case risk minimization, to better preserve the core features of downstream tasks. In particular, we utilize core-feature descriptions generated by LLMs to induce core-based zero-shot predictions, which then serve as proxies to estimate the worst-case risk. DRM balances two crucial aspects of model robustness, expected performance and worst-case performance, establishing a new state of the art on various real-world benchmarks. DRM significantly improves the out-of-distribution performance of CLIP ViT-L/14@336 on ImageNet (75.9 to 77.1), WILDS-iWildCam (47.1 to 51.8), and WILDS-FMoW (50.7 to 53.1), opening up new avenues for robust fine-tuning. Our code is available at https://github.com/vaynexie/DRM.
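To make the dual objective concrete, here is a minimal sketch of one plausible instantiation of the DRM loss. The function name `dual_risk`, the mixing weight `lam`, and the use of core-based zero-shot predictions as soft targets for the worst-case term are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def dual_risk(model_logits: torch.Tensor,
              core_zero_shot_logits: torch.Tensor,
              labels: torch.Tensor,
              lam: float = 0.5) -> torch.Tensor:
    """Sketch of a dual risk minimization objective (illustrative).

    model_logits:          fine-tuned model's class logits on a batch
    core_zero_shot_logits: zero-shot logits from LLM-generated
                           core-feature descriptions (worst-case proxy)
    labels:                ground-truth class indices
    lam:                   weight trading off the two risks (assumed)
    """
    # Empirical risk: standard cross-entropy on ground-truth labels.
    erm = F.cross_entropy(model_logits, labels)
    # Worst-case risk proxy: align the fine-tuned model with the
    # core-based zero-shot predictions (used as soft targets), so
    # core features remain predictive after fine-tuning.
    core_probs = core_zero_shot_logits.softmax(dim=-1)
    worst = F.cross_entropy(model_logits, core_probs)
    return (1 - lam) * erm + lam * worst
```

With `lam=0` this reduces to plain ERM fine-tuning; increasing `lam` shifts weight toward the worst-case proxy term.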