Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization

Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions: starting from a pre-trained foundation model, they fine-tune the weights on the target task of interest. Thus, the Internet is swarmed by a handful of foundation models fine-tuned on many diverse tasks, yet these individual fine-tunings exist in isolation without benefiting from each other. In our opinion, this is a missed opportunity, as these specialized models contain rich and diverse features. In this paper, we thus propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks. Specifically, we repurpose these auxiliary weights as initializations for multiple parallel fine-tunings on the target task; then, we average all fine-tuned weights to obtain the final model. This recycling strategy aims at maximizing the diversity in weights by leveraging the diversity in auxiliary tasks. Empirically, it improves the state of the art on the reference DomainBed benchmark for out-of-distribution generalization. Looking forward, this work contributes to the emerging paradigm of updatable machine learning where, akin to open-source software development, the community collaborates to reliably update machine learning models. Our code is released at https://github.com/facebookresearch/ModelRatatouille.
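The two steps described in the abstract (parallel fine-tunings on the target task from auxiliary initializations, then uniform weight averaging) can be illustrated with a minimal sketch. This is not the authors' released code: the names `Weights`, `average_weights`, `ratatouille`, and `toy_finetune` are hypothetical, weights are represented as plain dictionaries of float lists rather than real network parameters, and the "fine-tuning" is a toy update that moves weights halfway toward a fixed target.

```python
from typing import Callable, Dict, List

# Hypothetical representation: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def average_weights(models: List[Weights]) -> Weights:
    """Uniformly average models that share parameter names and shapes."""
    if not models:
        raise ValueError("need at least one model to average")
    return {
        name: [sum(vals) / len(models) for vals in zip(*(m[name] for m in models))]
        for name in models[0]
    }


def ratatouille(
    aux_inits: List[Weights],
    finetune_on_target: Callable[[Weights], Weights],
) -> Weights:
    """Fine-tune each auxiliary initialization on the target task
    (independent runs, written sequentially here), then average the results."""
    finetuned = [finetune_on_target(w) for w in aux_inits]
    return average_weights(finetuned)


def toy_finetune(w: Weights) -> Weights:
    """Toy stand-in for target fine-tuning (an assumption for illustration):
    move every weight halfway toward a fixed target value of 1.0."""
    return {name: [v + 0.5 * (1.0 - v) for v in vals] for name, vals in w.items()}


# Two auxiliary fine-tunings of the same (imaginary) foundation model.
aux_a = {"layer": [0.0, 0.0]}
aux_b = {"layer": [2.0, 4.0]}
final = ratatouille([aux_a, aux_b], toy_finetune)
print(final)  # {'layer': [1.0, 1.5]}
```

Averaging is only meaningful here because every run starts from weights that share one architecture and one pre-trained origin, which is the setting the abstract assumes.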

