Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions: starting from a pre-trained foundation model, they fine-tune the weights on the target task of interest. As a result, the Internet is flooded with fine-tunings of a handful of foundation models on many diverse tasks, yet these individual fine-tunings exist in isolation and do not benefit from each other. In our opinion, this is a missed opportunity, as these specialized models contain rich and diverse features. We therefore propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks. Specifically, we repurpose these auxiliary weights as initializations for multiple parallel fine-tunings on the target task; then, we average all fine-tuned weights to obtain the final model. This recycling strategy aims at maximizing the diversity in weights by leveraging the diversity in auxiliary tasks. Empirically, it improves the state of the art on the reference DomainBed benchmark for out-of-distribution generalization. Looking forward, this work contributes to the emerging paradigm of updatable machine learning where, akin to open-source software development, the community collaborates to reliably update machine learning models. Our code is available at https://github.com/facebookresearch/ModelRatatouille.
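The two-step recipe described above (parallel fine-tunings on the target task from auxiliary initializations, followed by uniform weight averaging) can be illustrated with a minimal sketch. It is not the released implementation: the function names `fine_tune`, `average_weights`, and `ratatouille` are hypothetical, the "model" is a toy linear regressor trained by gradient descent, and the auxiliary weights are random vectors standing in for fine-tunings on auxiliary tasks.

```python
import numpy as np


def average_weights(weight_dicts):
    """Uniformly average weight dictionaries that share keys and shapes."""
    keys = weight_dicts[0].keys()
    return {k: np.mean([w[k] for w in weight_dicts], axis=0) for k in keys}


def fine_tune(init, X, y, lr=0.1, steps=50):
    """Toy stand-in for fine-tuning (assumption): gradient descent on a
    linear least-squares model, starting from the given initialization."""
    w = {k: v.copy() for k, v in init.items()}
    for _ in range(steps):
        residual = X @ w["linear"] - y
        grad = X.T @ residual / len(y)
        w["linear"] -= lr * grad
    return w


def ratatouille(aux_inits, X, y):
    """Fine-tune on the target task once per auxiliary initialization,
    then average all fine-tuned weights into a single final model."""
    fine_tuned = [fine_tune(init, X, y) for init in aux_inits]
    return average_weights(fine_tuned)


def mse(w, X, y):
    """Mean squared error of the toy linear model."""
    return float(np.mean((X @ w["linear"] - y) ** 2))


# Synthetic target task (assumption: purely illustrative data).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = X @ np.array([1.0, -2.0, 0.5])

# Random vectors stand in for weights fine-tuned on four auxiliary tasks.
aux_inits = [{"linear": rng.normal(size=3)} for _ in range(4)]
final = ratatouille(aux_inits, X, y)
```

In this toy setting all weights live in the same parameter space by construction; with real networks, the analogous requirement is that every fine-tuning shares the same architecture and descends from the same pre-trained foundation model, as the abstract specifies.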