Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions: starting from a pre-trained foundation model, they fine-tune the weights on the target task of interest. As a result, the Internet is flooded with fine-tunings of a handful of foundation models on many diverse tasks, yet these individual fine-tunings exist in isolation without benefiting from each other. In our opinion, this is a missed opportunity, as these specialized models contain rich and diverse features. In this paper, we thus propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks. Specifically, we repurpose these auxiliary weights as initializations for multiple parallel fine-tunings on the target task; then, we average all fine-tuned weights to obtain the final model. This recycling strategy aims to maximize the diversity in weights by leveraging the diversity in auxiliary tasks. Empirically, it improves the state of the art on the reference DomainBed benchmark for out-of-distribution generalization. Looking forward, this work contributes to the emerging paradigm of updatable machine learning where, akin to open-source software development, the community collaborates to reliably update machine learning models.
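The recycling procedure described above (one target fine-tuning per auxiliary initialization, followed by weight averaging) can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: the names `Weights`, `average_weights`, `ratatouille`, and `toy_fine_tune` are hypothetical, the average is assumed to be uniform, and the toy fine-tuning step merely stands in for real training on the target task.

```python
from typing import Callable, Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def average_weights(models: List[Weights]) -> Weights:
    """Uniformly average parameters across models with identical layouts."""
    if not models:
        raise ValueError("need at least one model to average")
    keys = models[0].keys()
    if any(m.keys() != keys for m in models):
        raise ValueError("all models must share the same parameter names")
    n = len(models)
    return {
        k: [sum(vals) / n for vals in zip(*(m[k] for m in models))]
        for k in keys
    }


def ratatouille(
    auxiliary_inits: List[Weights],
    fine_tune_on_target: Callable[[Weights], Weights],
) -> Weights:
    """Fine-tune on the target task once per auxiliary initialization,
    then average the resulting weights into a single final model."""
    # The runs are independent, so they could be executed in parallel.
    fine_tuned = [fine_tune_on_target(w) for w in auxiliary_inits]
    return average_weights(fine_tuned)


def toy_fine_tune(weights: Weights) -> Weights:
    """Stand-in for target fine-tuning (assumption): move every parameter
    halfway toward the value 1.0."""
    return {k: [0.5 * (x + 1.0) for x in v] for k, v in weights.items()}


# Two auxiliary fine-tunings of the same (toy) foundation model.
aux_a: Weights = {"w": [0.0, 2.0]}
aux_b: Weights = {"w": [2.0, 4.0]}
final_model = ratatouille([aux_a, aux_b], toy_fine_tune)
```

In practice `fine_tune_on_target` would wrap an actual training loop, and the averaged weights would be loaded back into a network with the same architecture as the foundation model. Averaging is only meaningful here because every fine-tuning starts from the same pre-trained weights.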