Advances in pretraining techniques have produced a vast zoo of publicly available pretrained models. How to use these resources effectively to obtain models with robust out-of-distribution generalization on downstream tasks has become a crucial research question. Previous work has focused primarily on identifying the most powerful models in the zoo, and has therefore not fully leveraged the diverse inductive biases it contains. This paper argues that the knowledge contained in weaker models is also valuable, and presents a method that leverages the diversity within the model zoo to improve out-of-distribution generalization. Specifically, we investigate how various pretrained models behave across the different domains of downstream tasks by characterizing the variation in their encoded representations along two dimensions: diversity shift and correlation shift. This characterization enables us to propose a new algorithm for integrating diverse pretrained models, not limited to the strongest ones, to achieve better out-of-distribution generalization. The proposed method achieves state-of-the-art empirical results on a variety of datasets, validating the benefits of utilizing diverse knowledge.