As large-scale training regimes have gained popularity, the use of pretrained models for downstream tasks has become common practice in machine learning. While pretraining has been shown to improve model performance in practice, the transfer of robustness properties from pretraining to downstream tasks remains poorly understood. In this study, we demonstrate that the robustness of a linear predictor on downstream tasks can be constrained by the robustness of its underlying representation, regardless of the protocol used for pretraining. We prove (i) a bound on the loss that holds independently of any downstream task, and (ii) a criterion for robust classification in particular. We validate our theoretical results in practical applications, show how they can be used to calibrate expectations of downstream robustness, and identify when they are useful for optimal transfer learning. Taken together, our results offer an initial step towards characterizing the requirements on the representation function for reliable post-adaptation performance.
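To make the central claim concrete, the following is a minimal numerical sketch of the elementary inequality that motivates it: for a linear head `W` on top of a representation `f`, the change in the predictor's output under an input perturbation is at most the spectral norm of `W` times the change in the representation. This is not the bound proved in the paper, only an illustration of why representation robustness can limit downstream robustness. The encoder `f`, the matrices `A` and `W`, and the perturbation scale are all hypothetical choices made for this example.

```python
import numpy as np


def representation_shift(f, x, x_adv):
    """L2 distance between the representations of a clean and a perturbed input."""
    return float(np.linalg.norm(f(x_adv) - f(x)))


def linear_head_shift_bound(W, f, x, x_adv):
    """Upper bound on ||W f(x_adv) - W f(x)||_2 using the spectral norm of W."""
    spectral_norm = float(np.linalg.norm(W, ord=2))  # largest singular value of W
    return spectral_norm * representation_shift(f, x, x_adv)


rng = np.random.default_rng(0)
A = rng.normal(size=(8, 16))  # hypothetical frozen "pretrained" encoder weights
W = rng.normal(size=(3, 8))   # hypothetical linear head fitted on a downstream task


def f(x):
    """Toy representation function (an assumption, standing in for a pretrained model)."""
    return np.tanh(A @ x)


x = rng.normal(size=16)                  # clean input
x_adv = x + 0.05 * rng.normal(size=16)   # perturbed input

# Actual change in the downstream predictor's output versus the bound.
actual = float(np.linalg.norm(W @ f(x_adv) - W @ f(x)))
bound = linear_head_shift_bound(W, f, x, x_adv)

print(f"actual output shift: {actual:.4f}")
print(f"bound from representation shift: {bound:.4f}")
```

The inequality holds for any choice of `W`, `f`, and perturbation, so a representation that moves little under perturbation limits how much any linear head of bounded norm can move, whatever the pretraining protocol was.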