Efficiently Robustify Pre-trained Models

A recent trend in deep learning has been towards training large-scale models with high parameter counts on big datasets. However, the robustness of such large-scale models in real-world settings remains underexplored. In this work, we first benchmark the performance of these models under different perturbations and datasets that represent real-world shifts, and highlight their degrading performance under these shifts. We then discuss how existing robustification schemes based on full-model fine-tuning might not be a scalable option for very large-scale networks and can also cause them to forget some of their desired characteristics. Finally, we propose a simple and cost-effective method to solve this problem, inspired by the knowledge transfer literature. It involves robustifying smaller models at a lower computation cost and then using them as teachers to tune a fraction of these large-scale networks, reducing the overall computational overhead. We evaluate our proposed method under various vision perturbations, including the ImageNet-C, R, S, and A datasets, and also in transfer learning and zero-shot evaluation setups on different datasets. Benchmark results show that our method induces robustness in these large-scale models efficiently, requiring significantly less time, and also preserves the transfer learning and zero-shot properties of the original model, which none of the existing methods are able to achieve.
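
The teacher-to-student step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss, the temperature, or which fraction of the large network is tuned, so everything here is an assumption for illustration: a standard temperature-softened KL distillation loss, a single linear head standing in for the tunable fraction, and a fixed feature vector standing in for the output of the frozen backbone. The names `softmax`, `distillation_loss`, and `tune_head` are hypothetical, and plain Python is used in place of a deep learning framework.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: the abstract names no specific loss; this is the
    common knowledge-distillation choice, used here as a stand-in.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(t * (math.log(t) - math.log(s))
               for t, s in zip(p_teacher, p_student) if t > 0)


def head_logits(head, features):
    """Logits of a linear head applied to frozen-backbone features."""
    return [sum(w * x for w, x in zip(row, features)) for row in head]


def tune_head(features, head, teacher_logits,
              lr=0.5, steps=200, temperature=2.0):
    """Gradient descent on the head only; the backbone is never touched.

    `features` plays the role of the frozen large model's output and
    `head` the small fraction of parameters that is tuned (hypothetical).
    """
    head = [row[:] for row in head]  # work on a copy
    p_teacher = softmax(teacher_logits, temperature)
    for _ in range(steps):
        p_student = softmax(head_logits(head, features), temperature)
        for k, row in enumerate(head):
            # d KL / d logit_k = (p_student_k - p_teacher_k) / temperature
            grad = (p_student[k] - p_teacher[k]) / temperature
            for j in range(len(row)):
                row[j] -= lr * grad * features[j]
    return head


# Toy usage: one feature vector, three classes, zero-initialised head.
features = [1.0, -0.5, 2.0]
teacher_logits = [2.0, 0.5, -1.0]  # would come from the robust small model
initial_head = [[0.0] * 3 for _ in range(3)]
tuned_head = tune_head(features, initial_head, teacher_logits)
```

In a real setup the teacher logits would be produced by the robustified small model on each training batch, and the tuned fraction would be whatever subset of the large network's parameters is left trainable. The sketch only shows that the student can match the teacher while the remaining parameters stay fixed.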
