Fine-tuning large-scale pre-trained language models has proven effective for a wide range of natural language processing (NLP) tasks. Previous studies have established that incorporating adversarial training during the fine-tuning stage can significantly enhance model generalization and robustness. From a game-theoretic perspective, however, these applications of adversarial training correspond to pure-strategy games, whose strategy space is inherently limited, which leaves room for improvement. To push the performance boundary, we propose a novel Mixed-strategy Adversarial Training algorithm (MAT). Methodologically, we use Entropy Mirror Descent to derive the Nash equilibrium of a mixed-strategy game for adversarial training, and we establish MAT through a sampling method. To verify the effectiveness of MAT, we conduct extensive benchmark experiments on large-scale pre-trained models such as BERT and RoBERTa. MAT significantly outperforms state-of-the-art methods on both the GLUE and ANLI benchmarks in terms of generalization and robustness.
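To make the distinction between pure and mixed strategies concrete, the following is a minimal sketch of entropy mirror descent (multiplicative-weights updates on the probability simplex) applied to a toy two-player zero-sum matrix game. It is not the MAT algorithm itself: the payoff matrix `PAYOFF`, the step size, the iteration count, and the function names `entropy_mirror_descent` and `exploitability` are all assumptions chosen for illustration. The game has no pure-strategy saddle point, yet the time-averaged mixed strategies approach its Nash equilibrium.

```python
import numpy as np


def entropy_mirror_descent(payoff, steps=20000, lr=0.01):
    """Approximate a mixed-strategy Nash equilibrium of a zero-sum matrix game.

    payoff[i, j] is the loss of the row player (minimizer) and the gain of the
    column player (maximizer) when they play actions i and j.
    Returns the time-averaged strategies of both players.
    """
    n_rows, n_cols = payoff.shape
    p = np.full(n_rows, 1.0 / n_rows)  # row player's mixed strategy
    q = np.full(n_cols, 1.0 / n_cols)  # column player's mixed strategy
    p_avg = np.zeros(n_rows)
    q_avg = np.zeros(n_cols)

    for _ in range(steps):
        p_avg += p
        q_avg += q
        grad_p = payoff @ q    # expected loss of each row action
        grad_q = payoff.T @ p  # expected gain of each column action
        # Mirror step under the negative-entropy mirror map: an exponentiated
        # gradient update followed by renormalization onto the simplex.
        p = p * np.exp(-lr * grad_p)
        q = q * np.exp(lr * grad_q)
        p /= p.sum()
        q /= q.sum()

    return p_avg / steps, q_avg / steps


def exploitability(payoff, p, q):
    """Duality gap: best-response gain against p minus best-response loss against q."""
    return float((p @ payoff).max() - (payoff @ q).min())


# Hypothetical toy game (an assumption for illustration only).
# Solving the indifference conditions by hand gives the equilibrium
# p* = q* = (0.4, 0.6) with game value 0.2.
PAYOFF = np.array([[2.0, -1.0],
                   [-1.0, 1.0]])

p_bar, q_bar = entropy_mirror_descent(PAYOFF)
game_value = float(p_bar @ PAYOFF @ q_bar)
gap = exploitability(PAYOFF, p_bar, q_bar)

# Best guarantees available to each player with pure strategies only.
pure_minimax = float(PAYOFF.max(axis=1).min())  # row player's guarantee
pure_maximin = float(PAYOFF.min(axis=0).max())  # column player's guarantee

print("averaged row strategy:   ", p_bar)
print("averaged column strategy:", q_bar)
print("mixed-strategy value:    ", game_value)
print("duality gap:             ", gap)
print("pure minimax / maximin:  ", pure_minimax, pESCAPE := pure_maximin)
```

In this toy game the pure-strategy guarantees of the two players differ (1 for the minimizer versus -1 for the maximizer), so no pure-strategy equilibrium exists, whereas the averaged mixed strategies close that gap. The sketch averages the iterates because the individual iterates of simultaneous multiplicative-weights updates are not guaranteed to converge on their own.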