The widespread use of pre-trained language models (PLMs) in natural language processing (NLP) has greatly improved performance. However, these models remain vulnerable to adversarial attacks (e.g., camouflaged hints from drug dealers), and this vulnerability raises serious concerns, particularly for Chinese, with its rich character diversity and variation and its complex structures. In this study, we propose a novel method, CHinese vAriatioN Graph Enhancement (CHANGE), to increase the robustness of PLMs against character variation attacks in Chinese content. CHANGE incorporates a Chinese character variation graph into PLMs. By designing several supplementary tasks that exploit the graph structure, CHANGE improves PLMs' interpretation of adversarially manipulated text. Experiments across multiple NLP tasks show that CHANGE outperforms current language models in resisting adversarial attacks. These findings contribute to the groundwork on robust language models and highlight the substantial potential of graph-guided pre-training strategies for real-world applications.