Contrastive learning has proven to be an effective method for pre-training models on weakly labeled data in the vision domain. Sentence transformers are the NLP counterparts of this architecture and have grown in popularity because of their rich and effective sentence representations. Effective sentence representations are paramount in many tasks, such as information retrieval, retrieval-augmented generation (RAG), and sentence comparison. With the deployment of transformers in mind, evaluating the robustness of sentence transformers is of utmost importance. This work evaluates the robustness of sentence encoders using several adversarial attacks: character-level attacks in the form of random character substitution, word-level attacks in the form of synonym replacement, and sentence-level attacks in the form of intra-sentence word-order shuffling. The experimental results cast serious doubt on the robustness of sentence encoders. The models produce significantly different predictions and embeddings on perturbed datasets, and their accuracy can fall by up to 15 percent relative to unperturbed datasets. Furthermore, the experiments demonstrate that these embeddings do capture the semantic and syntactic structure (word order) of sentences. However, existing supervised classification strategies fail to leverage this information and merely function as n-gram detectors.
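To make the three perturbation levels concrete, the following is a minimal sketch of how such attacks could be implemented. It is not the authors' code: the function names, the `rate` parameter, and the toy `SYNONYMS` table are assumptions introduced for illustration, and a real synonym attack would draw on a proper lexical resource rather than a hand-written dictionary.

```python
import random
import string

# Toy synonym table (hypothetical, for illustration only).
SYNONYMS = {"quick": "fast", "happy": "glad", "big": "large"}


def substitute_chars(sentence, rate, rng):
    """Character-level attack: replace each letter with a random
    lowercase letter with probability `rate`."""
    out = []
    for ch in sentence:
        if ch.isalpha() and rng.random() < rate:
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)
    return "".join(out)


def replace_synonyms(sentence, synonyms=SYNONYMS):
    """Word-level attack: swap each word found in the table for its synonym."""
    return " ".join(synonyms.get(w.lower(), w) for w in sentence.split())


def shuffle_words(sentence, rng):
    """Sentence-level attack: shuffle word order within one sentence."""
    words = sentence.split()
    rng.shuffle(words)
    return " ".join(words)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so perturbations are reproducible
    text = "the quick dog is happy"
    print(substitute_chars(text, 0.2, rng))
    print(replace_synonyms(text))
    print(shuffle_words(text, rng))
```

Each function preserves the bag of characters or words as far as its attack level allows, so a perturbed dataset built this way differs from the original in only one controlled respect, which is what a robustness comparison of this kind requires.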