Learning the Legibility of Visual Text Perturbations

Many adversarial attacks in NLP perturb inputs to produce visually similar strings ('ergo' $\rightarrow$ '$\epsilon$rgo') that remain legible to humans but degrade model performance. Although preserving legibility is a necessary condition for text perturbation, little work has been done to characterize it systematically; instead, legibility is typically enforced loosely, through intuitions about the nature and extent of perturbations. In particular, it is unclear to what extent inputs can be perturbed while preserving legibility, or how to quantify the legibility of a perturbed string. In this work, we address this gap by learning models that predict the legibility of a perturbed string and rank candidate perturbations by their legibility. To do so, we collect and release LEGIT, a human-annotated dataset of legibility judgments for visually perturbed text. Using this dataset, we build both text- and vision-based models that achieve up to $0.91$ F1 score in predicting whether an input is legible, and an accuracy of $0.86$ in predicting which of two given perturbations is more legible. Additionally, we find that legible perturbations from the LEGIT dataset are more effective at lowering the performance of NLP models than best-known attack strategies, suggesting that current models may be vulnerable to a broad range of perturbations beyond what is captured by existing visual attacks. Data, code, and models are available at https://github.com/dvsth/learning-legibility-2023.
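
As a minimal illustration of the kind of visual perturbation described above, the sketch below swaps characters for lookalike Unicode characters. It is not the procedure used to build LEGIT: the `HOMOGLYPHS` table, the `perturb` function, and the `rate` parameter are hypothetical names introduced here for illustration only.

```python
import random

# Hypothetical lookalike table for illustration; not taken from LEGIT.
HOMOGLYPHS = {
    "a": "\u03b1",  # Greek small letter alpha
    "e": "\u03b5",  # Greek small letter epsilon
    "o": "\u03bf",  # Greek small letter omicron
    "i": "\u0456",  # Cyrillic small letter Byelorussian-Ukrainian i
}


def perturb(word: str, rate: float, rng: random.Random) -> str:
    """Replace each mappable character with its lookalike with probability `rate`."""
    return "".join(
        HOMOGLYPHS[ch] if ch in HOMOGLYPHS and rng.random() < rate else ch
        for ch in word
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # With rate=1.0 every mappable character is replaced.
    print(perturb("ergo", 1.0, rng))
```

Raising `rate` replaces more characters and so, presumably, makes the string harder to read; a legibility model of the kind described in the abstract would be what decides whether a given candidate is still legible.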

