Watermarking has been proposed as a lightweight mechanism for identifying AI-generated text, with schemes typically relying on perturbations to token distributions. Prior work shows that paraphrasing can weaken such signals, but these attacks either leave the watermark partially detectable or degrade text quality. We demonstrate that cross-lingual summarization attacks (CLSA), which translate text to a pivot language, summarize it, and optionally back-translate it, constitute a qualitatively stronger attack vector. By forcing content through a semantic bottleneck across languages, CLSA systematically destroys token-level statistical biases while preserving semantic fidelity. In experiments across multiple watermarking schemes (KGW, SIR, XSIR, Unigram) and five languages (Amharic, Chinese, Hindi, Spanish, Swahili), we show that CLSA reduces watermark detection accuracy more effectively than monolingual paraphrasing at similar quality levels. On 300 held-out samples per language, CLSA consistently drives detection toward chance while preserving task utility. Concretely, for XSIR, which is explicitly designed for cross-lingual robustness, AUROC is $0.827$ under paraphrasing and $0.823$ under the Cross-Lingual Watermark Removal Attack (CWRA) [He et al., 2024] with Chinese as the pivot, whereas CLSA drives it down to $0.53$ (near chance). Our results highlight an underexplored vulnerability that challenges the practicality of watermarking for provenance or regulation: a practical, low-cost removal pathway that crosses languages and compresses content without visible artifacts. We argue that robust provenance solutions must move beyond distributional watermarking and incorporate cryptographic or model-attestation approaches.
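The attack pipeline and the detection metric described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `translate` and `summarize` callables are hypothetical placeholders for whatever translation and summarization models are used, and `auroc` is a plain pairwise-ranking estimate computed over detector scores.

```python
from typing import Callable, Optional, Sequence

# Hypothetical interfaces (assumptions, not from the paper):
#   translate(text, target_lang) -> translated text
#   summarize(text) -> shorter text in the same language
Translate = Callable[[str, str], str]
Summarize = Callable[[str], str]


def clsa(
    text: str,
    pivot_lang: str,
    translate: Translate,
    summarize: Summarize,
    source_lang: Optional[str] = None,
) -> str:
    """Translate to a pivot language, summarize, optionally back-translate."""
    pivoted = translate(text, pivot_lang)
    summary = summarize(pivoted)
    if source_lang is not None:
        # Optional back-translation into the original language.
        return translate(summary, source_lang)
    return summary


def auroc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Probability that a watermarked sample outscores an unwatermarked one.

    Ties count as half a win, so a detector that cannot separate the two
    groups scores 0.5 (chance).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

In use, the detector would score attacked watermarked texts (`pos_scores`) and unwatermarked texts (`neg_scores`); an `auroc` value close to 0.5 after `clsa` is what "near chance" means in the results above.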