We propose dgMARK, a decoding-guided watermarking method for discrete diffusion language models (dLLMs). Unlike autoregressive models, dLLMs can generate tokens in arbitrary order. While an ideal conditional predictor would be invariant to this order, practical dLLMs exhibit strong sensitivity to the unmasking order, creating a new channel for watermarking. dgMARK steers the unmasking order toward positions whose high-reward candidate tokens satisfy a simple parity constraint induced by a binary hash, without explicitly reweighting the model's learned probabilities. The method is plug-and-play with common decoding strategies (e.g., confidence, entropy, and margin-based ordering) and can be strengthened with a one-step lookahead variant. Watermarks are detected via elevated parity-matching statistics, and a sliding-window detector ensures robustness under post-editing operations including insertion, deletion, substitution, and paraphrasing.
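The core mechanism can be illustrated with a minimal sketch. The snippet below is illustrative only, not the paper's implementation: the keyed-hash functions, the toy order-sensitive "model" (`toy_predict`), the vocabulary size, and the confidence tie-break are all placeholder assumptions. It shows the two ingredients the abstract describes: a decoder that prefers to unmask positions whose current best candidate token already satisfies a key-derived parity bit, and a detector that scores the resulting parity-match rate against the 50% null rate.

```python
import hashlib
import math

KEY = b"demo-secret-key"   # hypothetical shared watermark key (illustrative)
VOCAB = 64                 # toy vocabulary size
N = 32                     # toy sequence length

def _keyed_hash(*parts: bytes) -> int:
    """Keyed SHA-256 hash -> integer; drives parity bits and the toy model."""
    return int.from_bytes(hashlib.sha256(KEY + b"|".join(parts)).digest(), "big")

def target_bit(pos: int) -> int:
    """Key-derived parity bit the watermark wants position `pos` to carry."""
    return _keyed_hash(b"target", str(pos).encode()) & 1

def token_bit(tok: int) -> int:
    """Parity bit assigned to a token id by the keyed hash."""
    return _keyed_hash(b"token", str(tok).encode()) & 1

def toy_predict(context: dict, pos: int) -> int:
    """Stand-in for a dLLM's conditional argmax at `pos`: deterministic but
    order-sensitive, since it depends on which tokens are already unmasked."""
    ctx = ",".join(f"{p}:{t}" for p, t in sorted(context.items()))
    return _keyed_hash(b"model", ctx.encode(), str(pos).encode()) % VOCAB

def decode(watermark: bool) -> list:
    """Unmask all N positions one at a time. With `watermark=True`, prefer
    positions whose current argmax token already matches its target parity;
    the model's predictions themselves are never reweighted."""
    context, masked = {}, list(range(N))
    while masked:
        preds = {p: toy_predict(context, p) for p in masked}
        if watermark:
            matching = [p for p in masked if token_bit(preds[p]) == target_bit(p)]
            pool = matching or masked   # fall back when no position matches
        else:
            pool = masked
        pos = pool[0]                   # toy tie-break standing in for confidence
        context[pos] = preds[pos]
        masked.remove(pos)
    return [context[p] for p in range(N)]

def detect(tokens: list) -> tuple:
    """Parity-match rate and one-sided z-score against the null rate 0.5."""
    matches = sum(token_bit(t) == target_bit(p) for p, t in enumerate(tokens))
    rate = matches / len(tokens)
    return rate, (rate - 0.5) / math.sqrt(0.25 / len(tokens))
```

Because the toy model is order-sensitive, a position that mismatches now often matches after more context is revealed, so the steered decoder commits matching tokens on nearly every step and `detect` reports a match rate far above 0.5; a sliding-window version of `detect` (scoring each window and taking the maximum z-score) would give the edit-robust detector the abstract mentions.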