Recent provably secure linguistic steganography (PSLS) methods rely on mainstream autoregressive language models (ARMs) to address a historically challenging task: disguising covert communication as ``innocuous'' natural language communication. However, because ARMs generate tokens sequentially, any change to the stegotext produced by ARM-based PSLS methods causes severe error propagation, rendering existing methods unusable under active tampering attacks. To address this, we propose a robust, provably secure linguistic steganography method based on diffusion language models (DLMs). Unlike ARMs, DLMs generate text in a partially parallel manner, allowing us to identify robust positions for steganographic embedding that can be combined with error-correcting codes. Furthermore, we introduce error-correction strategies during steganographic extraction, including pseudo-random error correction and neighborhood search correction. Theoretical proofs and experimental results demonstrate that our method is both secure and robust: it can resist token ambiguity in stegotext segmentation and, to some extent, withstand token-level insertion, deletion, and substitution attacks.