Linguistic steganography embeds secret messages in seemingly innocuous text to enable covert communication. Provable security, a long-standing goal and key motivation of the field, has been extended to language-model-based steganography. Previous provably secure approaches achieve perfect imperceptibility, measured by zero Kullback-Leibler (KL) divergence, but at the expense of embedding capacity. In this paper, we first attempt to use a classic entropy coding method (range coding) directly for secure steganography, and then propose an efficient and provably secure linguistic steganographic method with a rotation mechanism. Experiments across various language models show that our method achieves around 100% entropy utilization (embedding efficiency), outperforming existing baseline methods in embedding capacity. Moreover, it achieves high embedding speeds (up to 1554.66 bits/s on GPT-2). The code is available at github.com/ryehr/RRC_steganography.
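The two quantities the abstract relies on, KL divergence and entropy utilization, can be illustrated with a minimal sketch. It is not taken from the released code: the function names, the toy distributions, and the definition of utilization as embedded bits divided by the summed entropy of the sampling steps are assumptions made here for illustration.

```python
import math

def entropy_bits(p):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_divergence_bits(p, q):
    """KL(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

def entropy_utilization(embedded_bits, step_distributions):
    """Assumed definition: embedded bits / total entropy over all steps."""
    total = sum(entropy_bits(p) for p in step_distributions)
    return embedded_bits / total

# Hypothetical next-token distributions for three generation steps.
steps = [[0.5, 0.25, 0.25], [0.5, 0.5], [0.25, 0.25, 0.25, 0.25]]

# Zero KL divergence means the stego sampling distribution is
# identical to the model's own distribution at that step.
model_dist = steps[0]
stego_dist = list(model_dist)
kl = kl_divergence_bits(stego_dist, model_dist)

# A scheme that embedded as many bits as the steps' total entropy
# would reach a utilization of 1.0 (that is, 100%).
total_entropy = sum(entropy_bits(p) for p in steps)
utilization = entropy_utilization(total_entropy, steps)
```

Under this reading, a method with zero KL divergence and utilization near 1.0 would be both perfectly imperceptible and close to the capacity that the model's entropy allows.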