Linguistic steganography enables covert communication by embedding secret messages in innocuous text; however, current methods face critical limitations in payload capacity and security. Traditional modification-based methods introduce detectable anomalies, while retrieval-based strategies suffer from low embedding capacity. Modern generative steganography leverages language models to produce natural-looking stego text, but the limited entropy of token predictions further constrains capacity. To address these issues, we propose an entropy-driven framework, RTMStega, that integrates rank-based adaptive coding with context-aware, normalized-entropy-guided decompression. By mapping secret messages to token probability ranks and dynamically adjusting sampling through context-aware, entropy-based adjustments, RTMStega balances payload capacity and imperceptibility. Experiments across diverse datasets and models demonstrate that RTMStega triples the payload capacity of mainstream generative steganography, reduces processing time by over 50%, and maintains high text quality, offering a trustworthy solution for secure and efficient covert communication.
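The core idea of rank-based embedding with an entropy-adaptive bit budget can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`normalized_entropy`, `embed_step`), the `max_bits` cap, and the rounding rule for the per-step bit budget are all assumptions made for the sketch.

```python
import math

def normalized_entropy(probs):
    """Shannon entropy of a token distribution, normalized to [0, 1]."""
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    max_h = math.log2(len(probs))
    return h / max_h if max_h > 0 else 0.0

def embed_step(probs, bits, max_bits=4):
    """Embed up to max_bits secret bits at one generation step.

    The next k bits of the secret stream are read as an integer rank
    into the probability-sorted vocabulary; k grows with the step's
    normalized entropy, so high-uncertainty steps carry more payload.
    (Illustrative scheme only; the rounding rule is an assumption.)
    """
    k = max(1, round(normalized_entropy(probs) * max_bits))  # adaptive bit budget
    k = min(k, int(math.log2(len(probs))))                   # rank must fit the vocab
    chunk, rest = bits[:k], bits[k:]
    rank = int(chunk, 2) if chunk else 0
    order = sorted(range(len(probs)), key=lambda i: -probs[i])  # ranks by probability
    return order[rank], rest  # chosen token index, remaining secret bits
```

The receiver, running the same language model on the shared context, recovers the bits by locating the observed token's rank in its own sorted distribution, so no cover modification is needed. A flat distribution (normalized entropy near 1) lets a step carry several bits, while a peaked one falls back to the single most likely continuations, which is the capacity/imperceptibility trade-off the abstract describes.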