Steganography Without Modification: Hidden Communication via LLM Seeds

We demonstrate that widely deployed Large Language Model (LLM) inference stacks harbor a steganographic channel that requires no modification to model weights, sampling code, or output distributions. The channel exploits a structural property of deterministic decoding: pseudo-random number generators (PRNGs) used in inverse-transform sampling produce a seed-dependent sequence of token-level probability intervals that can be reconstructed from the generated text alone. A sender encodes a secret message in the PRNG seed before generation; a receiver reconstructs the intervals and recovers the seed, and thus the hidden payload, by exhaustive search over the seed space. We formalize two operational modes. In the known-prompt setting, sender and receiver share the prompt, enabling exact interval reconstruction and perfect seed recovery via forced alignment. In the unknown-prompt setting, only the generated text is available; approximate interval reconstruction combined with a maximum-hit-count scoring strategy still permits reliable recovery from sufficiently long outputs. Extensive experiments across six model families and five heterogeneous text domains show that, in the known-prompt setting, full 32-bit seed recovery from the complete 2^32 candidate space achieves up to 100% accuracy, depending on model and text domain, within 300 tokens and in under 35 seconds on a single GPU. In the unknown-prompt setting, recovery reaches near-perfect accuracy at 600-800 tokens in about 12 seconds. We further analyze the influence of prompting strategies, tokenization ambiguities, and sampling hyperparameters on channel reliability. Finally, we discuss implications of these results: the channel allows the steganographic transmission of 32 bits, and it shows that ignorance of the prompt is not a valid security assumption.
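The known-prompt recovery described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: `toy_next_token_probs` is a hypothetical stand-in for an LLM forward pass, Python's `random.Random` replaces the PRNG of a real inference stack, and the seed space is reduced to 12 bits so that the exhaustive search finishes quickly on a CPU.

```python
import bisect
import random
from itertools import accumulate

VOCAB_SIZE = 8  # toy vocabulary; token ids are 0..7


def toy_next_token_probs(context):
    """Hypothetical stand-in for an LLM forward pass: maps a context to a
    fixed next-token distribution that does not depend on the sampling seed."""
    state = 17
    for token in context:
        state = (state * 31 + token + 1) % 2**32
    rng = random.Random(state)
    weights = [rng.random() + 0.05 for _ in range(VOCAB_SIZE)]
    total = sum(weights)
    return [w / total for w in weights]


def cumulative(probs):
    """Cumulative distribution; the last entry is clamped to exactly 1.0."""
    cdf = list(accumulate(probs))
    cdf[-1] = 1.0
    return cdf


def generate(seed, prompt, n_tokens):
    """Sender: inverse-transform sampling driven by a seeded PRNG."""
    rng = random.Random(seed)
    context, output = list(prompt), []
    for _ in range(n_tokens):
        cdf = cumulative(toy_next_token_probs(context))
        # Pick the first token whose cumulative probability exceeds the draw.
        token = bisect.bisect_right(cdf, rng.random())
        context.append(token)
        output.append(token)
    return output


def reconstruct_intervals(prompt, text):
    """Receiver: replay the model over the observed text (forced alignment)
    and record the interval [low, high) each uniform draw must have hit."""
    context, intervals = list(prompt), []
    for token in text:
        cdf = cumulative(toy_next_token_probs(context))
        low = cdf[token - 1] if token > 0 else 0.0
        intervals.append((low, cdf[token]))
        context.append(token)
    return intervals


def recover_seeds(prompt, text, seed_bits):
    """Exhaustive search: keep every seed whose draws hit all intervals."""
    intervals = reconstruct_intervals(prompt, text)
    matches = []
    for candidate in range(2**seed_bits):
        rng = random.Random(candidate)
        # all() stops at the first miss, so wrong seeds are rejected early.
        if all(low <= rng.random() < high for low, high in intervals):
            matches.append(candidate)
    return matches


secret = 0xABC          # 12-bit payload used directly as the seed
prompt = [3, 1, 4]      # shared prompt, as token ids
text = generate(secret, prompt, 40)
candidates = recover_seeds(prompt, text, seed_bits=12)
```

In this sketch the true seed always survives the search, because the receiver recomputes exactly the intervals the sender sampled from. Replacing `all(...)` with a count of hits and returning the highest-scoring candidate would give a rough analogue of the maximum-hit-count scoring mentioned for the unknown-prompt setting, where the reconstructed intervals are only approximate.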
