Steganography embeds confidential data within seemingly innocuous communications. Provable security in steganography, a long-sought goal, has become feasible with deep generative models. However, existing methods face a critical trade-off between security and efficiency. This paper introduces SparSamp, an efficient, provably secure steganography method based on sparse sampling. SparSamp embeds a message by combining it with pseudo-random numbers to obtain message-derived random numbers, which then drive sampling. It improves extraction accuracy and embedding capacity by widening the sampling intervals and making the sampling process sparse. SparSamp preserves the original probability distribution of the generative model, thus ensuring security. It adds only $O(1)$ complexity per sampling step, enabling the fastest embedding speed without compromising generation speed. SparSamp is designed to be plug-and-play: message embedding is achieved by simply replacing the sampling component of an existing generative model with SparSamp. We implemented SparSamp in text, image, and audio generation models. It achieves embedding speeds of up to 755 bits/second with GPT-2, 5,046 bits/second with DDPM, and 9,223 bits/second with WaveRNN.
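The idea of sampling with a message-derived random number can be illustrated with a toy sketch. This is not the SparSamp algorithm itself: the function names, the shift rule `(m / 2**l + u) mod 1`, the fixed list of distributions, and the brute-force extraction are all assumptions made for illustration. In particular, the extractor below scans all $2^l$ candidate messages and so does not reflect the $O(1)$ per-step cost stated above. What the sketch does show is why the model's distribution is preserved: when the shared pseudo-random number `u` is uniform on $[0, 1)$, the shifted value is also uniform, so inverse-CDF sampling with it yields tokens with the original probabilities.

```python
import random

def cdf_sample(probs, r):
    """Inverse-CDF sampling: return the index whose cumulative interval contains r."""
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off

def message_random(m, l, u):
    """Hypothetical combination rule: shift the pseudo-random u by m / 2^l, modulo 1."""
    return (m / 2 ** l + u) % 1.0

def embed(m, l, dists, seed):
    """Sample one token per distribution using message-derived random numbers."""
    prng = random.Random(seed)  # seed stands in for the shared secret key
    tokens = []
    for probs in dists:
        u = prng.random()
        tokens.append(cdf_sample(probs, message_random(m, l, u)))
    return tokens

def extract(tokens, l, dists, seed):
    """Brute-force receiver: keep every l-bit message consistent with all tokens."""
    prng = random.Random(seed)
    candidates = set(range(2 ** l))
    for probs, tok in zip(dists, tokens):
        u = prng.random()
        candidates = {
            m for m in candidates
            if cdf_sample(probs, message_random(m, l, u)) == tok
        }
    return candidates

if __name__ == "__main__":
    # Assumption: a fixed toy distribution at every step; a real generative
    # model would condition each distribution on the tokens produced so far.
    dists = [[0.5, 0.3, 0.2]] * 12
    stego = embed(m=5, l=3, dists=dists, seed=42)
    print(stego, extract(stego, l=3, dists=dists, seed=42))
```

Because sender and receiver run the same deterministic computation, the true message always survives in the candidate set; each additional token can only shrink that set, although a single step may leave several candidates.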