Steganography is the practice of encoding secret information into innocuous content in such a way that an adversarial third party would not realize that there is hidden meaning. While this problem has classically been studied in the security literature, recent advances in generative models have led to a shared interest among security and machine learning researchers in developing scalable steganography techniques. In this work, we show that a steganography procedure is perfectly secure under the information-theoretic model of steganography of Cachin (1998) if and only if it is induced by a coupling. Furthermore, we show that, among perfectly secure procedures, a procedure maximizes information throughput if and only if it is induced by a minimum entropy coupling. These insights yield what are, to the best of our knowledge, the first steganography algorithms to achieve perfect security guarantees for arbitrary covertext distributions. To provide empirical validation, we compare a minimum entropy coupling-based approach to three modern baselines (arithmetic coding, Meteor, and adaptive dynamic grouping) using GPT-2, WaveRNN, and Image Transformer as communication channels. We find that the minimum entropy coupling-based approach achieves superior encoding efficiency, despite its stronger security constraints. In aggregate, these results suggest that it may be natural to view information-theoretic steganography through the lens of minimum entropy coupling.
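To make the notion of a procedure "induced by a coupling" concrete, the following is a minimal Python sketch, not the implementation evaluated in this work. It assumes a toy setting with a uniform two-message distribution and a three-symbol covertext distribution, and it uses a simple greedy heuristic to build a low-entropy coupling rather than computing an exact minimum entropy coupling. The names `greedy_coupling`, `encode`, and `decode` are hypothetical and chosen for illustration only.

```python
import random


def greedy_coupling(p, q, tol=1e-12):
    """Greedy heuristic for a low-entropy coupling of marginals p and q.

    Repeatedly pairs the largest remaining mass in p with the largest
    remaining mass in q. The result is a valid coupling (its row sums
    equal p and its column sums equal q), but it is only an
    approximation to a minimum entropy coupling.
    """
    p, q = list(p), list(q)
    joint = [[0.0] * len(q) for _ in p]
    while True:
        i = max(range(len(p)), key=p.__getitem__)
        j = max(range(len(q)), key=q.__getitem__)
        mass = min(p[i], q[j])
        if mass <= tol:
            break
        joint[i][j] += mass
        p[i] -= mass
        q[j] -= mass
    return joint


def encode(joint, message, rng):
    """Sample a covertext symbol from the coupling's row for `message`.

    random.choices normalizes the weights, so this samples from the
    conditional distribution of the covertext given the message.
    """
    row = joint[message]
    return rng.choices(range(len(row)), weights=row)[0]


def decode(joint, symbol):
    """Return the most probable message given the observed symbol."""
    column = [row[symbol] for row in joint]
    return max(range(len(column)), key=column.__getitem__)


# Toy example (assumed values): uniform messages, skewed covertext.
message_dist = [0.5, 0.5]
covertext_dist = [0.5, 0.25, 0.25]
joint = greedy_coupling(message_dist, covertext_dist)

rng = random.Random(0)
sent = 1
stegotext = encode(joint, sent, rng)
received = decode(joint, stegotext)
```

Because the column sums of `joint` equal `covertext_dist`, a symbol produced by `encode` for a message drawn from `message_dist` is distributed exactly as ordinary covertext, which is the sense in which a coupling gives perfect security. In this toy case each covertext symbol is paired with a single message, so `decode` recovers the message exactly; in general the receiver only obtains a posterior over messages, and lower joint entropy means more information conveyed per symbol.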