We construct pseudorandom error-correcting codes (or simply pseudorandom codes), which are error-correcting codes with the property that any polynomial number of codewords are pseudorandom to any computationally bounded adversary. Efficient decoding of corrupted codewords is possible with the help of a decoding key. We build pseudorandom codes that are robust to substitution and deletion errors, where pseudorandomness rests on standard cryptographic assumptions. Specifically, pseudorandomness is based on either $2^{O(\sqrt{n})}$-hardness of LPN, or polynomial hardness of LPN and the planted XOR problem at low density. As our primary application of pseudorandom codes, we present an undetectable watermarking scheme for outputs of language models that is robust to cropping and a constant rate of random substitutions and deletions. The watermark is undetectable in the sense that any number of samples of watermarked text are computationally indistinguishable from text output by the original model. This is the first undetectable watermarking scheme that can tolerate a constant rate of errors. Our second application is to steganography, where a secret message is hidden in innocent-looking content. We present a constant-rate stateless steganography scheme with robustness to a constant rate of substitutions. Ours is the first stateless steganography scheme with provable steganographic security and any robustness to errors.
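For readers unfamiliar with the LPN (learning parity with noise) assumption named above, the following is a minimal toy sketch of what an LPN sample looks like. It is not the construction of the pseudorandom codes, which the abstract does not describe; it only illustrates the standard form of LPN samples $(a, \langle a, s \rangle + e)$ over GF(2) with Bernoulli noise $e$. The function names `lpn_samples` and `disagreement`, and all parameter values, are hypothetical choices for illustration and are far too small to be cryptographically meaningful.

```python
import random


def lpn_samples(secret, num_samples, noise_rate, rng):
    """Draw toy LPN samples (a, <a, secret> + e mod 2), e ~ Bernoulli(noise_rate).

    Illustrative helper only; the name and parameters are assumptions,
    not taken from the paper.
    """
    n = len(secret)
    rows, labels = [], []
    for _ in range(num_samples):
        a = [rng.randrange(2) for _ in range(n)]           # uniform vector in GF(2)^n
        inner = sum(x & y for x, y in zip(a, secret)) % 2  # <a, secret> mod 2
        noise = 1 if rng.random() < noise_rate else 0      # Bernoulli noise bit
        rows.append(a)
        labels.append(inner ^ noise)
    return rows, labels


def disagreement(rows, labels, candidate):
    """Fraction of samples whose label differs from <a, candidate> mod 2."""
    wrong = 0
    for a, b in zip(rows, labels):
        inner = sum(x & y for x, y in zip(a, candidate)) % 2
        wrong += inner ^ b
    return wrong / len(labels)


if __name__ == "__main__":
    rng = random.Random(0)
    secret = [rng.randrange(2) for _ in range(32)]
    rows, labels = lpn_samples(secret, 2000, 0.1, rng)

    guess = secret[:]
    guess[0] ^= 1  # a candidate that differs from the secret in one bit

    # Against the true secret, disagreement concentrates near the noise rate;
    # against any other candidate it concentrates near 1/2.
    print("true secret :", disagreement(rows, labels, secret))
    print("wrong guess :", disagreement(rows, labels, guess))
```

In this toy setting, the labels agree with the true secret except for the injected noise, while any other candidate disagrees on about half of the samples. The LPN assumption states, roughly, that without the secret such samples are hard to distinguish from uniformly random bits at suitable parameters.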