Generation-time text watermarking embeds statistical signals into text to make AI-generated content traceable. We explore *post-hoc watermarking*, where an LLM rewrites existing text while applying generation-time watermarking, to protect copyrighted documents or to detect their use in training or RAG via watermark radioactivity. Unlike generation-time approaches, which are constrained by how LLMs are served, this setting offers additional degrees of freedom for both generation and detection. We investigate how allocating compute (through larger rephrasing models, beam search, multi-candidate generation, or entropy filtering at detection) affects the quality-detectability trade-off. Our strategies achieve strong detectability and semantic fidelity on open-ended text such as books. Among our findings, the simple Gumbel-max scheme surprisingly outperforms more recent alternatives under nucleus sampling, and most methods benefit significantly from beam search. However, most approaches struggle when watermarking verifiable text such as code, where we counterintuitively find that smaller models outperform larger ones. This study reveals both the potential and limitations of post-hoc watermarking, laying groundwork for practical applications and future research.
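To make the Gumbel-max scheme mentioned above concrete, here is a minimal sketch of its sampling rule: a secret key and the recent context seed a pseudorandom uniform vector $r$ over the vocabulary, and the next token is chosen as $\arg\max_i r_i^{1/p_i}$. This is a simplified illustration, not the paper's implementation; the hashing choices, context window, and function names are assumptions.

```python
import hashlib

import numpy as np


def gumbel_max_sample(probs, context_ids, key, vocab_size):
    """Pick the next token via the Gumbel-max watermark rule:
    argmax_i r_i ** (1 / p_i), where r_i is a pseudorandom uniform
    value seeded by a secret key and the recent context. A detector
    holding the same key can re-derive r and test whether the chosen
    tokens have suspiciously large r values (hypothetical setup)."""
    # Seed a PRNG deterministically from the key and context,
    # so detection can reproduce the same r vector.
    seed_bytes = hashlib.sha256(
        key.encode() + np.asarray(context_ids, dtype=np.int64).tobytes()
    ).digest()
    rng = np.random.default_rng(int.from_bytes(seed_bytes[:8], "little"))
    r = rng.random(vocab_size)  # uniforms in [0, 1)

    # r_i ** (1 / p_i); tokens with p_i = 0 are excluded via score -1.
    scores = np.where(probs > 0, r ** (1.0 / np.maximum(probs, 1e-12)), -1.0)
    return int(np.argmax(scores))


# Example: the choice is deterministic given the same key and context,
# which is what lets a detector verify the watermark later.
probs = np.array([0.7, 0.2, 0.1])
token = gumbel_max_sample(probs, [5, 17], "secret-key", 3)
```

In a post-hoc setting, this rule would replace ordinary sampling inside the rephrasing model's decoding loop; the distribution of sampled tokens remains the model's own on average, while the per-token $r$ values carry the detectable signal.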