As the outputs of generative AI (GenAI) techniques improve in quality, they become increasingly difficult to distinguish from human-created content. Watermarking schemes are a promising approach to this problem: they embed hidden signals within AI-generated content to enable reliable detection. While watermarking is not a silver bullet for all risks associated with GenAI, it can play a crucial role in enhancing AI safety and trustworthiness by combating misinformation and deception. This paper presents a comprehensive overview of watermarking techniques for GenAI, beginning with the need for watermarking from historical and regulatory perspectives. We formalize the definitions and desired properties of watermarking schemes and examine the key objectives and threat models of existing approaches. We also explore practical evaluation strategies, providing insights into the development of robust watermarking techniques capable of resisting various attacks. Additionally, we review recent representative works, highlight open challenges, and discuss potential directions for this emerging field. By offering a thorough understanding of watermarking in GenAI, this work aims to guide researchers in advancing watermarking methods and applications and to support policymakers in addressing the broader implications of GenAI.