With the development of large language models, multiple AI systems for code generation (such as ChatGPT and StarCoder) are now available and widely adopted. It is often desirable to know whether a piece of code was generated by AI and, furthermore, which AI generated it. For instance, if a certain version of an AI is known to generate vulnerable code, it is particularly important to identify the generator. Existing approaches are not satisfactory because watermarking code is more challenging than watermarking text: code can be altered with relative ease via widely used code refactoring methods. In this work, we propose ACW (AI Code Watermarking), a novel method for watermarking AI-generated code. The key idea of ACW is to selectively apply a set of carefully designed, semantic-preserving, idempotent code transformations, whose presence (or absence) allows us to determine whether the watermark exists. It is efficient, as it requires no training or fine-tuning and works in a black-box manner. It is resilient, as the watermark cannot be easily removed or tampered with through common code refactoring methods. Our experimental results show that ACW is effective (i.e., achieving high accuracy, high true positive rates, and low false positive rates) and resilient, significantly outperforming existing approaches.
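To make the key idea concrete, the following is a minimal sketch of watermarking with semantic-preserving, idempotent transformations. It is an illustration under stated assumptions, not the ACW implementation: the two transformations (`FlipNegatedTernary`, `MergeNotIs`), the helpers `embed` and `detect`, and the fixed-point detection rule are all hypothetical choices made for this example, and the abstract does not specify which transformations ACW uses or how they are selected. The sketch relies only on Python's standard `ast` module (Python 3.9+ for `ast.unparse`).

```python
import ast


class FlipNegatedTernary(ast.NodeTransformer):
    """Hypothetical transformation: `a if not c else b` -> `b if c else a`."""

    def visit_IfExp(self, node):
        self.generic_visit(node)  # transform nested expressions first
        # Loop so that `not not c` is also fully unwrapped (keeps idempotence).
        while isinstance(node.test, ast.UnaryOp) and isinstance(node.test.op, ast.Not):
            node = ast.IfExp(test=node.test.operand, body=node.orelse, orelse=node.body)
        return node


class MergeNotIs(ast.NodeTransformer):
    """Hypothetical transformation: `not a is b` -> `a is not b` (and the reverse)."""

    def visit_UnaryOp(self, node):
        self.generic_visit(node)
        inner = node.operand
        if (isinstance(node.op, ast.Not) and isinstance(inner, ast.Compare)
                and len(inner.ops) == 1
                and isinstance(inner.ops[0], (ast.Is, ast.IsNot))):
            flipped = ast.IsNot() if isinstance(inner.ops[0], ast.Is) else ast.Is()
            return ast.Compare(left=inner.left, ops=[flipped],
                               comparators=inner.comparators)
        return node


ALL_TRANSFORMS = [FlipNegatedTernary, MergeNotIs]


def apply(transform_cls, source):
    """Apply one transformation and return the regenerated source text."""
    tree = transform_cls().visit(ast.parse(source))
    return ast.unparse(tree)


def embed(source, selected):
    """Embed a watermark by applying only the selected transformations."""
    for transform_cls in selected:
        source = apply(transform_cls, source)
    return source


def detect(source, candidates=ALL_TRANSFORMS):
    """Assumed detection rule: a transformation counts as 'present' if
    re-applying it leaves the code unchanged (the code is a fixed point)."""
    canonical = ast.unparse(ast.parse(source))
    return [apply(transform_cls, source) == canonical for transform_cls in candidates]


SOURCE = '''
def pick(x, y):
    label = "none" if not x else "some"
    same = not x is y
    return label, same
'''

marked = embed(SOURCE, ALL_TRANSFORMS)
print(marked)
print(detect(SOURCE), detect(marked))
```

Because each transformation is idempotent, running `embed` on already-watermarked code changes nothing, which is what makes the fixed-point check usable as a detector. One limitation of this simplified rule is that code with no applicable sites is trivially a fixed point, so a practical detector would also need to check that a transformation was applicable in the first place.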