The remarkable generation performance of large language models has raised ethical and legal concerns, motivating approaches that detect machine-generated text by embedding watermarks. However, we find that existing watermarking methods fail to function properly in code generation tasks because of the low entropy inherent in such tasks. Extending a logit-modifying watermark method, we propose Selective WatErmarking via Entropy Thresholding (SWEET), which improves detection ability and mitigates code quality degradation by excluding low-entropy segments both when generating and when detecting watermarks. Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines, including post-hoc detection methods, in detecting machine-generated code. Our code is available at https://github.com/hongcheki/sweet-watermark.
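The idea of entropy-thresholded watermarking can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a common logit-modifying scheme in which the previous token seeds a pseudo-random "green list" covering a fraction `gamma` of the vocabulary, and green-list logits are raised by `delta`. The function names, the seeding rule, and the parameters `tau`, `gamma`, `delta`, and `key` are all hypothetical choices made for illustration. The selective part is that positions whose entropy does not exceed `tau` are skipped both when embedding and when scoring.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def green_list(prev_token, vocab_size, gamma, key=0):
    """Hypothetical seeding rule: the previous token id picks the green list."""
    rng = random.Random(key * 1_000_003 + prev_token)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def watermark_logits(logits, prev_token, tau, gamma=0.5, delta=2.0):
    """Boost green-list logits only when the position's entropy exceeds tau."""
    if entropy(softmax(logits)) <= tau:
        return list(logits)  # low entropy: leave the distribution untouched
    green = green_list(prev_token, len(logits), gamma)
    return [x + delta if i in green else x for i, x in enumerate(logits)]


def detect_z_score(tokens, entropies, vocab_size, tau, gamma=0.5):
    """z-score of green-token hits, counted over high-entropy positions only.

    entropies[t] is assumed to be the model's entropy when predicting tokens[t].
    """
    scored = 0
    green_hits = 0
    for t in range(1, len(tokens)):
        if entropies[t] <= tau:
            continue  # skipped at detection, mirroring generation
        scored += 1
        if tokens[t] in green_list(tokens[t - 1], vocab_size, gamma):
            green_hits += 1
    if scored == 0:
        return 0.0
    return (green_hits - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))
```

In this sketch a sharply peaked distribution, such as a forced syntax token in code, passes through unchanged, while a near-uniform one receives the `delta` boost. A detector would compare the returned z-score against a chosen cutoff; that cutoff and the value of `tau` would need to be tuned empirically.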