Evaluating the degree to which language models (LMs) reproduce copyright-protected content is of significant interest to the AI and legal communities. Although courts consider both literal and non-literal similarities when assessing the degree of reproduction, prior research has focused only on literal similarities. To bridge this gap, we introduce CopyBench, a benchmark designed to measure both literal and non-literal copying in LM generations. Using copyrighted fiction books as text sources, we provide automatic evaluation protocols to assess literal and non-literal copying, balanced against model utility, measured as the ability to recall facts from the copyrighted works and to generate fluent completions. We find that, although literal copying is relatively rare, two types of non-literal copying -- event copying and character copying -- occur even in models as small as 7B parameters. Larger models demonstrate significantly more copying: from Llama3-8B to Llama3-70B, the literal copying rate increases from 0.2% to 10.5% and the non-literal copying rate from 2.3% to 6.9%. We further evaluate the effectiveness of current strategies for mitigating copying and show that (1) training-time alignment can reduce literal copying but may increase non-literal copying, and (2) current inference-time mitigation methods primarily reduce literal copying but not non-literal copying.