Key doctrines, including novelty (patent), originality (copyright), and distinctiveness (trademark), turn on a shared empirical question: whether a body of work is meaningfully distinct from a relevant reference class. Yet analyses typically operationalize this set-level inquiry using item-level evidence: pairwise comparisons among exemplars. That unit-of-analysis mismatch may be manageable for finite corpora of human-created works, where it can be bridged by ad hoc aggregations. But it becomes acute for machine-generated works, where the object of evaluation is not a fixed set of works but a generative process with an effectively unbounded output space. We propose a distributional alternative: a two-sample test based on maximum mean discrepancy, computed on semantic embeddings, to determine whether two creative processes (whether human or machine) produce statistically distinguishable output distributions. The test requires no task-specific training, obviating the need for discovery of proprietary training data to characterize the generative process, and is sample-efficient, often detecting differences with as few as 5-10 images or 7-20 texts. We validate the framework across three domains: handwritten digits (controlled images), patent abstracts (text), and AI-generated art (real-world images). We reveal a perceptual paradox: even when human evaluators distinguish AI outputs from human-created art with only about 58% accuracy, barely above chance, our method detects distributional distinctiveness. Our results present evidence contrary to the view that generative models act as mere regurgitators of training data. Rather than producing outputs statistically indistinguishable from a human baseline, as simple regurgitation would predict, they produce outputs that are semantically human-like yet stochastically distinct, suggesting their dominant function is as a semantic interpolator within a learned latent space.
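The abstract does not specify the test beyond "maximum mean discrepancy on semantic embeddings," so the following is only a minimal sketch of a standard MMD two-sample permutation test with an RBF kernel and the median-heuristic bandwidth; all function names and parameters are illustrative, and toy Gaussian vectors stand in for embeddings that would in practice come from a pretrained encoder.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # Gram matrix of exp(-gamma * ||x - y||^2) between rows of X and rows of Y.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * d2)

def mmd2(X, Y, gamma):
    # Biased (V-statistic) estimate of squared MMD; nonnegative by construction.
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean())

def mmd_permutation_test(X, Y, gamma=None, n_perm=1000, seed=0):
    """Return (observed MMD^2, permutation p-value) for samples X and Y."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    if gamma is None:
        # Median heuristic: bandwidth from pairwise distances in the pooled sample.
        d2 = np.sum(Z**2, 1)[:, None] + np.sum(Z**2, 1)[None, :] - 2.0 * Z @ Z.T
        gamma = 1.0 / np.median(d2[d2 > 0])
    observed = mmd2(X, Y, gamma)
    n, exceed = len(X), 0
    for _ in range(n_perm):
        # Re-split the pooled sample at random and recompute the statistic.
        perm = rng.permutation(len(Z))
        if mmd2(Z[perm[:n]], Z[perm[n:]], gamma) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Toy illustration: two small samples (n = 15 each) from shifted Gaussians,
# mimicking the small-sample regime the abstract describes.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(15, 8))
Y = rng.normal(1.0, 1.0, size=(15, 8))
stat, p = mmd_permutation_test(X, Y)
```

The permutation step is what makes the test distribution-free: under the null that both processes induce the same embedding distribution, the pooled sample is exchangeable, so the p-value is valid without any training or parametric assumptions.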