We propose a method to determine whether a given article was written entirely by a generative language model or instead contains significant edits by a different author, possibly a human. The method applies a perplexity test to the origin of each sentence or other text atom and combines these many tests using Higher Criticism (HC). As a by-product, it identifies the parts suspected of being edited. The method is motivated by the convergence of the log-perplexity to the cross-entropy rate and by a statistical model of edited text in which most sentences are generated by the language model, except perhaps for a few that may have originated via a different mechanism. We demonstrate the effectiveness of the method on real data and analyze the factors affecting its success. This analysis raises several interesting open challenges whose resolution may improve the method's effectiveness.
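To make the step that combines many per-sentence tests concrete, the following is a minimal sketch of a Higher Criticism computation over per-sentence P-values. It assumes the P-values have already been obtained from perplexity tests (small values meaning a sentence looks unlikely under the language model). The function names `higher_criticism` and `flag_suspected_edits`, the parameter `gamma`, and the particular normalization of the HC statistic are illustrative assumptions and may differ from the variant used in the paper.

```python
import math


def higher_criticism(pvalues, gamma=0.4):
    """Return (hc_statistic, threshold_pvalue) for a list of P-values.

    Assumed form (one common variant, not necessarily the paper's):
        HC = max over 1 <= i <= gamma * n of
             sqrt(n) * (i / n - p_(i)) / sqrt(p_(i) * (1 - p_(i)))
    where p_(1) <= ... <= p_(n) are the sorted P-values.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one P-value")
    eps = 1e-12  # clip to avoid division by zero at p = 0 or p = 1
    sorted_p = sorted(min(max(p, eps), 1.0 - eps) for p in pvalues)
    limit = max(1, int(gamma * n))  # only the smallest gamma * n P-values
    best_hc, best_p = -math.inf, sorted_p[0]
    for i in range(1, limit + 1):
        p = sorted_p[i - 1]
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        if z > best_hc:
            best_hc, best_p = z, p
    return best_hc, best_p


def flag_suspected_edits(pvalues, gamma=0.4):
    """Return (hc_statistic, indices of sentences at or below the HC threshold)."""
    hc, threshold = higher_criticism(pvalues, gamma)
    flagged = [idx for idx, p in enumerate(pvalues) if p <= threshold]
    return hc, flagged


# Hypothetical P-values for a 10-sentence article; the first two are very small.
example_pvalues = [0.001, 0.002, 0.5, 0.6, 0.7, 0.8, 0.9, 0.3, 0.4, 0.95]
example_hc, example_flagged = flag_suspected_edits(example_pvalues)
```

A large HC value suggests that the article departs from the all-model-generated null, and the sentences whose P-values fall at or below the maximizing P-value are the ones flagged as possibly edited. How large is "large" would have to be calibrated, for instance against HC values computed on text known to be fully model-generated.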