We propose a method to determine whether a given article was written entirely by a generative language model or contains edits by a different author, possibly a human. Our process involves multiple tests for the origin of individual sentences or other pieces of text, combined using a method that is sensitive to rare alternatives, i.e., settings in which non-null effects are few and scattered across the text in unknown locations. Interestingly, this method also identifies the pieces of text suspected to contain edits. We demonstrate the effectiveness of the method in detecting edits through extensive evaluations on real data and provide an information-theoretic analysis of the factors affecting its success. In particular, we discuss optimality properties under a theoretical framework for text editing in which sentences are generated mainly by the language model, except perhaps for a few sentences that may have originated via a different mechanism. Our analysis raises several interesting research questions at the intersection of information theory and data science.
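One classical statistic that is sensitive to exactly this "rare and weak" regime, where a few non-null effects hide among many nulls, is Higher Criticism. The sketch below is only an illustration of that general idea, not the paper's exact procedure: it assumes some upstream detector has already produced a p-value per sentence under the null hypothesis that the sentence came from the language model, and combines them to flag the presence of a few edited sentences.

```python
import numpy as np

def higher_criticism(pvals):
    """Higher Criticism statistic over per-sentence p-values.

    Measures the standardized excess of small p-values relative to
    the uniform distribution expected under the global null. Large
    values suggest a few non-null (e.g., edited) sentences.
    """
    p = np.sort(np.asarray(pvals, dtype=float))
    n = len(p)
    i = np.arange(1, n + 1)
    # Standardized deviation of the empirical quantile i/n from p_(i).
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p) + 1e-12)
    # Maximize over the lower half of the sorted p-values, as is common.
    k = int(np.argmax(hc[: n // 2]))
    return hc[k], k  # statistic and index of the maximizing order statistic

# Example: 50 null (uniform) p-values plus three very small ones,
# standing in for a few sentences suspected to contain edits.
rng = np.random.default_rng(0)
null_p = rng.uniform(size=50)
edited_p = np.array([1e-4, 5e-4, 1e-3])
stat, k = higher_criticism(np.concatenate([null_p, edited_p]))
```

The same sorted p-values also localize the suspects: sentences whose p-values fall at or below the maximizing order statistic are the natural candidates to flag as edited.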