Our work focuses on the challenge of detecting text generated by Large Language Models (LLMs) and distinguishing it from text written by humans. This ability is of the utmost importance in numerous applications. However, whether such discernment is possible at all has been the subject of debate within the community. A central question is therefore whether we can detect AI-generated text and, if so, when. In this work, we provide evidence that it should almost always be possible to detect AI-generated text unless the distributions of human and machine-generated texts are exactly the same over the entire support. This observation follows from standard results in information theory and relies on the fact that as machine text becomes more human-like, more samples are needed to detect it. We derive a precise sample complexity bound for AI-generated text detection, which tells us how many samples are needed for detection. This gives rise to the additional challenge of designing more sophisticated detectors that take in $n$ samples for detection (rather than just one), which we leave to future research on this topic. Our empirical evaluations on various real and synthetic datasets support our claim about the existence of better detectors, demonstrating that AI-generated text detection should be achievable in the majority of scenarios. Our theory and results align with OpenAI's empirical findings regarding sequence length, and we are the first to provide a solid theoretical justification for these outcomes.
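To make the dependence on the number of samples concrete, the following is a minimal toy sketch and not the bound derived in this work. It assumes, purely for illustration, that each human text reduces to a single Bernoulli($p$) feature and each machine text to a Bernoulli($q$) feature. For this toy model the total variation distance between $n$ i.i.d. human samples and $n$ i.i.d. machine samples can be computed exactly from the binomial distribution of the count. The function names `tv_n_samples` and `samples_needed` are hypothetical helpers introduced here.

```python
from math import exp, lgamma, log


def log_binom_pmf(k, n, p):
    """Log-probability of k successes in n Bernoulli(p) trials (requires 0 < p < 1)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log(1 - p)


def tv_n_samples(p, q, n):
    """Total variation distance between n i.i.d. Bernoulli(p) and n i.i.d. Bernoulli(q) draws.

    The success count is a sufficient statistic, so this equals the distance
    between Binomial(n, p) and Binomial(n, q).
    """
    total = 0.0
    for k in range(n + 1):
        total += abs(exp(log_binom_pmf(k, n, p)) - exp(log_binom_pmf(k, n, q)))
    return 0.5 * total


def samples_needed(p, q, target_tv, max_n=2000):
    """Smallest n whose n-sample total variation reaches target_tv, or None if not found."""
    for n in range(1, max_n + 1):
        if tv_n_samples(p, q, n) >= target_tv:
            return n
    return None


if __name__ == "__main__":
    # Toy assumption: the human feature has p = 0.5; the machine feature q moves toward it.
    for q in (0.70, 0.60, 0.55):
        print(f"q = {q:.2f}: n = {samples_needed(0.5, q, target_tv=0.5)}")
```

In this sketch the distance is zero only when $p = q$, and the number of samples required to reach a fixed separation increases as $q$ approaches $p$, which mirrors the qualitative claim above. With equal priors, the best achievable accuracy of any detector that sees $n$ samples is $(1 + \mathrm{TV})/2$, so a larger $n$-sample distance translates directly into a better possible detector.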