Large language models (LLMs) such as ChatGPT have exhibited remarkable performance in generating human-like text. However, machine-generated texts (MGTs) may carry critical risks, such as plagiarism, misleading information, or hallucination. It is therefore urgent and important to detect MGTs in many situations. Unfortunately, distinguishing MGTs from human-written texts is challenging because the distributional discrepancy between them is often very subtle, owing to the remarkable performance of LLMs. In this paper, we exploit \textit{maximum mean discrepancy} (MMD) to address this issue, since MMD can effectively identify distributional discrepancies. However, directly training a detector with MMD on diverse MGTs significantly increases the variance of MMD, since MGTs may contain \textit{multiple text populations} produced by various LLMs. This severely impairs MMD's ability to measure the difference between two samples. To tackle this, we propose a novel \textit{multi-population} aware optimization method for MMD, called MMD-MP, which \textit{avoids variance increases} and thus improves the stability of measuring the distributional discrepancy. Relying on MMD-MP, we develop two methods for paragraph-based and sentence-based detection, respectively. Extensive experiments on various LLMs, \eg, GPT2 and ChatGPT, show the superior detection performance of MMD-MP. The source code is available at \url{https://github.com/ZSHsh98/MMD-MP}.
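For readers unfamiliar with MMD, the following is a minimal sketch of the standard unbiased estimate of squared MMD between two samples, using a fixed Gaussian kernel. It is illustrative only and does not implement MMD-MP or any trained kernel. The function names `gaussian_kernel` and `mmd2_unbiased`, the `bandwidth` parameter, and the use of plain numeric feature vectors in place of text representations are all assumptions made for this example.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of squared MMD between samples xs and ys.

    xs and ys are lists of feature vectors (hypothetical stand-ins for
    text features). Diagonal terms k(x_i, x_i) are excluded from the
    within-sample averages, which makes the estimate unbiased.
    """
    n, m = len(xs), len(ys)
    if n < 2 or m < 2:
        raise ValueError("need at least two points per sample")
    k_xx = sum(
        gaussian_kernel(xs[i], xs[j], bandwidth)
        for i in range(n) for j in range(n) if i != j
    ) / (n * (n - 1))
    k_yy = sum(
        gaussian_kernel(ys[i], ys[j], bandwidth)
        for i in range(m) for j in range(m) if i != j
    ) / (m * (m - 1))
    k_xy = sum(
        gaussian_kernel(x, y, bandwidth) for x in xs for y in ys
    ) / (n * m)
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: two samples from the same Gaussian and one shifted sample.
    same_a = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
    same_b = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
    shifted = [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(100)]
    print("same distribution:   ", mmd2_unbiased(same_a, same_b))
    print("shifted distribution:", mmd2_unbiased(same_a, shifted))
```

On this toy data the estimate should be close to zero when both samples come from the same distribution and clearly positive when one is shifted. The abstract's point about variance concerns the harder case where one of the two samples mixes several populations, which this fixed-kernel sketch does not address.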