The remarkable text-generation capabilities of large language models such as ChatGPT have inspired awe and spurred researchers to build detectors that mitigate potential risks, including misinformation, phishing, and academic dishonesty. However, most previous studies, including HC3, have focused on detectors that distinguish purely ChatGPT-generated texts from human-authored ones. This approach fails to identify texts produced through human-machine collaboration, such as ChatGPT-polished texts. To address this gap, we introduce a novel dataset, HPPT (ChatGPT-polished academic abstracts), which facilitates the construction of more robust detectors. It differs from existing corpora in that it comprises pairs of human-written and ChatGPT-polished abstracts rather than purely ChatGPT-generated texts. Additionally, we propose the "Polish Ratio", a measure of ChatGPT's involvement in text generation based on edit distance. It provides a way to measure the degree of human originality in the resulting text. Experimental results show that our proposed model is more robust on the HPPT dataset and on two existing datasets (HC3 and CDB). Furthermore, the Polish Ratio offers a more comprehensive explanation by quantifying the degree of ChatGPT involvement: a value greater than 0.2 signifies ChatGPT involvement, and a value exceeding 0.6 implies that ChatGPT generated most of the text.
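
The abstract does not give the formula behind the Polish Ratio, so the following Python sketch is only an illustration of an edit-distance-based ratio, not the paper's method. It assumes a word-level Levenshtein distance normalized by the longer text's length; the function names `levenshtein`, `polish_ratio`, and `interpret` are hypothetical. Only the 0.2 and 0.6 thresholds come from the abstract.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert/delete/substitute, cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def polish_ratio(original: str, polished: str) -> float:
    """Assumed definition: word-level edit distance divided by the longer length."""
    src, dst = original.split(), polished.split()
    longest = max(len(src), len(dst))
    if longest == 0:
        return 0.0
    return levenshtein(src, dst) / longest


def interpret(ratio: float) -> str:
    """Map a ratio onto the two thresholds stated in the abstract."""
    if ratio > 0.6:
        return "mostly ChatGPT-generated"
    if ratio > 0.2:
        return "ChatGPT involved"
    return "below the involvement threshold"


if __name__ == "__main__":
    human = "we propose a new dataset for detection"
    polished = "we introduce a novel dataset for detection"
    r = polish_ratio(human, polished)
    print(round(r, 3), interpret(r))
```

Under this assumed normalization the ratio lies in [0, 1], where 0 means the two texts are identical word for word. Computing it this way requires both versions of a text, which a paired corpus such as HPPT provides.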