Recent large language models (LLMs), e.g., ChatGPT, can generate fluent, human-like responses when given specific instructions. While acknowledging the convenience this technological advancement brings, educators are also concerned that students might use LLMs to complete their writing assignments and pass them off as their own original work. Although these concerns have prompted many AI content detection studies, most prior work modeled AI content detection as a classification problem, assuming that a text is either entirely human-written or entirely AI-generated. In this study, we investigated AI content detection in a rarely explored yet realistic setting where the text to be detected is written collaboratively by humans and generative LLMs (i.e., hybrid text). We first formalized the detection task as identifying the transition points between human-written and AI-generated content in a given hybrid text (boundary detection). We then proposed a two-step approach in which we (1) separated AI-generated content from human-written content during the encoder training process; and (2) calculated the distance between every two adjacent prototypes and assumed that the boundaries lie between the adjacent prototypes that are farthest apart. Through extensive experiments, we observed the following main findings: (1) the proposed approach consistently outperformed the baseline methods across different experiment settings; (2) the encoder training process can significantly boost the performance of the proposed approach; (3) when detecting boundaries in single-boundary hybrid essays, the proposed approach could be enhanced by adopting a relatively large prototype size, leading to a 22% improvement in the In-Domain evaluation and an 18% improvement in the Out-of-Domain evaluation.
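The second step of the approach (comparing adjacent prototypes and placing the boundary where they are farthest apart) can be illustrated with a minimal sketch. The abstract does not define how prototypes are built or which distance is used, so the following are assumptions made purely for illustration: a prototype is taken to be the mean of `size` consecutive sentence embeddings, "adjacent" means the window ending just before a candidate boundary and the window starting at it, and distance is Euclidean. The function names `build_prototypes` and `detect_boundary` are hypothetical, and the toy vectors stand in for the output of the trained encoder from step (1). The sketch covers only the single-boundary case.

```python
import numpy as np


def build_prototypes(embeddings: np.ndarray, size: int) -> np.ndarray:
    """Mean-pool every window of `size` consecutive sentence embeddings (stride 1).

    Assumption: this is one plausible reading of "prototype"; the source
    does not specify the construction.
    """
    n = len(embeddings)
    return np.stack([embeddings[i:i + size].mean(axis=0) for i in range(n - size + 1)])


def detect_boundary(embeddings, size: int = 2) -> int:
    """Return index b such that the boundary is assumed to lie between
    sentence b-1 and sentence b (single-boundary case only)."""
    emb = np.asarray(embeddings, dtype=float)
    n = len(emb)
    if n < 2 * size:
        raise ValueError("need at least 2 * size sentences")
    protos = build_prototypes(emb, size)
    # For candidate boundary b, compare the prototype covering sentences
    # [b - size, b) with the prototype covering sentences [b, b + size).
    candidates = range(size, n - size + 1)
    dists = [np.linalg.norm(protos[b - size] - protos[b]) for b in candidates]
    # Assumed boundary: the pair of adjacent prototypes that are farthest apart.
    return candidates[int(np.argmax(dists))]


# Toy example: four "human" sentences near the origin, then three "AI"
# sentences near (5, 5). A well-trained encoder is assumed to produce
# this kind of separation.
toy_embeddings = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])
boundary = detect_boundary(toy_embeddings, size=2)
print(f"boundary assumed before sentence index {boundary}")
```

In this sketch, a larger `size` averages over more sentences, which smooths out noise in individual embeddings at the cost of fewer candidate positions. That is one possible intuition for the reported benefit of a relatively large prototype size on single-boundary essays, not a claim about the authors' analysis.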