Recent large language models (LLMs), e.g., ChatGPT, can generate fluent, human-like responses when given specific instructions. While acknowledging the convenience brought by this technological advancement, educators are concerned that students might use LLMs to complete their writing assignments and pass them off as original work. Although these concerns have prompted many AI content detection studies, most prior work modeled AI content detection as a classification problem, assuming that a text is either entirely human-written or entirely AI-generated. In this study, we investigated AI content detection in a rarely explored yet realistic setting where the text to be detected is written collaboratively by humans and generative LLMs (i.e., hybrid text). We first formalized the detection task as identifying the transition points between human-written content and AI-generated content in a given hybrid text (boundary detection). We then proposed a two-step approach in which we (1) separated AI-generated content from human-written content during the encoder training process; and (2) calculated the distance between every two adjacent prototypes and assumed that a boundary exists between the two adjacent prototypes that are farthest apart. Through extensive experiments, we observed the following main findings: (1) the proposed approach consistently outperformed the baseline methods across different experiment settings; (2) the encoder training process can significantly boost the performance of the proposed approach; (3) when detecting boundaries in single-boundary hybrid essays, the proposed approach could be enhanced by adopting a relatively large prototype size, leading to a 22% improvement in the In-Domain evaluation and an 18% improvement in the Out-of-Domain evaluation.
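Step (2) of the approach can be illustrated with a minimal sketch. The abstract does not define how prototypes are built, so the code below rests on loudly stated assumptions: a prototype is taken to be the mean embedding of up to `prototype_size` consecutive sentences, distance is Euclidean, and the sentence embeddings are assumed to come from an encoder already trained as in step (1). The function names `boundary_scores` and `detect_boundaries` are hypothetical and are not taken from the paper.

```python
import numpy as np


def boundary_scores(embeddings, prototype_size):
    """Score each gap between consecutive sentences.

    ASSUMPTION: a prototype is the mean embedding of up to
    `prototype_size` consecutive sentences on one side of the gap.
    The score of a gap is the Euclidean distance between the
    prototype to its left and the prototype to its right.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    n = embeddings.shape[0]
    if n < 2:
        raise ValueError("need at least two sentences to place a boundary")
    scores = np.empty(n - 1)
    for i in range(1, n):
        # Sentences i-prototype_size .. i-1 (clipped at the start of the text).
        left = embeddings[max(0, i - prototype_size):i].mean(axis=0)
        # Sentences i .. i+prototype_size-1 (clipped at the end of the text).
        right = embeddings[i:i + prototype_size].mean(axis=0)
        scores[i - 1] = np.linalg.norm(left - right)
    return scores


def detect_boundaries(embeddings, prototype_size=2, num_boundaries=1):
    """Return the indices of the sentences that start a new segment.

    The gaps whose adjacent prototypes are farthest apart are taken
    as the predicted boundaries.
    """
    scores = boundary_scores(embeddings, prototype_size)
    top_gaps = np.argsort(scores)[::-1][:num_boundaries]
    # Gap j lies between sentence j and sentence j + 1.
    return sorted(int(j) + 1 for j in top_gaps)


# Toy 2-D "embeddings": three sentences near the origin, then two near (1, 1).
toy = np.array([
    [0.0, 0.1],
    [0.1, 0.0],
    [0.0, 0.0],
    [1.0, 1.1],
    [1.1, 1.0],
])
print(detect_boundaries(toy, prototype_size=2))  # [3]
```

In this toy input the largest prototype distance falls between the third and fourth sentences, so the sketch reports sentence index 3 as the start of the second segment. Averaging over more sentences smooths out noise in individual embeddings, which is one plausible reading of why a relatively large prototype size helped in the single-boundary setting.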