Towards Automatic Boundary Detection for Human-AI Collaborative Hybrid Essay in Education

Recent large language models (LLMs), e.g., ChatGPT, can generate fluent, human-like responses when provided with specific instructions. While acknowledging the convenience brought by this technological advancement, educators are also concerned that students might use LLMs to complete their writing assignments and pass them off as their original work. Although many AI content detection studies have been conducted in response to such concerns, most of them modeled AI content detection as a classification problem, assuming that a text is either entirely human-written or entirely AI-generated. In this study, we investigated AI content detection in a rarely explored yet realistic setting where the text to be detected is written collaboratively by humans and generative LLMs (i.e., hybrid text). We first formalized the detection task as identifying the transition points between human-written content and AI-generated content in a given hybrid text (boundary detection). We then proposed a two-step approach in which we (1) separated AI-generated content from human-written content during the encoder training process; and (2) calculated the distance between every two adjacent prototypes and assumed that boundaries exist between the adjacent prototypes that are furthest apart. Through extensive experiments, we observed the following main findings: (1) the proposed approach consistently outperformed the baseline methods across different experiment settings; (2) the encoder training process can significantly boost the performance of the proposed approach; (3) when detecting boundaries for single-boundary hybrid essays, the proposed approach could be enhanced by adopting a relatively large prototype size, leading to a 22% improvement in the In-Domain evaluation and an 18% improvement in the Out-of-Domain evaluation.
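
The second step of the approach (comparing adjacent prototypes and placing boundaries where they are furthest apart) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: a prototype is taken to be the mean embedding of up to `prototype_size` consecutive sentences on one side of a candidate boundary, the distance is Euclidean, and the sentence embeddings are toy vectors standing in for the output of the trained encoder. The function name `detect_boundaries` and its parameters are hypothetical and are not taken from the study.

```python
import numpy as np


def detect_boundaries(sentence_embeddings, prototype_size=2, num_boundaries=1):
    """Return the indices of the sentences that start a new segment.

    Hypothetical helper: a candidate boundary sits between sentence i-1 and
    sentence i. The prototypes on each side are assumed to be mean embeddings
    of up to `prototype_size` neighbouring sentences.
    """
    emb = np.asarray(sentence_embeddings, dtype=float)
    n = emb.shape[0]
    if n < 2:
        return []  # fewer than two sentences: no boundary is possible

    distances = []
    for i in range(1, n):
        # Prototype of the sentences immediately before the candidate boundary.
        left = emb[max(0, i - prototype_size):i].mean(axis=0)
        # Prototype of the sentences immediately after the candidate boundary.
        right = emb[i:min(n, i + prototype_size)].mean(axis=0)
        # Euclidean distance is an assumption made for this sketch.
        distances.append(np.linalg.norm(left - right))

    # Keep the candidates whose adjacent prototypes are furthest apart.
    order = np.argsort(np.asarray(distances))[::-1][:num_boundaries]
    return sorted(int(k) + 1 for k in order)


# Toy example: three "human" sentences followed by three "AI" sentences.
toy_embeddings = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [1.0, 1.0], [0.9, 1.0], [1.0, 0.9],
]
print(detect_boundaries(toy_embeddings, prototype_size=2))
```

In this toy input the largest gap between adjacent prototypes falls between the third and fourth sentences, so the sketch reports index 3 as the start of the second segment. With real essays, the quality of the result would depend on how well the encoder separates the two kinds of content, which is the role of the first step.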
