Towards Automatic Boundary Detection for Human-AI Collaborative Hybrid Essay in Education

Recent large language models (LLMs), e.g., ChatGPT, can generate fluent, human-like responses when given specific instructions. While acknowledging the convenience brought by this technological advancement, educators are also concerned that students might use LLMs to complete their writing assignments and pass them off as their own original work. Although these concerns have prompted many AI content detection studies, most of these prior studies modeled AI content detection as a classification problem, assuming that a text is either entirely human-written or entirely AI-generated. In this study, we investigated AI content detection in a rarely explored yet realistic setting where the text to be detected is written collaboratively by humans and generative LLMs (i.e., hybrid text). We first formalized the detection task as identifying the transition points between human-written content and AI-generated content in a given hybrid text (boundary detection). We then proposed a two-step approach in which we (1) separated AI-generated content from human-written content during the encoder training process; and (2) calculated the distance between every two adjacent prototypes and assumed that the boundaries lie between the adjacent prototypes that are farthest apart. Through extensive experiments, we observed the following main findings: (1) the proposed approach consistently outperformed the baseline methods across different experimental settings; (2) the encoder training process can significantly boost the performance of the proposed approach; (3) when detecting boundaries in single-boundary hybrid essays, the proposed approach could be enhanced by adopting a relatively large prototype size, leading to a 22% improvement in the In-Domain evaluation and an 18% improvement in the Out-of-Domain evaluation.
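The second step of the approach (comparing adjacent prototypes and placing boundaries where they are farthest apart) can be illustrated with a minimal sketch. The abstract does not define how a prototype is built or which distance is used, so the sketch assumes that a prototype is the mean of `prototype_size` consecutive sentence embeddings and that distance is Euclidean. The names `mean_vector`, `euclidean`, and `detect_boundaries` are hypothetical, and the toy two-dimensional vectors stand in for embeddings that would come from the trained encoder of step (1), which is not shown here.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Average a non-empty group of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance (an assumption; the abstract names no metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def detect_boundaries(
    embeddings: Sequence[Vector],
    prototype_size: int = 2,
    num_boundaries: int = 1,
) -> List[int]:
    """Return boundary positions i, meaning a boundary between sentence
    i - 1 and sentence i (0-indexed).

    Assumption: the prototype on each side of a candidate position is the
    mean of up to `prototype_size` sentence embeddings on that side.
    """
    scores = []
    for i in range(1, len(embeddings)):
        left = embeddings[max(0, i - prototype_size):i]
        right = embeddings[i:i + prototype_size]
        distance = euclidean(mean_vector(left), mean_vector(right))
        scores.append((distance, i))
    # Keep the candidate positions whose adjacent prototypes are farthest apart.
    top = sorted(scores, key=lambda s: s[0], reverse=True)[:num_boundaries]
    return sorted(i for _, i in top)


# Toy data: three sentences clustered near (0, 0), then three near (1, 1).
toy_embeddings = [
    [0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
    [1.0, 1.0], [0.9, 1.1], [1.0, 0.9],
]
print(detect_boundaries(toy_embeddings, prototype_size=2))  # [3]
```

In this sketch, a larger `prototype_size` averages over more sentences on each side of a candidate position, which smooths out sentence-level noise. That is one plausible reading of why a relatively large prototype size helped in the single-boundary setting, though the abstract itself does not state the mechanism.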
