We present the results and main findings of SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection. The task featured three subtasks. Subtask A is a binary classification task: determine whether a text was written by a human or generated by a machine. It has two tracks: a monolingual track restricted to English texts and a multilingual track. Subtask B is to identify the exact source of a text, that is, whether it was written by a human or generated by a specific LLM. Subtask C is to locate the change point within a text at which authorship transitions from human to machine. The task attracted a large number of participants: subtask A monolingual (126), subtask A multilingual (59), subtask B (70), and subtask C (30). In this paper, we present the task, analyze the results, and discuss the system submissions and the methods they used. For all subtasks, the best systems used LLMs.