To combat the misuse of Large Language Models (LLMs), many recent studies have presented LLM-generated-text detectors with promising performance. When users instruct LLMs to generate texts, the instruction can include different constraints depending on the user's needs. However, most recent studies do not cover such diverse instruction patterns when creating datasets for LLM detection. In this paper, we reveal that even task-oriented constraints -- constraints that would naturally be included in an instruction and are unrelated to detection evasion -- cause existing powerful detectors to exhibit large variance in detection performance. We focus on student essay writing as a realistic domain and manually create task-oriented constraints based on several factors of essay quality. Our experiments show that the standard deviation (SD) of current detector performance on texts generated from instructions with such constraints is significantly larger (up to an SD of 14.4 in F1 score) than that obtained by generating texts multiple times or by paraphrasing the instruction. We also observe an overall trend in which such constraints make LLM detection more challenging than it is without them. Finally, our analysis indicates that the high instruction-following ability of LLMs amplifies the impact of such constraints on detection performance.