Machine-generated text is increasingly difficult to distinguish from human-authored text. Powerful open-source models are freely available, and user-friendly tools that democratize access to generative models are proliferating. ChatGPT, which was released shortly after the first edition of this survey, epitomizes these trends. The great potential of state-of-the-art natural language generation (NLG) systems is tempered by the multitude of avenues for abuse. Detection of machine-generated text is a key countermeasure for reducing abuse of NLG models, but it poses significant technical challenges and leaves numerous open problems. We provide a survey that includes both 1) an extensive analysis of threat models posed by contemporary NLG systems, and 2) the most complete review of machine-generated text detection methods to date. This survey places machine-generated text within its cybersecurity and social context, and provides strong guidance for future work on addressing the most critical threat models and on ensuring that detection systems themselves demonstrate trustworthiness through fairness, robustness, and accountability.