The emergence of large language models (LLMs) has led to LLM-generated texts that are highly sophisticated and almost indistinguishable from texts written by humans. However, this has also raised concerns about the potential misuse of such texts, such as spreading misinformation and disrupting the education system. Although many detection approaches have been proposed, a comprehensive understanding of their achievements and challenges is still lacking. This survey aims to provide an overview of existing LLM-generated text detection techniques and to enhance the control and regulation of language generation models. Furthermore, we emphasize crucial considerations for future research, including the development of comprehensive evaluation metrics and the threat posed by open-source LLMs, to drive progress in the area of LLM-generated text detection.