The advent of Large Language Models (LLMs) has had a transformative impact. However, the potential for LLMs such as ChatGPT to be exploited to generate misinformation poses a serious concern for online safety and public trust. A fundamental research question is: will LLM-generated misinformation cause more harm than human-written misinformation? We propose to tackle this question from the perspective of detection difficulty. We first build a taxonomy of LLM-generated misinformation. We then categorize and validate the potential real-world methods for generating misinformation with LLMs. Finally, through extensive empirical investigation, we find that LLM-generated misinformation can be harder for both humans and detectors to detect than human-written misinformation with the same semantics, which suggests that it can have more deceptive styles and potentially cause more harm. We also discuss the implications of our findings for combating misinformation in the age of LLMs, along with possible countermeasures.