The advent of Large Language Models (LLMs) has had a transformative impact. However, the potential for LLMs such as ChatGPT to be exploited to generate misinformation poses a serious concern for online safety and public trust. A fundamental research question is: will LLM-generated misinformation cause more harm than human-written misinformation? We propose to tackle this question from the perspective of detection difficulty. We first build a taxonomy of LLM-generated misinformation. We then categorize and validate the potential real-world methods for generating misinformation with LLMs. Next, through extensive empirical investigation, we discover that LLM-generated misinformation can be harder for both humans and detectors to detect than human-written misinformation with the same semantics, which suggests that it can have more deceptive styles and potentially cause more harm. We also discuss the implications of our discovery for combating misinformation in the age of LLMs, along with possible countermeasures.