Automated disinformation generation is often listed as an important risk associated with large language models (LLMs). The theoretical ability to flood the information space with disinformation content might have dramatic consequences for societies around the world. This paper presents a comprehensive study of the ability of the current generation of LLMs to generate false news articles in the English language. In our study, we evaluated 10 LLMs using 20 disinformation narratives. We evaluated several aspects of the LLMs, including how good they are at generating news articles, how strongly they tend to agree or disagree with the disinformation narratives, and how often they generate safety warnings. We also evaluated the ability of detection models to identify these articles as LLM-generated. We conclude that LLMs are able to generate convincing news articles that agree with dangerous disinformation narratives.