In this paper, we comprehensively investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation and its subsequent impact on information-intensive applications, particularly Open-Domain Question Answering (ODQA) systems. We establish a threat model and simulate potential misuse scenarios, both unintentional and intentional, to assess the extent to which LLMs can be utilized to produce misinformation. Our study reveals that LLMs can act as effective misinformation generators, leading to a significant degradation in the performance of ODQA systems. To mitigate the harm caused by LLM-generated misinformation, we explore three defense strategies: prompting, misinformation detection, and majority voting. While initial results show promising trends for these defensive strategies, much more work needs to be done to address the challenge of misinformation pollution. Our work highlights the need for further research and interdisciplinary collaboration to address LLM-generated misinformation and to promote responsible use of LLMs.