Large Language Models (LLMs) can generate highly persuasive text, raising concerns about their misuse for propaganda, manipulation, and other harmful purposes. This leads us to our central question: Is LLM-generated persuasion more difficult to automatically detect than human-written persuasion? To address this, we categorize controllable generation approaches for producing persuasive content with LLMs and introduce Persuaficial, a high-quality multilingual benchmark covering six languages: English, German, Polish, Italian, French and Russian. Using this benchmark, we conduct extensive empirical evaluations comparing human-authored and LLM-generated persuasive texts. We find that although overtly persuasive LLM-generated texts can be easier to detect than human-written ones, subtle LLM-generated persuasion consistently degrades automatic detection performance. Beyond detection performance, we provide the first comprehensive linguistic analysis contrasting human and LLM-generated persuasive texts, offering insights that may guide the development of more interpretable and robust detection tools.