Measuring the relative impact of conspiracy theories (CTs) is important for prioritizing responses and allocating resources effectively, especially during crises. However, assessing the actual impact of CTs on the public poses unique challenges: it requires not only CT-specific knowledge but also diverse information spanning social, psychological, and cultural dimensions. Recent advances in large language models (LLMs) suggest their potential utility in this context, not only because of the extensive knowledge acquired from large training corpora but also because they can be harnessed for complex reasoning. In this work, we develop datasets of popular CTs with human-annotated impacts. Borrowing insights from human impact assessment processes, we then design tailored strategies that leverage LLMs to perform human-like CT impact assessments. Through rigorous experiments, we discover that an impact assessment mode that uses multi-step reasoning to critically analyze more CT-related evidence produces accurate results, and that most LLMs exhibit strong biases, such as assigning higher impacts to CTs presented earlier in the prompt, while generating less accurate impact assessments for emotionally charged and verbose CTs.
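As a hypothetical illustration only (not the paper's actual protocol), a multi-step reasoning prompt for CT impact assessment, together with an order-shuffling control for the position bias described above, might be sketched as follows. The step wording, the 1-5 scale, and both function names are assumptions introduced here for illustration:

```python
import random


def build_assessment_prompt(ct_claim: str) -> list[str]:
    """Compose a multi-step reasoning prompt for CT impact assessment.

    Hypothetical sketch: the step wording and the 1-5 impact scale are
    assumptions, not the paper's actual assessment protocol.
    """
    return [
        f"Step 1: Summarize the conspiracy theory: {ct_claim}",
        "Step 2: List social, psychological, and cultural evidence "
        "relevant to its spread and influence.",
        "Step 3: Critically weigh the evidence for and against high "
        "public impact.",
        "Step 4: Output an impact score from 1 (negligible) to 5 (severe).",
    ]


def shuffled_orderings(cts: list[str], n: int = 3, seed: int = 0) -> list[list[str]]:
    """Generate several orderings of the same CT list so position bias
    (e.g., higher scores for CTs presented earlier in the prompt) can be
    detected by comparing an LLM's scores across orderings."""
    orderings = []
    for i in range(n):
        rng = random.Random(seed + i)
        order = cts[:]  # copy so the original list is untouched
        rng.shuffle(order)
        orderings.append(order)
    return orderings
```

Averaging each CT's score over several shuffled orderings is one simple way to control for the order sensitivity that the experiments reveal.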