The rapid spread of misinformation on social media platforms has become a formidable challenge. To mitigate its proliferation, Misinformation Detection (MD) has emerged as a critical research topic. Traditional MD approaches based on small models typically perform binary classification through a black-box process. Recently, the rise of Large Language Models (LLMs) has enabled explainable MD, where models generate rationales that explain their decisions, thereby enhancing transparency. Existing explainable MD methods primarily focus on crafting sophisticated prompts to elicit rationales from off-the-shelf LLMs. In this work, we propose a pipeline to fine-tune a dedicated LLM specifically for explainable MD. Our pipeline begins by collecting fact-checked articles at scale, and then uses multiple strong LLMs to produce veracity predictions and rationales. To ensure high-quality training data, we apply a filtering strategy that keeps only the instances whose predicted label is correct. While this pipeline is intuitive and prevalent, our experiments reveal that naive filtering based solely on label correctness is insufficient in practice and suffers from two critical limitations: (1) Coarse-grained labels cause insufficient rationales: rationales selected solely on the basis of binary labels may not adequately support their decisions; (2) Over-verification behavior causes unnecessary rationales: stronger LLMs tend to over-verify, producing excessively verbose rationales that contain unnecessary steps. To address these issues, we introduce LONSREX, a novel data synthesis pipeline to Locate Necessary and Sufficient Rationales for Explainable MD. Specifically, we propose a metric that quantifies the contribution of each verification step to the final prediction, thereby evaluating its necessity and sufficiency. Experimental results demonstrate the effectiveness of LONSREX.
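As a rough illustration of the two ideas contrasted above, the following sketch shows naive label-correctness filtering next to a per-step contribution score. Everything in it is an assumption made for illustration: the `Candidate` fields, the `Scorer` signature, and the leave-one-out scoring are hypothetical stand-ins, and the leave-one-out score is not the LONSREX metric, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Candidate:
    """One synthesized training instance (hypothetical schema)."""
    claim: str
    gold_label: str        # verdict from the fact-checked article
    predicted_label: str   # verdict produced by a teacher LLM
    steps: List[str]       # rationale split into verification steps


def filter_by_label(candidates: Sequence[Candidate]) -> List[Candidate]:
    """Naive filtering: keep a candidate only if its label is correct."""
    return [c for c in candidates if c.predicted_label == c.gold_label]


# Hypothetical scorer: support for `label` given the claim and some steps.
Scorer = Callable[[str, Sequence[str], str], float]


def step_contributions(c: Candidate, scorer: Scorer) -> List[float]:
    """Leave-one-out stand-in for a step-contribution metric.

    Each value is the drop in support for the gold label when that step
    is removed. This is an assumed proxy, not the paper's metric.
    """
    full = scorer(c.claim, c.steps, c.gold_label)
    return [
        full - scorer(c.claim, c.steps[:i] + c.steps[i + 1:], c.gold_label)
        for i in range(len(c.steps))
    ]


def toy_scorer(claim: str, steps: Sequence[str], label: str) -> float:
    """Toy scorer for the demo: each step mentioning evidence adds 0.25."""
    hits = sum(1 for s in steps if "evidence" in s)
    return min(1.0, 0.5 + 0.25 * hits)


if __name__ == "__main__":
    cands = [
        Candidate("claim A", "false", "false",
                  ["evidence: source X", "restating the claim",
                   "evidence: source Y"]),
        Candidate("claim B", "true", "false", ["evidence: source Z"]),
    ]
    kept = filter_by_label(cands)
    for c in kept:
        print(c.claim, step_contributions(c, toy_scorer))
```

In this toy setup the label filter keeps a rationale that still contains a step contributing nothing to the prediction, which is the kind of unnecessary content that label-only filtering cannot detect.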