One LLM to Train Them All: Multi-Task Learning Framework for Fact-Checking

Large language models (LLMs) are reshaping automated fact-checking (AFC) by enabling unified, end-to-end verification pipelines rather than isolated components. While large proprietary models achieve strong performance, their closed weights, complexity, and high costs limit sustainability. Fine-tuning smaller open-weight models for individual AFC tasks can help, but it requires multiple specialized models, which again drives up costs. We propose multi-task learning (MTL) as a more efficient alternative that fine-tunes a single model to perform claim detection, evidence ranking, and stance detection jointly. Using small decoder-only LLMs (e.g., Qwen3-4b), we explore three MTL strategies (classification heads, causal language modeling heads, and instruction-tuning) and evaluate them across model sizes and task orders, and against standard non-LLM baselines. While multi-task models do not universally surpass single-task baselines, they yield substantial improvements over zero-/few-shot settings, achieving relative gains of up to 44%, 54%, and 31% for claim detection, evidence re-ranking, and stance detection, respectively. Finally, we provide practical, empirically grounded guidelines to help practitioners apply MTL with LLMs for automated fact-checking.
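To make the instruction-tuning variant of MTL more concrete, the following is a minimal sketch of how examples from the three AFC tasks might be rendered as prompt/completion pairs and pooled into one training stream for a single model. The abstract does not give the authors' prompts, label sets, or mixing scheme, so the templates, the `to_instruction` and `mix_tasks` helpers, and the toy examples are all hypothetical illustrations rather than the paper's method.

```python
import random

# Hypothetical instruction templates, one per AFC task (not taken from the paper).
TEMPLATES = {
    "claim_detection": (
        "Is the following sentence a check-worthy claim? Answer yes or no.\n"
        "Sentence: {text}"
    ),
    "evidence_ranking": (
        "Is the evidence relevant to the claim? Answer relevant or irrelevant.\n"
        "Claim: {claim}\nEvidence: {text}"
    ),
    "stance_detection": (
        "Does the evidence support, refute, or not address the claim?\n"
        "Claim: {claim}\nEvidence: {text}"
    ),
}


def to_instruction(task, fields, label):
    """Render one labelled example as a prompt/completion pair."""
    return {
        "task": task,
        "prompt": TEMPLATES[task].format(**fields),
        "completion": label,
    }


def mix_tasks(per_task_examples, seed=0):
    """Pool examples from all tasks and shuffle them into one training stream."""
    pooled = [ex for examples in per_task_examples.values() for ex in examples]
    random.Random(seed).shuffle(pooled)
    return pooled


# Toy, made-up examples purely for illustration.
claim = "The city doubled its bus fleet in 2020."
data = {
    "claim_detection": [
        to_instruction("claim_detection", {"text": claim}, "yes"),
    ],
    "evidence_ranking": [
        to_instruction(
            "evidence_ranking",
            {"claim": claim, "text": "The 2020 transit report lists 400 buses."},
            "relevant",
        ),
    ],
    "stance_detection": [
        to_instruction(
            "stance_detection",
            {"claim": claim, "text": "The fleet grew from 390 to 400 buses."},
            "refute",
        ),
    ],
}

mixed = mix_tasks(data)
for example in mixed:
    print(example["task"], "->", example["completion"])
```

Shuffling all tasks together is only one possible schedule. Since the abstract says task order is one of the factors evaluated, a sequential schedule (one task after another) would be an equally plausible variant of `mix_tasks`.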


