What do people want to fact-check? A demand-side analysis of fact-checking

Research on misinformation has focused almost exclusively on supply, asking what falsehoods circulate, who produces them, and whether corrections work. A basic demand-side question remains unanswered. When ordinary people can fact-check anything they want, what do they actually ask about? We provide the first large-scale evidence on this question by analyzing close to 2,500 statements submitted by 457 participants to an open-ended AI fact-checking system. Each claim is classified along five semantic dimensions (domain, epistemic form, verifiability, target entity, and temporal reference), producing a behavioral map of public verification demand. Three findings stand out. First, users range widely across topics but default to a narrow epistemic repertoire, overwhelmingly submitting simple descriptive claims about present-day observables. Second, roughly one in four requests concerns statements that cannot be empirically resolved, including moral judgments, speculative predictions, and subjective evaluations, revealing a systematic mismatch between what users seek from fact-checking tools and what such tools can deliver. Third, comparison with the FEVER benchmark dataset exposes sharp structural divergences across all five dimensions, indicating that standard evaluation corpora encode a synthetic claim environment that does not resemble real-world verification needs. These results reframe fact-checking as a demand-driven problem and identify where current AI systems and benchmarks are misaligned with the uncertainty people actually experience.
