How is a factual claim made credible? We propose the novel task of Epistemic Appeal Identification, which determines whether and how factual statements are anchored by external sources or evidence. To advance research on this task, we present FactAppeal, a manually annotated dataset of 3,226 English-language news sentences. Unlike prior resources that focus solely on claim detection and verification, FactAppeal captures the epistemic structures and evidentiary basis used to support these claims. FactAppeal contains span-level annotations that identify factual statements and mentions of the sources on which they rely. Moreover, the annotations include fine-grained characteristics of factual appeals, such as the type of source (e.g., Active Participant, Witness, Expert, Direct Evidence), whether it is mentioned by name, mentions of the source's role and epistemic credentials, attribution to the source via direct or indirect quotation, and other features. We model the task with a range of encoder models and generative decoder models in the 2B-9B parameter range. Our best-performing model, based on Gemma 2 9B, achieves a macro-F1 score of 0.73.
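For readers unfamiliar with the reported metric, macro-F1 is the unweighted mean of per-class F1 scores, so rare classes count as much as frequent ones. The sketch below is illustrative only: the function name `macro_f1`, the toy labels, and the label-per-item setup are assumptions, and the abstract does not state the exact unit (span, token, or sentence) at which the score is computed.

```python
from collections import Counter

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical source-type labels, not drawn from the dataset.
gold = ["Expert", "Witness", "Expert", "Direct Evidence"]
pred = ["Expert", "Expert", "Expert", "Direct Evidence"]
score = macro_f1(gold, pred)
```

In this toy case the single missed Witness item drives that class's F1 to zero, which lowers the average by a full third regardless of how common the class is.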