Evidence selection is an important factor in generating fact-checking explanations: intuitively, high-quality explanations can only be generated given the right evidence. In this work, we investigate the impact of human-curated versus machine-selected evidence on explanation generation using large language models. To assess the quality of explanations, we focus on transparency (whether an explanation cites sources properly) and utility (whether an explanation is helpful in clarifying a claim). Surprisingly, we find that large language models generate explanations of similar or higher quality using machine-selected evidence, suggesting that carefully human-curated evidence may not be necessary. That said, even with the best model, the generated explanations are not always faithful to the sources, leaving room for improvement in explanation generation for fact-checking.