Unsupported and unfalsifiable claims we encounter in our daily lives can influence our view of the world. Characterizing, summarizing, and, more generally, making sense of such claims can be challenging, however. In this work, we focus on fine-grained debate topics and formulate a new task of distilling, from such claims, a countable set of narratives. We present a crowdsourced dataset of 12 controversial topics, comprising more than 120k arguments, claims, and comments from heterogeneous sources, each annotated with a narrative label. We further investigate how large language models (LLMs) can be used to synthesize claims using in-context learning. We find that generated claims with supporting evidence can be used to improve the performance of narrative classification models and, additionally, that the same model can infer the stance and aspect from a few training examples. Such a model can be useful in applications that rely on narratives, e.g., fact-checking.