Fallacies can be used to spread disinformation, fake news, and propaganda, which makes detecting them important. Automated detection and classification of fallacies remain challenging, however, mainly because the task is inherently subjective and existing research lacks a comprehensive, unified approach. To address these limitations, our study introduces a novel taxonomy of fallacies that aligns and refines previous classifications, a new annotation scheme tailored to subjective NLP tasks, and a new evaluation method that adapts the precision, recall, and F1-score metrics to handle subjectivity. Using our annotation scheme, we introduce MAFALDA (Multi-level Annotated FALlacy DAtaset), a gold-standard dataset. MAFALDA is built from examples drawn from several existing fallacy datasets, unified under our taxonomy across three levels of granularity. We then evaluate several language models on MAFALDA in a zero-shot setting to assess their fallacy detection and classification capabilities. The evaluation benchmarks the performance of these models and provides insights into their strengths and limitations in handling fallacious reasoning.