Recognizing fallacies is crucial for ensuring the quality and validity of arguments across various domains. However, computational fallacy recognition is challenging because datasets span diverse genres, domains, and fallacy types. This leads to a highly multi-class, and even multi-label, setup with substantial class imbalance. In this study, we aim to enhance existing models for fallacy recognition by incorporating additional context and by leveraging large language models to generate synthetic data, thus increasing the representation of infrequent classes. We experiment with GPT-3.5 to generate synthetic examples and examine how prompt settings affect the generated data. Moreover, we explore zero-shot and few-shot scenarios to evaluate the effectiveness of using the generated examples for training smaller models within a unified fallacy recognition framework. Furthermore, we analyze the overlap between the synthetic data and existing fallacy datasets. Finally, we investigate the usefulness of providing supplementary context for detecting fallacy types that need such context, e.g., diversion fallacies. Our evaluation results demonstrate consistent improvements across fallacy types, datasets, and generators. The code and the synthetic datasets are publicly available.