Automatic assessment of the quality of arguments has been recognized as a challenging task with significant implications for misinformation and targeted speech. While real-world arguments are tightly anchored in context, existing computational methods analyze their quality in isolation, which limits their accuracy and generalizability. We propose SPARK, a novel method for scoring argument quality based on contextualization via relevant knowledge. We devise four augmentations that leverage large language models to provide feedback, infer hidden assumptions, supply a similar-quality argument, or give a counter-argument. SPARK uses a dual-encoder Transformer architecture so that the original argument and its augmentation are considered jointly. Our experiments in both in-domain and zero-shot setups show that SPARK consistently outperforms existing techniques across multiple metrics.
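To make the dual-encoder idea concrete, the following is a minimal toy sketch and not the SPARK implementation. It assumes that each input (the argument and one augmentation) is encoded separately and that the two representations are concatenated before a scoring head. The hashed bag-of-words encoders stand in for Transformer encoders, and `make_encoder`, `score_quality`, `DIM`, the sigmoid output, and the untrained weights are all hypothetical names and choices introduced for illustration only.

```python
import math
import zlib
from typing import Callable, List

DIM = 16  # toy embedding size (assumption, not from the paper)


def make_encoder(seed: int) -> Callable[[str], List[float]]:
    """Build a toy text encoder: hashed bag-of-words, L2-normalised.

    Stands in for one Transformer encoder. Different seeds hash tokens
    into different buckets, so the two encoders do not share parameters.
    """
    def encode(text: str) -> List[float]:
        vec = [0.0] * DIM
        for token in text.lower().split():
            bucket = zlib.crc32(f"{seed}:{token}".encode("utf-8")) % DIM
            vec[bucket] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
        return [v / norm for v in vec]
    return encode


# One encoder per input stream (hypothetical names).
encode_argument = make_encoder(seed=0)
encode_augmentation = make_encoder(seed=1)


def score_quality(argument: str, augmentation: str,
                  weights: List[float], bias: float = 0.0) -> float:
    """Score an argument jointly with its augmentation.

    The two encodings are concatenated and passed through a linear head
    with a sigmoid, giving a value in (0, 1). The sigmoid is an assumption.
    """
    joint = encode_argument(argument) + encode_augmentation(augmentation)
    logit = sum(w * x for w, x in zip(weights, joint)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Usage with made-up inputs and untrained placeholder weights.
weights = [0.1] * (2 * DIM)
score = score_quality(
    "School uniforms reduce bullying.",
    "Hidden assumption: differences in clothing are a major cause of bullying.",
    weights,
)
```

In a real system the scoring head would be trained on labeled argument-quality data, and the augmentation text would come from a large language model rather than being written by hand.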