IAI Group at CheckThat! 2024: Transformer Models and Data Augmentation for Checkworthy Claim Detection

This paper describes the IAI group's participation in automated check-worthiness estimation for claims, as part of the 2024 CheckThat! Lab "Task 1: Check-Worthiness Estimation". The task involves automatically detecting check-worthy claims in English, Dutch, and Arabic political debates and Twitter data. We used several pre-trained generative decoder and encoder transformer models, applying methods such as few-shot chain-of-thought reasoning, fine-tuning, data augmentation, and transfer learning from one language to another. Performance varied, yet our models achieved notable placements on the organizers' leaderboard: ninth in English, third in Dutch, and first in Arabic, using multilingual datasets to improve the generalizability of check-worthiness detection. Although performance dropped significantly on the unlabeled test dataset compared with the development test dataset, our findings contribute to ongoing claim detection research and highlight the challenges and potential of language-specific adaptations in claim verification systems.
