Assessing the veracity of a claim made online is a complex and important task with real-world implications. When such claims target communities with limited access to information, and the content concerns issues such as healthcare and culture, the consequences intensify, especially in low-resource languages. In this work, we introduce AfrIFact, a dataset that covers the necessary steps for automatic fact-checking (i.e., information retrieval, evidence extraction, and fact verification) in ten African languages and English. Our evaluation shows that even the best embedding models lack cross-lingual retrieval capabilities, and that cultural and news documents are easier to retrieve than healthcare-domain documents, both in large corpora and within single documents. We further show that LLMs lack robust multilingual fact-verification capabilities in African languages: few-shot prompting improves performance by up to 43% for AfriqueQwen-14B, and task-specific fine-tuning further improves fact-checking accuracy by up to 26%. These findings, along with the release of the AfrIFact dataset, encourage further work on low-resource information retrieval, evidence extraction, and fact checking.