The rise of digital misinformation has heightened interest in using multilingual Large Language Models (LLMs) for fact-checking. This study systematically evaluates translation bias and the effectiveness of LLMs for cross-lingual claim verification across 15 languages from five language families: Romance, Slavic, Turkic, Indo-Aryan, and Kartvelian. We investigate two distinct translation methods, pre-translation and self-translation, using the XFACT dataset to assess their impact on accuracy and bias, and we take mBERT's performance on the English dataset as a baseline for comparing language-specific accuracies. Our findings reveal that low-resource languages exhibit significantly lower accuracy under direct inference, owing to their underrepresentation in the training data. Furthermore, larger models perform better with self-translation, improving translation accuracy and reducing bias. These results highlight the need for balanced multilingual training, especially for low-resource languages, to promote equitable access to reliable fact-checking tools and to minimize the risk of spreading misinformation across linguistic contexts.
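The contrast between the two translation methods can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names and the mock translator, verifier, and LLM below are hypothetical stand-ins for calls to a machine-translation system and an LLM.

```python
# Hypothetical sketch of the two cross-lingual verification pipelines.
# All components here are toy stand-ins, not the study's actual models.

def pre_translate_verify(claim, translate, verify):
    """Pre-translation: an external system first translates the claim
    into English, then the model verifies the English text."""
    english = translate(claim)
    return verify(english)

def self_translate_verify(claim, llm):
    """Self-translation: a single prompt asks the LLM itself to
    translate the claim and verify it in one pass."""
    prompt = f"Translate to English, then fact-check: {claim}"
    return llm(prompt)

# Toy stand-ins (assumptions for illustration only):
mock_translate = lambda c: c.upper()                      # pretend MT system
mock_verify = lambda c: "true" if "SUN" in c else "false" # pretend verifier
mock_llm = lambda p: "true"                               # pretend LLM verdict

print(pre_translate_verify("the sun is hot", mock_translate, mock_verify))
print(self_translate_verify("the sun is hot", mock_llm))
```

The design difference matters for bias: pre-translation fixes translation quality at whatever the external system provides, while self-translation lets a larger model leverage its own multilingual capacity, which the study finds reduces bias as model size grows.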