In an era of information overload, accelerated by Large Language Models (LLMs), the spread of misinformation poses a significant threat to public discourse and societal well-being. A pressing concern is the identification of machine-generated news. In this work, we take a significant step by introducing a benchmark dataset for neural news detection in four languages: English, Turkish, Hungarian, and Persian. The dataset incorporates outputs from multiple multilingual generators (in both zero-shot and fine-tuned setups), including BloomZ, LLaMa-2, Mistral, Mixtral, and GPT-4. We then experiment with a variety of classifiers, ranging from those based on linguistic features to advanced Transformer-based models and LLM prompting. We present detection results that examine the interpretability and robustness of machine-generated text detectors across all target languages.