In this paper, we analyze features for classifying human- and AI-generated text in English, French, German, and Spanish, and compare them across languages. We investigate two scenarios: (1) the detection of text generated by AI from scratch, and (2) the detection of text rephrased by AI. To train and test the classifiers in this multilingual setting, we created a new text corpus covering 10 topics for each language. For the detection of AI-generated text, the combination of all proposed features performs best, indicating that our features are portable to other related languages: the F1-scores are close, at 99% for Spanish, 98% for English, 97% for German, and 95% for French. For the detection of AI-rephrased text, systems using all features outperform systems using other feature sets in many cases, but using only document features performs best for German (72%) and Spanish (86%), and using only text vector features leads to the best results for English (78%).