Large language models (LLMs) have reached human-like proficiency in generating diverse text, underscoring the need for effective fake-text detection to mitigate risks such as fake news on social media. Prior work has mostly evaluated single models on in-distribution datasets, limiting our understanding of how these models perform across different types of data on the LLM-generated text detection task. We investigated this by testing five specialized transformer-based models on both in-distribution and out-of-distribution datasets to better assess their performance and generalizability. Our results show that single transformer-based classifiers achieve decent performance on the in-distribution dataset but generalize poorly to the out-of-distribution dataset. To improve generalization, we combined the individual classifiers using adaptive ensemble algorithms, which raised average accuracy significantly, from 91.8% to 99.2% on an in-distribution test set and from 62.9% to 72.5% on an out-of-distribution test set. These results indicate the effectiveness, strong generalization ability, and great potential of adaptive ensemble algorithms for LLM-generated text detection.
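To make the ensembling idea concrete, below is a minimal sketch of one simple adaptive ensemble scheme: soft voting in which each classifier's weight is derived from its accuracy on a held-out validation set. The function name `adaptive_ensemble` and the accuracy-proportional weighting rule are illustrative assumptions for exposition, not necessarily the exact algorithm evaluated in the abstract.

```python
import numpy as np

def adaptive_ensemble(probs, val_accuracies):
    """Accuracy-weighted soft voting over several classifiers.

    probs: array of shape (n_classifiers, n_samples, n_classes),
        each slice holding one classifier's predicted class probabilities.
    val_accuracies: per-classifier accuracy on a held-out validation set,
        used here as the adaptive weight (an illustrative choice).
    Returns the predicted class index for each sample.
    """
    w = np.asarray(val_accuracies, dtype=float)
    w = w / w.sum()                            # normalize weights to sum to 1
    combined = np.tensordot(w, probs, axes=1)  # weighted average over classifiers
    return combined.argmax(axis=-1)            # highest-probability class per sample

# Toy example: two classifiers, two samples, binary human-vs-LLM labels.
probs = np.array([
    [[0.9, 0.1], [0.2, 0.8]],  # classifier A
    [[0.6, 0.4], [0.4, 0.6]],  # classifier B
])
preds = adaptive_ensemble(probs, val_accuracies=[0.9, 0.9])
# With equal weights, the averaged probabilities favor class 0 for
# sample 0 and class 1 for sample 1.
```

In practice, each base classifier here would be one of the fine-tuned transformer detectors, and the weights would be refit whenever the validation data changes, which is what makes the scheme "adaptive."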