This paper presents the MasonTigers entry to SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. The task comprises Binary Human-Written vs. Machine-Generated Text Classification (Track A), Multi-Way Machine-Generated Text Classification (Track B), and Human-Machine Mixed Text Detection (Track C). Our best-performing approaches mainly use ensembles of discriminator transformer models, supplemented in specific cases by sentence transformers and statistical machine learning approaches. In addition, we use zero-shot prompting and fine-tuning of FLAN-T5 for Tracks A and B.