This paper describes a system designed to distinguish between AI-generated and human-written scientific excerpts in the DAGPap24 competition, hosted within the Fourth Workshop on Scientific Document Processing. The task in this competition is to detect artificially generated text fragments at the token level in scientific documents. Our work focuses on a multi-task learning architecture with two heads. This approach is motivated by a specific property of the task: spans of a single class run continuously for several hundred characters. We considered different encoder variants for obtaining a state vector for each token in the sequence, as well as different ways of splitting fragments into tokens before feeding them to a transformer-based encoder. This approach improves the average macro F1-score on the development set by 0.09 over the baseline solution (from 0.86 to 0.95) and achieves a score of 0.96 on the closed test part of the competition dataset.
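To make the reported metric concrete, the following is a minimal sketch of a token-level macro F1-score on a toy example. The label names `human` and `machine`, the helper `macro_f1`, and the toy sequences are illustrative assumptions, not the competition's label set or official scorer; how the competition averages the score across documents is not specified here and is not modeled.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over token labels (hypothetical helper)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p where it was not the gold class
            fn[g] += 1  # missed a token of the gold class g
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy document of 8 tokens: one human token is mislabeled as machine.
gold = ["human"] * 4 + ["machine"] * 4
pred = ["human"] * 3 + ["machine"] * 5

score = macro_f1(gold, pred)  # (6/7 + 8/9) / 2 = 55/63

# The development-set numbers from the abstract: baseline vs. this system.
baseline, ours = 0.86, 0.95
absolute_gain = ours - baseline          # about 0.09
relative_gain = absolute_gain / baseline  # about 0.105
```

Because macro averaging weights every class equally, a mistake on a rare class costs as much as one on a frequent class. The last lines also show that the move from 0.86 to 0.95 is a gain of about 0.09 in absolute terms, or roughly 10.5% relative to the baseline.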