We present the submissions of the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task. Our team submitted predictions for the English-German language pair on both sub-tasks: (i) sentence- and word-level quality prediction; and (ii) fine-grained error span detection. This year, we further explore pseudo data methods for QE based on the NJUQE framework (https://github.com/NJUNLP/njuqe). We generate pseudo MQM data using parallel data from the WMT translation task. We pre-train the XLM-R large model on the pseudo QE data, then fine-tune it on real QE data. At both stages, we jointly learn sentence-level scores and word-level tags. Empirically, we conduct experiments to identify the key hyper-parameters that improve performance. Technically, we propose a simple method that converts the word-level outputs to fine-grained error span results. Overall, our models achieved the best results in English-German on both the word-level and the fine-grained error span detection sub-tasks by a considerable margin.
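The abstract does not spell out how word-level outputs are converted to error spans, so the following is only a minimal sketch of one plausible reading: runs of consecutive tokens tagged as erroneous are merged into a single character-offset span. The function name `tags_to_spans`, the `OK`/`BAD` tag names, the single-space tokenization, and the end-exclusive offsets are all assumptions made for illustration; they are not taken from the NJUQE code base.

```python
from typing import List, Tuple


def tags_to_spans(
    tokens: List[str], tags: List[str], bad_tag: str = "BAD"
) -> List[Tuple[int, int]]:
    """Merge runs of consecutive BAD tokens into character spans.

    Assumptions (illustrative, not from the paper): tokens are joined by
    single spaces, and each span is a (start, end) pair of character
    offsets with `end` exclusive.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[int, int]] = []
    offset = 0          # character offset of the current token
    run_start = None    # start offset of the currently open BAD run
    run_end = None      # end offset of the last BAD token in the run

    for token, tag in zip(tokens, tags):
        start, end = offset, offset + len(token)
        if tag == bad_tag:
            if run_start is None:
                run_start = start   # open a new span
            run_end = end           # extend the open span
        elif run_start is not None:
            spans.append((run_start, run_end))  # close the span
            run_start = None
        offset = end + 1            # skip the assumed single space

    if run_start is not None:       # close a span that reaches the end
        spans.append((run_start, run_end))
    return spans


example_tokens = ["Das", "ist", "ein", "kleiner", "Test", "."]
example_tags = ["OK", "BAD", "BAD", "OK", "BAD", "OK"]
example_spans = tags_to_spans(example_tokens, example_tags)
```

In this sketch, adjacent `BAD` tokens collapse into one span, while an intervening `OK` token splits them. A real system would also need to assign a severity label to each span and map subword offsets back to the detokenized translation, neither of which is covered here.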