Lexical substitution discovers appropriate substitutes for a given target word in a context sentence. However, the task does not consider whether substitutes are of equal or higher proficiency than the target, an aspect that could benefit language learners looking to improve their writing. To bridge this gap, we propose a new task, language proficiency-oriented lexical substitution. We also introduce ProLex, a novel benchmark designed to assess systems' ability to generate substitutes that are not only appropriate but also demonstrate better language proficiency. Besides the benchmark, we propose models that can automatically perform the new task. We show that our best model, a Llama2-13B model fine-tuned with task-specific synthetic data, outperforms ChatGPT by an average of 3.2% in F-score and achieves results comparable to GPT-4 on ProLex.