Lexical substitution finds appropriate substitutes for a given target word in a context sentence. However, the task does not consider whether substitutes are of equal or higher proficiency than the target word, an aspect that could benefit language learners looking to improve their writing. To bridge this gap, we propose a new task, language proficiency-oriented lexical substitution. We also introduce ProLex, a novel benchmark designed to assess a system's ability to generate substitutes that are not only appropriate but also demonstrate better language proficiency. In addition to the benchmark, we propose models that perform the new task automatically. We show that our best model, a Llama2-13B model fine-tuned on task-specific synthetic data, outperforms ChatGPT by an average of 3.2% in F-score and achieves results comparable to GPT-4 on ProLex.