Lexical substitution (LS) aims to find appropriate substitutes for a target word in a sentence. Recently, LS methods based on pre-trained language models have made remarkable progress, generating potential substitutes for a target word by analyzing its surrounding context. However, these methods tend to overlook whether the sentence's meaning is preserved when generating substitutes. This study explores how to generate substitute candidates from a paraphraser, since the paraphrases it generates vary in word choice while preserving the sentence's meaning. Because commonly used decoding strategies cannot directly generate substitutes, we propose two simple decoding strategies that focus on variations of the target word during decoding. Experimental results show that our methods outperform state-of-the-art LS methods based on pre-trained language models on three benchmarks.