Although neural machine translation (NMT) models perform well in the general domain, it remains challenging to control their generation behavior to satisfy the requirements of different users. Given the expensive training cost and the data scarcity challenge of learning a new model from scratch for each user requirement, we propose a memory-augmented adapter that steers pretrained NMT models in a pluggable manner. Specifically, we construct a multi-granular memory from user-provided text samples and propose a new adapter architecture that combines the model's representations with the retrieved results. We also propose a training strategy with memory dropout to reduce spurious dependencies between the NMT model and the memory. We validate our approach in both style- and domain-specific experiments, and the results indicate that our method outperforms several representative pluggable baselines.
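To make the described mechanism concrete, the following is a minimal sketch of how a memory-augmented adapter with memory dropout could be wired up. It is not the authors' implementation: the class name `MemoryAugmentedAdapter`, the bottleneck fusion, the dot-product retrieval, and parameters such as `mem_dropout_p` are all illustrative assumptions standing in for the architecture the abstract summarizes.

```python
# Hypothetical sketch (PyTorch) of a memory-augmented adapter with memory
# dropout. All names and design details are assumptions for illustration,
# not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryAugmentedAdapter(nn.Module):
    """Pluggable adapter that fuses a frozen NMT layer's hidden states
    with values retrieved from a user-provided memory bank."""

    def __init__(self, d_model: int, d_bottleneck: int, mem_dropout_p: float = 0.1):
        super().__init__()
        self.query_proj = nn.Linear(d_model, d_model)      # hidden states -> retrieval queries
        self.down = nn.Linear(2 * d_model, d_bottleneck)   # bottleneck over [hidden; retrieved]
        self.up = nn.Linear(d_bottleneck, d_model)
        self.mem_dropout_p = mem_dropout_p                 # prob. of masking each memory entry in training

    def forward(self, hidden, mem_keys, mem_values):
        # hidden:     (batch, seq, d_model) from the frozen pretrained NMT layer
        # mem_keys:   (n_mem, d_model) encoded user text samples; entries could be
        #             built at multiple granularities (e.g. phrase- and sentence-level)
        # mem_values: (n_mem, d_model)
        q = self.query_proj(hidden)
        scores = torch.einsum("bsd,md->bsm", q, mem_keys)  # dot-product relevance
        scores = scores / mem_keys.size(-1) ** 0.5

        if self.training and self.mem_dropout_p > 0:
            # Memory dropout: randomly hide whole memory entries during training
            # so the model cannot develop spurious dependencies on retrievals
            # that are always available. finfo.min (rather than -inf) keeps the
            # softmax finite even if every entry happens to be dropped.
            drop = torch.rand(mem_keys.size(0), device=scores.device) < self.mem_dropout_p
            scores = scores.masked_fill(drop, torch.finfo(scores.dtype).min)

        attn = F.softmax(scores, dim=-1)                   # (batch, seq, n_mem)
        retrieved = torch.einsum("bsm,md->bsd", attn, mem_values)

        # Residual adapter fusion of the model representation and the retrieval:
        # only the adapter parameters are trained, keeping the approach pluggable.
        fused = torch.cat([hidden, retrieved], dim=-1)
        return hidden + self.up(F.relu(self.down(fused)))
```

Under these assumptions, steering the pretrained model to a new user amounts to swapping in a different `mem_keys`/`mem_values` bank built from that user's text samples, with no retraining of the base NMT parameters.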