Although neural machine translation (NMT) models perform well in the general domain, controlling their generation behavior to satisfy the requirements of different users remains challenging. Given the high training cost and the data scarcity involved in learning a new model from scratch for each user requirement, we propose a memory-augmented adapter that steers pretrained NMT models in a pluggable manner. Specifically, we construct a multi-granular memory from user-provided text samples and propose a new adapter architecture that combines the model representations with the retrieved results. We also propose a training strategy using memory dropout to reduce spurious dependencies between the NMT model and the memory. We validate our approach on both style- and domain-specific experiments, and the results indicate that our method outperforms several representative pluggable baselines.
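To make the idea of combining model representations with retrieved memory concrete, the following is a minimal sketch and not the architecture proposed here. It assumes a memory stored as (key, value) vector pairs, a softmax-weighted readout, a fixed scalar `gate`, and memory dropout implemented as randomly discarding entries during training. The names `adapter_step`, `memory_dropout`, `gate`, and `p_drop` are hypothetical, and the multi-granular construction of the memory is omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def memory_dropout(memory, p, rng):
    """Assumed form of memory dropout: drop each entry with probability p."""
    return [entry for entry in memory if rng.random() >= p]


def adapter_step(hidden, memory, gate=0.5, p_drop=0.0, rng=None):
    """Mix a hidden vector with a readout from memory.

    hidden: list of floats (one model representation).
    memory: list of (key, value) pairs, each a list of floats.
    gate:   fixed mixing weight (an assumption; a real adapter would learn it).
    p_drop: dropout probability over memory entries, used only in training.
    """
    if p_drop > 0.0:
        memory = memory_dropout(memory, p_drop, rng or random.Random())
    if not memory:
        # Nothing retrieved: fall back to the unmodified model representation.
        return list(hidden)
    weights = softmax([dot(hidden, key) for key, _ in memory])
    readout = [
        sum(w * value[i] for w, (_, value) in zip(weights, memory))
        for i in range(len(hidden))
    ]
    return [(1 - gate) * h + gate * r for h, r in zip(hidden, readout)]
```

Under these assumptions, calling `adapter_step(hidden, memory, p_drop=0.1)` during training exposes the adapter to incomplete memories, which is one plausible way to discourage over-reliance on retrieval; at inference `p_drop` would be left at 0.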