We introduce the replacing language model (RLM), a sequence-to-sequence language modeling framework for text style transfer (TST). Our method autoregressively replaces each token of the source sentence with a text span that has a similar meaning but is written in the target style. Each new span is generated by a non-autoregressive masked language model, which better preserves the local contextual meaning of the replaced token. This generation scheme combines the flexibility of autoregressive models with the accuracy of non-autoregressive models, bridging the gap between sentence-level and word-level style transfer methods. To control the style of the generated text more precisely, we perform token-level style-content disentanglement on the hidden representations of RLM. Empirical results on real-world text datasets demonstrate the effectiveness of RLM compared with other TST baselines. The code is available at https://github.com/Linear95/RLM.
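The replacement scheme described above can be illustrated with a minimal toy sketch. It is not the authors' implementation: the masked language model is swapped for a hypothetical dictionary lookup (`TOY_SPANS`), and the names `toy_span_generator` and `replace_sentence` are assumptions introduced only for this example. The sketch shows only the control flow: source tokens are visited left to right, and each one is replaced by a span that may contain several tokens.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for the masked language model: a fixed lookup
# from an informal token to a formal span. A real system would predict
# the span with a trained model instead.
TOY_SPANS = {
    "gonna": ["going", "to"],
    "wanna": ["want", "to"],
    "u": ["you"],
}

SpanGenerator = Callable[[Sequence[str], int, Sequence[str]], List[str]]


def toy_span_generator(
    source: Sequence[str], position: int, prefix: Sequence[str]
) -> List[str]:
    """Return a target-style span for source[position].

    The signature exposes the full source and the already generated
    prefix, since a learned model could condition on both. This toy
    version ignores them and falls back to copying the token.
    """
    token = source[position]
    return list(TOY_SPANS.get(token, [token]))


def replace_sentence(
    source: Sequence[str], generate_span: SpanGenerator
) -> List[str]:
    """Replace each source token, left to right, with a generated span."""
    output: List[str] = []
    for position in range(len(source)):
        # Each step sees what has been generated so far (autoregressive
        # over source positions); the span itself is produced in one shot.
        span = generate_span(source, position, output)
        output.extend(span)
    return output


result = replace_sentence("u gonna love it".split(), toy_span_generator)
print(" ".join(result))  # prints: you going to love it
```

Because a span may be longer than the token it replaces, the output length is not tied to the source length, which is the flexibility the scheme attributes to autoregressive generation, while each replacement stays anchored to a single source token.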