Efficient k-Nearest-Neighbor Machine Translation with Dynamic Retrieval

To achieve non-parametric NMT domain adaptation, $k$-Nearest-Neighbor Machine Translation ($k$NN-MT) constructs an external datastore of domain-specific translation knowledge, from which it derives a $k$NN distribution that is combined with the prediction distribution of the NMT model via a linear interpolation coefficient $\lambda$. Despite its success, performing $k$NN retrieval at every timestep incurs substantial time overhead. To address this issue, prevailing studies resort to $k$NN-MT with adaptive retrieval ($k$NN-MT-AR), which dynamically estimates $\lambda$ and skips $k$NN retrieval if $\lambda$ is less than a fixed threshold. Unfortunately, $k$NN-MT-AR does not yield satisfactory results. In this paper, we first conduct a preliminary study to reveal two key limitations of $k$NN-MT-AR: 1) the optimization gap leads to inaccurate estimation of $\lambda$ for determining whether to skip $k$NN retrieval, and 2) a fixed threshold fails to accommodate the dynamic demands for $k$NN retrieval at different timesteps. To mitigate these limitations, we then propose $k$NN-MT with dynamic retrieval ($k$NN-MT-DR), which significantly extends vanilla $k$NN-MT in two aspects. First, we equip $k$NN-MT with an MLP-based classifier that determines whether to skip $k$NN retrieval at each timestep. In particular, we explore several carefully designed scalar features to fully exploit the potential of the classifier. Second, we propose a timestep-aware threshold adjustment method that dynamically generates the threshold, which further improves the efficiency of our model. Experimental results on widely used datasets demonstrate the effectiveness and generality of our model.\footnote{Our code is available at \url{https://github.com/DeepLearnXMU/knn-mt-dr}.}
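
The decoding step described above can be illustrated with a minimal sketch. It is not the released implementation: the names `interpolate`, `decode_step`, `skip_prob`, `threshold_fn`, and `retrieve` are hypothetical, the classifier output is passed in as a plain number rather than computed by an MLP, and the threshold schedule in the usage lines is an arbitrary placeholder rather than the adjustment method proposed in the paper. The sketch only shows how the $k$NN and NMT distributions are mixed through $\lambda$ and how retrieval can be skipped when a skip score exceeds a timestep-dependent threshold.

```python
from typing import Callable, Dict, Tuple

Dist = Dict[str, float]  # token -> probability


def interpolate(p_nmt: Dist, p_knn: Dist, lam: float) -> Dist:
    """Linear interpolation: lam * p_kNN + (1 - lam) * p_NMT."""
    vocab = set(p_nmt) | set(p_knn)
    return {
        tok: lam * p_knn.get(tok, 0.0) + (1.0 - lam) * p_nmt.get(tok, 0.0)
        for tok in vocab
    }


def decode_step(
    t: int,
    p_nmt: Dist,
    skip_prob: float,                     # assumed output of a skip classifier
    threshold_fn: Callable[[int], float], # assumed timestep-aware threshold
    retrieve: Callable[[], Dist],         # assumed datastore lookup -> kNN distribution
    lam: float,
) -> Tuple[Dist, bool]:
    """Return the final distribution and whether retrieval was performed."""
    if skip_prob > threshold_fn(t):
        # Classifier is confident retrieval is unnecessary: use the NMT output only.
        return dict(p_nmt), False
    # Otherwise pay the retrieval cost and mix the two distributions.
    return interpolate(p_nmt, retrieve(), lam), True


# Usage with toy values (all numbers are illustrative placeholders).
toy_threshold = lambda t: max(0.5, 0.9 - 0.1 * t)
p_nmt = {"a": 0.6, "b": 0.4}
dist, retrieved = decode_step(
    t=0, p_nmt=p_nmt, skip_prob=0.2,
    threshold_fn=toy_threshold, retrieve=lambda: {"b": 1.0}, lam=0.5,
)
```

Passing `retrieve` as a callable keeps the expensive datastore lookup lazy, so it is only executed on timesteps where the skip decision says retrieval is needed.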
