In-context learning (ICL) greatly improves the performance of large language models (LLMs) on various downstream tasks, where the improvement depends heavily on the quality of demonstrations. In this work, we introduce syntactic knowledge to select better in-context examples for machine translation (MT). We propose a new strategy, Syntax-augmented COverage-based In-context example selection (SCOI), which leverages deep syntactic structure beyond conventional word matching. Specifically, we measure set-level syntactic coverage by computing the coverage of polynomial terms with the help of a simplified tree-to-polynomial algorithm, and lexical coverage using word overlap. Furthermore, we devise an alternate selection approach that combines both coverage measures, taking advantage of both syntactic and lexical information. We conduct experiments with two multilingual LLMs on six translation directions. Empirical results show that our proposed SCOI obtains the highest average COMET score among all learning-free methods, indicating that combining syntactic and lexical coverage successfully helps select better in-context examples for MT. Our code is available at https://github.com/JamyDon/SCOI.
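To make the coverage idea concrete, here is a minimal sketch (not the authors' implementation, and using only word-level lexical coverage rather than the syntactic polynomial terms) of greedy set-level coverage selection: each step picks the candidate example that covers the most still-uncovered words of the source sentence. The function name and whitespace tokenization are illustrative assumptions.

```python
# A minimal sketch of greedy set-level lexical-coverage selection
# (illustrative only; SCOI additionally uses syntactic coverage via
# polynomial terms from a tree-to-polynomial algorithm).
def select_examples(source, candidates, k):
    """Return indices of up to k candidate examples, chosen greedily
    to maximize coverage of the source sentence's word set."""
    target = set(source.lower().split())   # words we want the demo set to cover
    covered, chosen = set(), []
    for _ in range(k):
        best, best_gain = None, 0
        for i, cand in enumerate(candidates):
            if i in chosen:
                continue
            # marginal gain: newly covered source words this candidate adds
            gain = len((set(cand.lower().split()) & target) - covered)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:                   # no remaining candidate adds coverage
            break
        chosen.append(best)
        covered |= set(candidates[best].lower().split()) & target
    return chosen
```

An alternate selection scheme, as described above, would interleave picks like this one with picks driven by a syntactic coverage measure, so that the chosen demonstration set covers both lexical and structural aspects of the input.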