Quality-Aware Translation Tagging in Multilingual RAG Systems

Multilingual Retrieval-Augmented Generation (mRAG) systems often retrieve English documents and translate them into the query language in low-resource settings. However, poor translation quality degrades response generation. Existing approaches either assume that translation quality is sufficient or rewrite the translated text, which introduces factual distortion and hallucinations. To mitigate these problems, we propose Quality-Aware Translation Tagging in mRAG (QTT-RAG), which explicitly evaluates translation quality along three dimensions (semantic equivalence, grammatical accuracy, and naturalness and fluency) and attaches these scores as metadata without altering the original content. We evaluate QTT-RAG against two baselines, CrossRAG and DKM-RAG, on two open-domain QA benchmarks (XORQA, MKQA) using six instruction-tuned LLMs ranging from 2.4B to 14B parameters, covering two low-resource languages (Korean and Finnish) and one high-resource language (Chinese). QTT-RAG outperforms the baselines by preserving factual integrity while enabling generator models to make informed decisions based on translation reliability. This approach allows effective use of cross-lingual documents in low-resource settings where native-language documents are limited, offering a practical and robust solution across multilingual domains.
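
As a rough illustration of the tagging idea, the Python sketch below attaches three quality scores to a translated passage as a metadata header while leaving the passage text untouched. The abstract does not specify the score scale, the tag format, or how scores are produced, so the 1-5 scale, the bracketed header, and the names TranslationQuality, tag_passage, and build_context are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationQuality:
    """Scores for one translated passage (the 1-5 scale is an assumption)."""
    semantic_equivalence: int
    grammatical_accuracy: int
    naturalness_fluency: int


def tag_passage(translated_text: str, quality: TranslationQuality) -> str:
    """Prepend quality metadata; the translated text itself is not modified."""
    # Hypothetical tag format: one bracketed header line before the passage.
    header = (
        "[translation_quality "
        f"semantic_equivalence={quality.semantic_equivalence} "
        f"grammatical_accuracy={quality.grammatical_accuracy} "
        f"naturalness_fluency={quality.naturalness_fluency}]"
    )
    return f"{header}\n{translated_text}"


def build_context(passages: list[tuple[str, TranslationQuality]]) -> str:
    """Join tagged passages into one context string for the generator prompt."""
    return "\n\n".join(tag_passage(text, q) for text, q in passages)
```

A caller might score each translated passage with whatever evaluator is available, then pass build_context([(text, TranslationQuality(5, 4, 3)), ...]) to the generator, which can weigh each passage by its scores instead of receiving a rewritten version.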
