Knowledge-intensive language tasks (KILTs) benefit from retrieving high-quality relevant contexts from large external knowledge corpora. Learning task-specific retrievers that return relevant contexts at an appropriate level of semantic granularity, such as a document retriever, passage retriever, sentence retriever, or entity retriever, may help to achieve better performance on the end-to-end task. However, a task-specific retriever usually generalizes poorly to new domains and tasks, and deploying a variety of specialized retrievers may be costly in practice. We propose a unified generative retriever (UGR) that combines task-specific effectiveness with robust performance across the different retrieval tasks in KILTs. To achieve this goal, we make two major contributions: (i) to unify different retrieval tasks into a single generative form, we introduce an n-gram-based identifier for relevant contexts at different levels of granularity in KILTs; and (ii) to address different retrieval tasks with a single model, we employ a prompt learning strategy and investigate three methods to design prompt tokens for each task. In this way, the proposed UGR model can not only share common knowledge across tasks for better generalization, but also perform different retrieval tasks effectively by distinguishing task-specific characteristics. We train UGR on a heterogeneous set of retrieval corpora with well-designed prompts in a supervised, multi-task fashion. Experimental results on the KILT benchmark demonstrate the effectiveness of UGR on in-domain datasets, out-of-domain datasets, and unseen tasks.
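To make the two ideas above concrete, the following is a minimal sketch of how a task prompt and an n-gram identifier could be combined into sequence-to-sequence training pairs. It is an illustration under stated assumptions, not the UGR implementation: the prompt strings in `TASK_PROMPTS`, the helper names `word_ngrams` and `build_training_pairs`, and the choice to treat every word n-gram of a context as a candidate identifier are all hypothetical. The abstract does not specify how n-grams are selected or how prompt tokens are designed.

```python
from typing import Dict, List, Tuple

# Hypothetical prompt tokens, one per retrieval granularity (assumed names,
# not taken from the paper).
TASK_PROMPTS: Dict[str, str] = {
    "document": "<doc>",
    "passage": "<psg>",
    "sentence": "<sent>",
    "entity": "<ent>",
}


def word_ngrams(text: str, n: int) -> List[str]:
    """Return all contiguous word n-grams of a whitespace-tokenized text."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_training_pairs(
    task: str, query: str, context: str, n: int
) -> List[Tuple[str, str]]:
    """Pair a prompt-prefixed query with each n-gram of a relevant context.

    Assumption: every n-gram is used as a target identifier. A real system
    would likely keep only a selected subset of n-grams.
    """
    source = f"{TASK_PROMPTS[task]} {query}"
    return [(source, ngram) for ngram in word_ngrams(context, n)]


pairs = build_training_pairs(
    task="passage",
    query="who wrote Hamlet",
    context="Hamlet was written by Shakespeare",
    n=3,
)
```

Because the same function handles a document, passage, sentence, or entity context, a single generative model can be trained on all four tasks, with only the prompt prefix distinguishing them.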