Because a generative retrieval model depends solely on the information encoded in its parameters, with no external memory, its information capacity is limited and fixed. To overcome this limitation, we propose Nonparametric Decoding (Np Decoding), which can be applied to existing generative retrieval models. Np Decoding uses nonparametric contextualized vocab embeddings (an external memory) rather than vanilla vocab embeddings as the decoder vocab embeddings. By leveraging the contextualized vocab embeddings, the generative retrieval model can utilize both the parametric and nonparametric spaces. Evaluation on 9 document retrieval datasets (8 single-hop and 1 multi-hop) shows that applying Np Decoding to generative retrieval models significantly improves performance. We also show that Np Decoding is data- and parameter-efficient and achieves high performance in the zero-shot setting.
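To make the contrast between vanilla and contextualized vocab embeddings concrete, the following is a minimal toy sketch, not the implementation evaluated here. It assumes that a decoding step scores the decoder hidden state against embeddings by dot product, and that when a token has several contextualized embeddings in the external memory, its score is the maximum over them. The aggregation rule, the function names (`parametric_step`, `nonparametric_step`), and the toy vectors are all illustrative assumptions.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def parametric_step(hidden, vocab_embeddings):
    """Vanilla decoding step: one learned embedding per token id."""
    return softmax([dot(hidden, e) for e in vocab_embeddings])


def nonparametric_step(hidden, memory, vocab_size):
    """Sketch of a decoding step over an external memory.

    `memory` is a list of (token_id, contextualized_embedding) pairs, so a
    token may appear several times, once per context. ASSUMPTION: a token's
    score is the maximum dot product over its memory entries.
    """
    best = [float("-inf")] * vocab_size
    for token_id, embedding in memory:
        best[token_id] = max(best[token_id], dot(hidden, embedding))
    return softmax(best)


# Hypothetical toy data: two tokens, two-dimensional embeddings.
vocab_embeddings = [[1.0, 0.0], [0.0, 1.0]]
memory = [
    (0, [1.0, 0.0]),
    (1, [0.0, 1.0]),
    (1, [0.9, 0.1]),  # token 1 as it appears in a different context
]
hidden = [1.0, 0.0]

p_param = parametric_step(hidden, vocab_embeddings)
p_nonparam = nonparametric_step(hidden, memory, vocab_size=2)
```

In this toy setup, token 1 receives more probability mass under the memory-based step, because one of its contextualized embeddings lies close to the hidden state. A single vanilla embedding per token cannot express that context dependence.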