Various techniques have been developed in recent years to improve dense retrieval (DR), such as unsupervised contrastive learning and pseudo-query generation. Existing DRs, however, often suffer from effectiveness tradeoffs between supervised and zero-shot retrieval, which some attribute to limited model capacity. We contradict this hypothesis and show that a generalizable DR can be trained to achieve high accuracy in both supervised and zero-shot retrieval without increasing model size. In particular, we systematically examine the contrastive learning of DRs under the framework of data augmentation (DA). Our study shows that common DA practices, such as query augmentation with generative models and pseudo-relevance label creation using a cross-encoder, are often inefficient and sub-optimal. We hence propose a new DA approach with diverse queries and sources of supervision to progressively train a generalizable DR. As a result, DRAGON, our dense retriever trained with diverse augmentation, is the first BERT-base-sized DR to achieve state-of-the-art effectiveness in both supervised and zero-shot evaluations, and it even competes with models using more complex late interaction (ColBERTv2 and SPLADE++).