Recent advances in language representation modeling have broadly influenced the design of dense retrieval (DR) models. In particular, many high-performing dense retrieval models compute representations of the query and the document using BERT, and then apply cosine-similarity based scoring to determine relevance. BERT representations, however, are known to follow an anisotropic distribution with a narrow cone shape, and such a distribution can be undesirable for cosine-similarity based scoring. In this work, we first show that the representations of BERT-based DR models also follow an anisotropic distribution. To cope with this problem, we introduce two unsupervised post-processing methods, normalizing flow and whitening, and develop a token-wise method in addition to the sequence-wise method for applying them to the representations of dense retrieval models. We show that the proposed methods can effectively make the representations isotropic. We then perform experiments with ColBERT and RepBERT and show that document re-ranking performance (NDCG@10) improves by 5.17\%$\sim$8.09\% for ColBERT and 6.88\%$\sim$22.81\% for RepBERT. To examine the potential of isotropic representations for improving the robustness of DR models, we investigate out-of-distribution tasks in which the test dataset differs from the training dataset. The results show that isotropic representations generally improve performance. For instance, when the training dataset is MS-MARCO and the test dataset is Robust04, isotropy post-processing improves the baseline performance by up to 24.98\%. Furthermore, we show that an isotropic model trained on an out-of-distribution dataset can even outperform a baseline model trained on the in-distribution dataset.
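As a rough illustration of the whitening idea, the sketch below fits a whitening transform on a matrix of vectors and compares the average pairwise cosine similarity before and after. It is a minimal sketch under stated assumptions, not the implementation used in this work: the function names (`fit_whitening`, `apply_whitening`, `mean_cosine`) are hypothetical, the input is synthetic data rather than BERT output, and average cosine similarity is used only as a simple proxy for anisotropy. The rows of the input could stand for sequence-level vectors or for individual token vectors; how the statistics are pooled in each case is not specified here.

```python
import numpy as np


def fit_whitening(embeddings, eps=1e-8):
    """Estimate the mean and a whitening matrix from row vectors (assumed helper)."""
    mu = embeddings.mean(axis=0, keepdims=True)
    centered = embeddings - mu
    cov = centered.T @ centered / embeddings.shape[0]
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Scale each eigenvector (column) by 1 / sqrt(eigenvalue).
    w = eigvecs / np.sqrt(eigvals + eps)
    return mu, w


def apply_whitening(embeddings, mu, w):
    """Center the vectors and rotate/scale them to roughly identity covariance."""
    return (embeddings - mu) @ w


def mean_cosine(embeddings):
    """Average cosine similarity over all distinct pairs of rows."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = sims.shape[0]
    return (sims.sum() - np.trace(sims)) / (n * (n - 1))


# Synthetic stand-in for anisotropic embeddings: a shared offset pushes
# every vector into a narrow cone, and the per-dimension scales differ.
rng = np.random.default_rng(0)
raw = rng.normal(size=(500, 16)) * np.linspace(0.1, 2.0, 16) + 5.0

mu, w = fit_whitening(raw)
white = apply_whitening(raw, mu, w)

print("mean cosine before whitening:", mean_cosine(raw))
print("mean cosine after whitening: ", mean_cosine(white))
```

On this synthetic data the average cosine similarity is high before whitening and close to zero afterwards, which is the kind of change that makes cosine-similarity based scoring more discriminative.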