Contrastive learning has proven to be a highly effective method for representation learning and is widely used for dense retrieval. However, we find that relying solely on contrastive learning can lead to suboptimal retrieval performance. Meanwhile, although many retrieval datasets support learning objectives beyond contrastive learning, combining these objectives efficiently in a multi-task learning setting is challenging. In this paper, we introduce M3, a recursive Multi-hop dense sentence retrieval system built on a novel Multi-task Mixed-objective approach to dense text representation learning, which addresses both challenges. Our approach achieves state-of-the-art performance on FEVER, a large-scale open-domain fact verification benchmark. Code and data are available at: https://github.com/TonyBY/M3