Decomposition-based multi-hop retrieval methods rely on many autoregressive steps to break down complex queries, which breaks end-to-end differentiability and is computationally expensive. Decomposition-free methods address these issues, but current approaches struggle with longer multi-hop chains and with generalization to out-of-distribution data. To address these challenges, we introduce GRITHopper-7B, a novel multi-hop dense retrieval model that achieves state-of-the-art performance on both in-distribution and out-of-distribution benchmarks. GRITHopper combines generative and representational instruction tuning by integrating causal language modeling with dense retrieval training. Through controlled studies, we find that incorporating additional context after the retrieval process, referred to as post-retrieval language modeling, enhances dense retrieval performance. By including elements such as final answers during training, the model learns to better contextualize and retrieve relevant information. GRITHopper-7B offers a robust, scalable, and generalizable solution for multi-hop dense retrieval, and we release it to the community for future research and applications requiring multi-hop reasoning and retrieval capabilities.
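The combination of causal language modeling with dense retrieval training described above can be sketched as a joint objective: a next-token cross-entropy term plus a contrastive retrieval term over query/passage embeddings. This is a minimal illustrative sketch in numpy; the InfoNCE form of the contrastive term, the function names, the temperature, and the weighting `lam` are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def lm_loss(logits, targets):
    """Causal LM term: next-token cross-entropy over a batch of positions.
    logits: (T, V) unnormalized scores; targets: (T,) token ids."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def contrastive_loss(q_emb, p_emb, temperature=0.05):
    """Representational term (InfoNCE, an assumed formulation): each query's
    positive passage sits at the same batch index; the other passages in the
    batch serve as in-batch negatives."""
    q = q_emb / np.linalg.norm(q_emb, axis=-1, keepdims=True)
    p = p_emb / np.linalg.norm(p_emb, axis=-1, keepdims=True)
    sims = q @ p.T / temperature                      # (B, B) cosine similarities
    sims = sims - sims.max(axis=-1, keepdims=True)    # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=-1, keepdims=True))
    return -np.diag(log_probs).mean()                 # -log p(positive | query)

def joint_loss(logits, targets, q_emb, p_emb, lam=1.0):
    """Hypothetical joint objective: generative + lam * representational."""
    return lm_loss(logits, targets) + lam * contrastive_loss(q_emb, p_emb)
```

Under post-retrieval language modeling as described above, the LM targets would additionally include tokens that appear only after retrieval (such as the final answer), so the generative term shapes the same backbone that produces the retrieval embeddings.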