We develop a two-stage retrieval system for the TREC Tip-of-the-Tongue (ToT) task that combines multiple complementary retrieval methods with a learned reranker and LLM-based reranking. In the first stage, we employ hybrid retrieval that merges LLM-based, sparse (BM25), and dense (BGE-M3) retrieval. We also introduce topic-aware multi-index dense retrieval, which partitions the Wikipedia corpus into 24 topical domains. In the second stage, we evaluate both a trained LambdaMART reranker and LLM-based reranking. To support model training, we generate 5,000 synthetic ToT queries using LLMs. Our best system, which combines hybrid retrieval with Gemini-2.5-flash reranking, achieves a recall of 0.66 and an NDCG@1000 of 0.41 on the test set, demonstrating the effectiveness of fusion retrieval.
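To make the hybrid first stage concrete, the following minimal sketch merges ranked lists from several retrievers using reciprocal rank fusion (RRF). RRF is an assumption chosen for illustration; the abstract does not state which fusion rule the system uses. The function name `reciprocal_rank_fusion`, the constant `k=60`, the cutoff `top_n=1000`, and the document IDs are likewise hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=1000):
    """Fuse several ranked lists of document IDs into one list.

    Assumption: RRF stands in for the unspecified fusion step.
    Each document receives 1 / (k + rank) from every list in which
    it appears (rank starts at 1), and the contributions are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by document ID
    # so the output is deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


# Hypothetical candidate lists from the three first-stage retrievers.
llm_candidates = ["d3", "d1"]
bm25_candidates = ["d1", "d2", "d3"]
dense_candidates = ["d1", "d3", "d4"]

fused_ranking = reciprocal_rank_fusion(
    [llm_candidates, bm25_candidates, dense_candidates]
)
print(fused_ranking)
```

Because RRF relies only on rank positions, it does not require BM25 scores, dense similarities, and LLM outputs to be on a comparable scale, which is one reason it is a common default for this kind of fusion. The fused list would then be passed to the second-stage reranker.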