Considering query variance in information retrieval (IR) experiments is beneficial for retrieval effectiveness. In particular, ranking ensembles based on different topically related queries retrieve better results than rankings based on a single query alone. Recently, generative instruction-tuned Large Language Models (LLMs) have improved on a variety of tasks that require capturing human language. To this end, this work explores the feasibility of using synthetic query variants generated by instruction-tuned LLMs in data fusion experiments. More specifically, we introduce a lightweight, unsupervised, and cost-efficient approach that exploits principled prompting and data fusion techniques. In our experiments, LLMs produce more effective queries when provided with additional context information on the topic. Furthermore, our analysis based on four TREC newswire benchmarks shows that data fusion based on synthetic query variants is significantly better than baselines with single queries and also outperforms pseudo-relevance feedback methods. We publicly share the code and query datasets with the community as resources for follow-up studies.
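The abstract does not specify which data fusion method is used to combine the rankings retrieved for the individual query variants. A common unsupervised choice in this setting is reciprocal rank fusion (RRF); the sketch below is a minimal, hypothetical illustration of fusing per-variant rankings, not the authors' exact method:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked document lists from several query variants.

    rankings: list of ranked lists of doc ids (best first),
              one list per query variant.
    k: smoothing constant from the original RRF formulation.
    Returns doc ids sorted by fused score (best first).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical rankings retrieved for three synthetic variants
# of the same topic.
variant_runs = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d3", "d1"],
]
fused = reciprocal_rank_fusion(variant_runs)
```

Documents that rank highly across several variants (here "d2") accumulate the largest fused score, which is the mechanism by which query-variant ensembles can outperform a single-query ranking.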