Query expansion (QE) with large language models is promising but typically relies on hand-crafted prompts, manually chosen exemplars, or a single LLM, which limits scalability and makes it sensitive to domain shift. We present an automated, domain-adaptive QE framework that builds in-domain exemplar pools by harvesting pseudo-relevant passages with a BM25-MonoT5 pipeline. A training-free, cluster-based strategy then selects diverse demonstrations, yielding strong and stable in-context QE without supervision. To exploit model complementarity, we further introduce a two-LLM ensemble in which two heterogeneous LLMs independently generate expansions and a third refinement LLM consolidates them into a single coherent expansion. Across TREC DL20, DBPedia, and SciFact, the refined ensemble delivers consistent, statistically significant gains over BM25, Rocchio, zero-shot, and fixed few-shot baselines. The framework thus offers both a reproducible testbed for exemplar selection and multi-LLM generation and a practical, label-free solution for real-world QE.
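The training-free, cluster-based exemplar selection described above can be illustrated with a minimal sketch. This is a hypothetical instantiation, not the paper's code: it assumes candidate exemplars have already been embedded as vectors, uses greedy farthest-point initialisation followed by a few k-means (Lloyd) steps, and returns one representative exemplar per cluster, so the chosen demonstrations cover diverse regions of the pool.

```python
import numpy as np

def select_diverse_exemplars(embeddings, k, n_iters=10):
    """Pick k diverse exemplar indices from a candidate pool.

    Hypothetical sketch of a training-free, cluster-based strategy:
    farthest-point init + a few Lloyd iterations, then the candidate
    nearest each centroid is taken as that cluster's demonstration.
    """
    X = np.asarray(embeddings, dtype=float)
    # Greedy farthest-point init: start at candidate 0, then repeatedly
    # add the candidate farthest from all centroids chosen so far.
    idx = [0]
    for _ in range(k - 1):
        d = np.min(np.linalg.norm(X[:, None] - X[idx][None], axis=2), axis=1)
        idx.append(int(d.argmax()))
    centroids = X[idx].copy()
    # A few k-means (Lloyd) refinement steps.
    for _ in range(n_iters):
        labels = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = X[labels == c].mean(axis=0)
    # Representative per cluster: the member closest to its centroid.
    dist = np.linalg.norm(X - centroids[labels], axis=1)
    return [int(np.where(labels == c)[0][dist[labels == c].argmin()])
            for c in range(k) if np.any(labels == c)]
```

In use, `embeddings` would hold dense representations of the harvested pseudo-relevant exemplars, and the returned indices pick the demonstrations placed in the in-context QE prompt.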