Query expansion (QE) with large language models (LLMs) is promising but often relies on hand-crafted prompts, manually chosen exemplars, or a single LLM, making it hard to scale and sensitive to domain shift. We present an automated, domain-adaptive QE framework that builds in-domain exemplar pools by harvesting pseudo-relevant passages with a BM25-MonoT5 pipeline. A training-free, cluster-based strategy then selects diverse demonstrations, yielding strong and stable in-context QE without supervision. To further exploit model complementarity, we introduce a two-LLM ensemble in which two heterogeneous LLMs independently generate expansions and a refinement LLM consolidates them into a single coherent expansion. Across TREC DL20, DBpedia, and SciFact, the refined ensemble delivers consistent and statistically significant gains over BM25, Rocchio, zero-shot, and fixed few-shot baselines. The framework offers a reproducible testbed for exemplar selection and multi-LLM generation, and a practical, label-free solution for real-world QE.
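The cluster-based selection step can be illustrated with a minimal sketch: cluster the harvested pseudo-relevant passages in an embedding space and pick the passage nearest each centroid as a diverse demonstration. The embedding dimensionality, number of clusters, and the tiny k-means below are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical sketch of cluster-based demonstration selection.
# Assumptions: passages already have embedding vectors (e.g. from a
# sentence encoder); a toy Lloyd-style k-means stands in for whatever
# clustering the framework actually uses.
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vecs):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]


def kmeans(vectors, k, iters=20, seed=0):
    """Tiny Lloyd k-means; returns k centroid vectors."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            idx = min(range(k), key=lambda i: dist(v, centroids[i]))
            clusters[idx].append(v)
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


def select_demonstrations(passages, embeddings, k):
    """Pick one passage per cluster (the one nearest each centroid),
    giving a diverse, training-free demonstration set."""
    centroids = kmeans(embeddings, k)
    picks = []
    for c in centroids:
        idx = min(range(len(embeddings)),
                  key=lambda i: dist(embeddings[i], c))
        picks.append(passages[idx])
    return picks
```

With k well-separated groups of passages, the selected demonstrations come from distinct clusters, which is the diversity property the in-context QE step relies on.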