Generating Query Recommendations via LLMs

Query recommendation systems are ubiquitous in modern search engines, assisting users in producing effective queries to meet their information needs. However, these systems require a large amount of data to produce good recommendations, such as a large collection of documents to index and query logs. In particular, query logs and user data are not available in cold-start scenarios. Moreover, query logs are expensive to collect and maintain, and existing systems rely on complex and time-consuming cascading pipelines for creating, combining, and ranking recommendations. To address these issues, we frame the query recommendation problem as a generative task and propose a novel approach called Generative Query Recommendation (GQR). GQR uses an LLM as its foundation and does not need to be trained or fine-tuned to tackle the query recommendation problem. We design a prompt that enables the LLM to understand the specific recommendation task, even from a single example. We then improve our system by proposing a version that exploits query logs, called Retriever-Augmented GQR (RA-GQR). RA-GQR dynamically composes its prompt by retrieving similar queries from query logs. Both GQR approaches reuse a pre-existing neural architecture, resulting in a simpler and more ready-to-market solution, even in a cold-start scenario. GQR obtains state-of-the-art performance in terms of NDCG@10 and clarity score against two commercial search engines and the previous state-of-the-art approach on the Robust04 and ClueWeb09B collections, improving NDCG@10 on average by up to ~4% on Robust04 and ClueWeb09B w.r.t. the previous best competitor. RA-GQR further improves NDCG@10, obtaining increases of ~11% and ~6% on Robust04 and ClueWeb09B, respectively, w.r.t. the best competitor. Furthermore, our system obtained ~59% of user preferences in a blind user study, indicating that our method produces the most engaging queries.
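
To make the retrieval-augmented prompting idea concrete, the following is a minimal Python sketch of how a prompt could be composed from a query log. It is an illustration under stated assumptions, not the prompt or retriever used in this work: the Jaccard token-overlap score, the prompt wording, the `QUERY_LOG` contents, and the helper names `jaccard`, `retrieve_similar`, and `compose_prompt` are all hypothetical. The call to the LLM itself is left out.

```python
from typing import Dict, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two queries.

    Assumption: a simple stand-in for whatever retriever scores log entries.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve_similar(
    query: str, log: Dict[str, List[str]], k: int = 2
) -> List[Tuple[str, List[str]]]:
    """Return the k logged queries most similar to `query`, with their follow-ups."""
    ranked = sorted(
        log.items(), key=lambda item: jaccard(query, item[0]), reverse=True
    )
    return ranked[:k]


def compose_prompt(query: str, examples: List[Tuple[str, List[str]]]) -> str:
    """Build a few-shot prompt: retrieved examples first, then the new query."""
    lines = ["Suggest related search queries for the user's query.", ""]
    for past_query, suggestions in examples:
        lines.append(f"Query: {past_query}")
        lines.append("Recommendations: " + "; ".join(suggestions))
        lines.append("")
    lines.append(f"Query: {query}")
    lines.append("Recommendations:")
    return "\n".join(lines)


# Hypothetical query log: past query -> queries users issued afterwards.
QUERY_LOG = {
    "cheap flights to rome": ["rome flight deals", "low cost airlines italy"],
    "python list sort": ["python sorted vs sort", "sort list of dicts python"],
    "flights to paris": ["paris flight deals", "best time to fly to paris"],
}

user_query = "cheap flights to paris"
prompt = compose_prompt(user_query, retrieve_similar(user_query, QUERY_LOG, k=2))
print(prompt)
```

In a cold-start setting with no query log, the same `compose_prompt` helper could be called with a single hand-written example instead of retrieved ones, which corresponds to the fixed-prompt GQR variant described above.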
