Unifying Ranking and Generation in Query Auto-Completion via Retrieval-Augmented Generation and Multi-Objective Alignment

Kai Yuan, Anthony Zheng, Jia Hu, Divyanshu Sheth, Hemanth Velaga, Kylee Kim, Matteo Guarrera, Besim Avci, Jianhua Li, Xuetao Yin, Rajyashree Mukherjee, Sean Suchter

From arXiv; 11 pages, 4 figures

Query Auto-Completion (QAC) suggests query completions as users type, helping them articulate intent and reach results more efficiently. Existing approaches face fundamental challenges: traditional retrieve-and-rank pipelines have limited long-tail coverage and require extensive feature engineering, while recent generative methods suffer from hallucination and safety risks. We present a unified framework that reformulates QAC as end-to-end list generation through Retrieval-Augmented Generation (RAG) and multi-objective Direct Preference Optimization (DPO). Our approach makes three contributions: (1) a reformulation of QAC as end-to-end list generation with multi-objective optimization; (2) a suite of rule-based, model-based, and LLM-as-judge verifiers for QAC, used within a methodology that combines RAG, multi-objective DPO, and iterative critique-revision to produce high-quality synthetic data; (3) a hybrid serving architecture that enables efficient production deployment under strict latency constraints. Evaluation on a large-scale commercial search platform shows substantial improvements: offline metrics improve across all dimensions, human evaluation yields preference scores of +0.40 to +0.69, and a controlled online experiment achieves a 5.44% reduction in keystrokes and a 3.46% increase in suggestion adoption. These results indicate that unified generation with RAG and multi-objective alignment is an effective solution for production QAC. This work represents a shift to end-to-end generation powered by large language models, RAG, and multi-objective alignment, and establishes a production-validated framework that can benefit the broader search and recommendation industry.
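
To make the multi-objective DPO idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each candidate suggestion list has already been scored by verifiers, that the scores are combined by a weighted sum, and that the best and worst lists form a preference pair. The objective names, the weights, and the helper names (`WEIGHTS`, `aggregate`, `build_pair`) are hypothetical; the abstract does not specify them. Only `dpo_loss` follows the standard published DPO objective for a single preference pair.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical objectives and weights; the abstract does not list them.
WEIGHTS: Dict[str, float] = {"relevance": 0.5, "safety": 0.3, "diversity": 0.2}

Candidate = Tuple[List[str], Dict[str, float]]  # (suggestion list, verifier scores)


def aggregate(scores: Dict[str, float]) -> float:
    """Weighted sum of per-objective verifier scores (assumed to lie in [0, 1])."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


def build_pair(candidates: List[Candidate]) -> Tuple[List[str], List[str]]:
    """Return (chosen, rejected): the lists with the highest and lowest aggregate score."""
    ranked = sorted(candidates, key=lambda c: aggregate(c[1]), reverse=True)
    return ranked[0][0], ranked[-1][0]


def dpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    The margin is the policy-vs-reference log-ratio of the chosen list
    minus the same log-ratio for the rejected list.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    z = beta * margin
    # Numerically stable softplus(-z), which equals -log(sigmoid(z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Usage with made-up scores and log-probabilities for the prefix "iph".
candidates: List[Candidate] = [
    (["iphone 15", "iphone case"], {"relevance": 0.9, "safety": 1.0, "diversity": 0.5}),
    (["iphne", "iph"], {"relevance": 0.4, "safety": 0.2, "diversity": 0.9}),
]
chosen, rejected = build_pair(candidates)
loss = dpo_loss(-1.0, -5.0, -2.0, -2.0)
```

In this sketch the loss falls below log 2 once the policy prefers the chosen list more strongly than the reference model does, which is the behavior the DPO objective rewards. A production system would compute the log-probabilities from the generator and a frozen reference model over batches of pairs.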
