In the rapidly evolving landscape of information retrieval, search engines strive to provide more personalized and relevant results to users. Query suggestion systems play a crucial role in achieving this goal by assisting users in formulating effective queries. However, existing query suggestion systems mainly rely on textual inputs, potentially limiting user search experiences for querying images. In this paper, we introduce a novel Multimodal Query Suggestion (MMQS) task, which aims to generate query suggestions based on user query images to improve the intentionality and diversity of search results. We present the RL4Sugg framework, leveraging the power of Large Language Models (LLMs) with Multi-Agent Reinforcement Learning from Human Feedback to optimize the generation process. Through comprehensive experiments, we validate the effectiveness of RL4Sugg, demonstrating a 18% improvement compared to the best existing approach. Moreover, the MMQS has been transferred into real-world search engine products, which yield enhanced user engagement. Our research advances query suggestion systems and provides a new perspective on multimodal information retrieval.
翻译:在信息检索快速发展的背景下,搜索引擎致力于为用户提供更加个性化和相关的结果。查询建议系统通过帮助用户制定有效的查询语句,在实现这一目标中发挥着关键作用。然而,现有的查询建议系统主要依赖文本输入,可能导致用户在查询图像时的搜索体验受限。本文提出了一个新颖的多模态查询建议(MMQS)任务,旨在基于用户查询图像生成查询建议,以提升搜索结果的意图性和多样性。我们提出了RL4Sugg框架,借助大型语言模型(LLMs)的能力,结合基于人类反馈的多智能体强化学习来优化生成过程。通过全面的实验验证了RL4Sugg的有效性,其相比现有最佳方法取得了18%的性能提升。此外,MMQS已成功应用于现实搜索引擎产品中,显著提升了用户参与度。本研究推进了查询建议系统的发展,为多模态信息检索提供了新的视角。