Predicting a user's next search query from recent interaction behaviors is a critical problem in modern e-commerce systems, particularly in scenarios where user intent evolves rapidly. Large Language Models (LLMs) offer strong semantic reasoning capabilities and have recently been adopted to enhance training data construction for next-query prediction. However, because of resource constraints on mobile devices, existing LLM-based applications are deployed on cloud servers, which incurs high inference costs. In this paper, we propose RecGPT-Mobile, a framework built around a lightweight LLM-based intent understanding agent that improves recommendation quality in mobile e-commerce scenarios. By deploying LLMs directly on mobile devices, our approach captures users' evolving interests more quickly and adjusts recommendation results in real time. Extensive offline analyses and online experiments demonstrate that our method significantly improves the accuracy of recommendation results. It offers a practical path for deploying LLMs in production-scale recommendation systems on mobile devices, as well as a scalable solution for integrating LLMs into real-world next-query prediction systems.