Query Processing (QP) bridges user intent and content supply in large-scale Social Network Service (SNS) search engines. Traditional QP systems rely on pipelines of isolated discriminative models (e.g., BERT), suffering from limited semantic understanding and high maintenance overhead. While Large Language Models (LLMs) offer a potential solution, existing approaches often optimize sub-tasks in isolation, neglecting their intrinsic semantic synergy and requiring each component to be iterated independently. Moreover, standard generative methods often lack grounding in SNS scenarios, failing to bridge the gap between open-domain corpora and informal SNS linguistic patterns, while struggling to adhere to rigorous business definitions. We present QP-OneModel, a unified generative LLM for multi-task query understanding in the SNS domain. We reformulate heterogeneous sub-tasks into a unified sequence-generation paradigm, adopting a progressive three-stage alignment strategy that culminates in multi-reward Reinforcement Learning. Furthermore, QP-OneModel generates intent descriptions as a novel high-fidelity semantic signal, effectively augmenting downstream tasks such as query rewriting and ranking. Offline evaluations show QP-OneModel achieves a 7.35% overall gain over discriminative baselines, with significant F1 boosts in NER (+9.01%) and Term Weighting (+9.31%). It also exhibits superior generalization, surpassing a 32B model by 7.60% accuracy on unseen tasks. Fully deployed at Xiaohongshu, online A/B tests confirm its industrial value, improving retrieval relevance (DCG) by 0.21% and lifting user retention by 0.044%.
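To make the unified sequence-generation paradigm concrete, the following is a minimal sketch (not the paper's actual schema) of how heterogeneous QP sub-tasks such as NER, term weighting, and intent description might be serialized into a single text-to-text format for one generative model. All task names, prompt templates, and output serializations here are illustrative assumptions.

```python
# Illustrative sketch: casting heterogeneous query-processing sub-tasks into
# one (input text, target text) pair format, so a single generative model can
# be trained on all of them. Templates below are hypothetical, not the paper's.

def to_seq2seq(task: str, query: str, label) -> tuple[str, str]:
    """Serialize a (task, query, label) triple into an (input, target) text pair."""
    if task == "ner":
        # label: list of (span, entity_type) tuples
        target = "; ".join(f"{span} -> {etype}" for span, etype in label)
    elif task == "term_weighting":
        # label: dict mapping each query term to its importance weight
        target = "; ".join(f"{term}: {w:.2f}" for term, w in label.items())
    elif task == "intent_description":
        # label: free-text intent description (the high-fidelity semantic signal)
        target = label
    else:
        raise ValueError(f"unknown task: {task}")
    # A shared input template lets one model dispatch on the task tag.
    source = f"[{task}] query: {query}"
    return source, target

# Example training pairs for three different sub-tasks, one shared format:
pairs = [
    to_seq2seq("ner", "sunscreen for oily skin",
               [("sunscreen", "PRODUCT"), ("oily skin", "ATTRIBUTE")]),
    to_seq2seq("term_weighting", "best ramen tokyo",
               {"ramen": 0.90, "tokyo": 0.80, "best": 0.30}),
    to_seq2seq("intent_description", "ootd winter",
               "User seeks winter outfit-of-the-day inspiration."),
]
for src, tgt in pairs:
    print(src, "=>", tgt)
```

Because every sub-task shares one input/output contract, adding a new task only requires a new serialization branch and training data, rather than a new standalone model in the pipeline.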