Industrial-scale user representation learning requires balancing robust universality with acute task-sensitivity. However, existing paradigms primarily yield static, task-agnostic embeddings that struggle to reconcile the divergent requirements of downstream scenarios within unified vector spaces. Furthermore, heterogeneous multi-source data introduces inherent noise and modality conflicts, degrading representation quality. We propose Query-as-Anchor, a framework that shifts user modeling from static encoding to dynamic, query-aware synthesis. To empower Large Language Models (LLMs) with deep user understanding, we first construct UserU, an industrial-scale pre-training dataset that aligns multi-modal behavioral sequences with user-understanding semantics. Our Q-Anchor Embedding architecture then integrates hierarchical coarse-to-fine encoders into dual-tower LLMs via joint contrastive-autoregressive optimization for query-aware user representation. To bridge the gap between general pre-training and specialized business logic, we further introduce Cluster-based Soft Prompt Tuning to enforce discriminative latent structures, effectively aligning model attention with scenario-specific modalities. For deployment, anchoring queries at sequence termini enables KV-cache-accelerated inference with negligible incremental latency. Evaluations on 10 Alipay industrial benchmarks show consistent SOTA performance, strong scalability, and efficient deployment. Large-scale online A/B testing across two real-world scenarios in Alipay's production system further validates its practical effectiveness. Our code will be publicly available at: https://github.com/JhCircle/Q-Anchor.
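The deployment claim above rests on a standard property of causal attention: if the query tokens are appended after the user behavior sequence, the K/V states of the (shared) behavior prefix can be cached once and reused across queries, so each new query pays only for its own tokens. The following is a minimal single-head sketch in NumPy illustrating that property; the weights, shapes, and `attend` helper are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size (toy value)

# Hypothetical projection weights for one single-head causal attention layer.
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend(x, cache=None):
    """Causal attention over new tokens x; returns outputs and the (K, V) cache."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    if cache is not None:  # prepend cached keys/values from the shared prefix
        k = np.concatenate([cache[0], k])
        v = np.concatenate([cache[1], v])
    n_new, n_all = x.shape[0], k.shape[0]
    offset = n_all - n_new
    scores = (q @ k.T) / np.sqrt(d)
    # Causal mask: new token i may attend only to positions <= offset + i.
    mask = np.arange(n_all)[None, :] > (offset + np.arange(n_new))[:, None]
    scores = np.where(mask, -np.inf, scores)
    return softmax(scores) @ v, (k, v)

behavior = rng.standard_normal((16, d))  # user behavior sequence (shared prefix)
_, kv = attend(behavior)                 # encode once, cache K/V

for _ in range(3):  # different downstream queries reuse the same prefix cache
    query = rng.standard_normal((2, d))          # query tokens anchored at the end
    fast, _ = attend(query, cache=kv)            # incremental pass: 2 tokens only
    full, _ = attend(np.concatenate([behavior, query]))  # full recomputation
    assert np.allclose(fast, full[-2:])          # identical query representations
print("cache reuse verified")
```

Because the query-dependent tokens come last, swapping the query never invalidates the cached prefix, which is why the incremental latency per query is small and roughly independent of the behavior-sequence length.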