Large Language Model (LLM) agents are increasingly used in real-world products, where personalized and context-aware user interactions are essential. A central enabler of such capabilities is the agent's long-term semantic memory system, which extracts implicit and explicit signals from noisy longitudinal behavioral data, stores them in a structured form, and supports low-latency retrieval. Building industrial-grade long-term memory for LLM agents raises five challenges: scalability, low-latency retrieval, privacy constraints, adaptability, and observability. We introduce the Hierarchical Long-Term Semantic Memory (HLTM) framework, which organizes textual data into a schema-aligned memory tree that captures semantic knowledge at multiple levels of granularity, enabling scalable ingestion, privacy-aware storage, low-latency retrieval, and transparent provenance; HLTM further incorporates an adaptation mechanism to generalize across diverse use cases. Extensive evaluations on LinkedIn's Hiring Assistant show that HLTM improves answer correctness by more than 5% and retrieval F1 by more than 10%, while significantly advancing the Pareto frontier between query and indexing latency. HLTM has been fully deployed in LinkedIn's Hiring Assistant to power core personalization features in production hiring workflows.
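To make the idea of a schema-aligned memory tree concrete, the following is a minimal sketch and not HLTM's actual implementation, which the abstract does not specify. It assumes that the schema is a fixed set of allowed paths, that each stored fact carries a source identifier for provenance, and that retrieval returns everything at or below a queried path. The names `MemoryNode`, `MemoryTree`, `ingest`, and `retrieve` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryNode:
    """One node of a hypothetical schema-aligned memory tree."""
    name: str
    # Each fact is stored as (text, source_id) so its provenance stays visible.
    facts: list = field(default_factory=list)
    children: dict = field(default_factory=dict)


class MemoryTree:
    """Toy hierarchical memory: coarse topics near the root, finer ones below."""

    def __init__(self, schema):
        # Assumption: the schema is a fixed set of allowed paths (tuples of keys).
        self.schema = {tuple(p) for p in schema}
        self.root = MemoryNode("root")

    def ingest(self, path, text, source_id):
        """Store a fact under a schema path, creating intermediate nodes."""
        path = tuple(path)
        if path not in self.schema:
            raise ValueError(f"path {path} is not part of the schema")
        node = self.root
        for key in path:
            node = node.children.setdefault(key, MemoryNode(key))
        node.facts.append((text, source_id))

    def retrieve(self, path):
        """Return all (text, source_id) facts stored at or below `path`."""
        node = self.root
        for key in path:
            node = node.children.get(key)
            if node is None:
                return []
        found, stack = [], [node]
        while stack:  # iterative depth-first walk over the subtree
            current = stack.pop()
            found.extend(current.facts)
            stack.extend(current.children.values())
        return found


tree = MemoryTree([("preferences", "location"), ("preferences", "seniority")])
tree.ingest(("preferences", "location"), "prefers remote candidates", "session-1")
tree.ingest(("preferences", "seniority"), "targets senior engineers", "session-2")
coarse = tree.retrieve(("preferences",))           # both facts, any order
fine = tree.retrieve(("preferences", "location"))  # only the location fact
```

Querying a shorter path returns a coarser view and a longer path a finer one, which is one simple way to read "multiple levels of granularity". A production system would also need the summarization, privacy controls, and adaptation that the abstract mentions, none of which this sketch attempts.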