推荐系统近零开销的新鲜度保障：基于推理端模型更新 (Near-Zero-Overhead Freshness for Recommendation Systems via Inference-Side Model Updates) - 专知论文

会员服务 ·

0

系统 · 模型更新 · 零开销 · 低秩 · 自适应 ·

2025 年 12 月 17 日

Near-Zero-Overhead Freshness for Recommendation Systems via Inference-Side Model Updates

翻译：推荐系统近零开销的新鲜度保障：基于推理端模型更新

Wenjun Yu,Sitian Chen,Cheng Chen,Amelie Chi Zhou

from arxiv, Accepted by HPCA 2026

Deep Learning Recommendation Models (DLRMs) underpin personalized services but face a critical freshness-accuracy tradeoff due to massive parameter synchronization overheads. Production DLRMs deploy decoupled training/inference clusters, where synchronizing petabyte-scale embedding tables (EMTs) causes multi-minute staleness, degrading recommendation quality and revenue. We observe that (1) inference nodes exhibit sustained CPU underutilization (peak <= 20%), and (2) EMT gradients possess intrinsic low-rank structure, enabling compact update representation. We present LiveUpdate, a system that eliminates inter-cluster synchronization by colocating Low-Rank Adaptation (LoRA) trainers within inference nodes. LiveUpdate addresses two core challenges: (1) dynamic rank adaptation via singular value monitoring to constrain memory overhead (<2% of EMTs), and (2) NUMA-aware resource scheduling with hardware-enforced QoS to eliminate update inference contention (P99 latency impact <20ms). Evaluations show LiveUpdate reduces update costs by 2x versus delta-update baselines while achieving higher accuracy within 1-hour windows. By transforming idle inference resources into freshness engines, LiveUpdate delivers online model updates while outperforming state-of-the-art delta-update methods by 0.04% to 0.24% in accuracy.

翻译：深度学习推荐模型（DLRMs）支撑着个性化服务，但由于海量参数同步开销，面临关键的新鲜度-准确性权衡。生产环境中的DLRMs采用解耦的训练/推理集群架构，同步PB级嵌入表（EMTs）会导致数分钟的陈旧性，从而降低推荐质量和收入。我们观察到：（1）推理节点持续存在CPU利用率不足（峰值≤20%）；（2）EMT梯度具有内在的低秩结构，可实现紧凑的更新表示。本文提出LiveUpdate系统，通过在推理节点内部署低秩自适应（LoRA）训练器，消除集群间同步开销。LiveUpdate解决了两个核心挑战：（1）通过奇异值监测实现动态秩自适应，以控制内存开销（<EMTs的2%）；（2）采用NUMA感知的资源调度与硬件强化的服务质量保障，消除更新对推理的干扰（P99延迟影响<20毫秒）。评估表明，LiveUpdate相比增量更新基线将更新成本降低2倍，并在1小时窗口内实现更高的准确性。通过将闲置推理资源转化为新鲜度引擎，LiveUpdate在提供在线模型更新的同时，其准确性较最先进的增量更新方法提升0.04%至0.24%。

0

相关内容

Deep Research（深度研究）：系统性综述

Deep Research（深度研究）：系统性综述

专知会员服务

50+阅读 · 2025年12月3日

DeepSeek模型综述：V1 V2 V3 R1-Zero

DeepSeek模型综述：V1 V2 V3 R1-Zero

专知会员服务

116+阅读 · 2025年2月11日

【KDD2023】半监督图不平衡回归

【KDD2023】半监督图不平衡回归

专知会员服务

26+阅读 · 2023年5月24日

【KDD2021】半个性化用户冷启动推荐系统

专知会员服务

26+阅读 · 2021年6月9日

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

专知会员服务

195+阅读 · 2020年5月31日

【KDD2020-阿里巴巴】M2GRL-多任务多视角图表示学习的Web级推荐系统

【KDD2020-阿里巴巴】M2GRL-多任务多视角图表示学习的Web级推荐系统

专知会员服务

37+阅读 · 2020年5月22日

GeoffreyHinton-ICML2020投稿论文-偏转对抗攻击 Deflecting Adversarial Attacks

GeoffreyHinton-ICML2020投稿论文-偏转对抗攻击 Deflecting Adversarial Attacks

专知会员服务

24+阅读 · 2020年2月22日

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

专知会员服务

12+阅读 · 2020年1月7日

实时强化学习《Real-Time Reinforcement Learning》S Ramstedt, C Pal [Mila, Element AI] (2019)

实时强化学习《Real-Time Reinforcement Learning》S Ramstedt, C Pal [Mila, Element AI] (2019)

专知会员服务

13+阅读 · 2019年11月17日

【RecSys 2019报告】基于对话的推荐（Context Adaptation with Session‐based Recommenders）

【RecSys 2019报告】基于对话的推荐（Context Adaptation with Session‐based Recommenders）

专知会员服务

33+阅读 · 2019年9月20日

Kaggle知识点：伪标签Pseudo Label

Kaggle知识点：伪标签Pseudo Label

AINLP

40+阅读 · 2020年8月9日

【CVPR 2020 Oral】小样本类增量学习

【CVPR 2020 Oral】小样本类增量学习

专知

20+阅读 · 2020年6月26日

iOS如何区分App和SDK内部crash

iOS如何区分App和SDK内部crash

CocoaChina

11+阅读 · 2019年4月17日

干货——图像分类（下）

干货——图像分类（下）

计算机视觉战队

14+阅读 · 2018年8月28日

论文笔记之Feature Selective Networks for Object Detection

论文笔记之Feature Selective Networks for Object Detection

统计学习与视觉计算组

21+阅读 · 2018年7月26日

误差反向传播——CNN

误差反向传播——CNN

统计学习与视觉计算组

31+阅读 · 2018年7月12日

CosFace: Large Margin Cosine Loss for Deep Face Recognition论文笔记

CosFace: Large Margin Cosine Loss for Deep Face Recognition论文笔记

统计学习与视觉计算组

44+阅读 · 2018年4月25日

读论文Discriminative Deep Metric Learning for Face and KV

读论文Discriminative Deep Metric Learning for Face and KV

统计学习与视觉计算组

12+阅读 · 2018年4月6日

论文浅尝 | Know-Evolve: Deep Temporal Reasoning for Dynamic KG

论文浅尝 | Know-Evolve: Deep Temporal Reasoning for Dynamic KG

开放知识图谱

36+阅读 · 2018年3月30日

LibRec 每周算法：DeepFM

LibRec 每周算法：DeepFM

LibRec智能推荐

14+阅读 · 2017年11月6日

结合知识图谱的概率话题模型研究

国家自然科学基金

10+阅读 · 2015年12月31日

分布式有监督学习的学习理论

国家自然科学基金

17+阅读 · 2015年12月31日

基于高效蒙特卡罗策略的最优化方法及应用研究

国家自然科学基金

9+阅读 · 2015年12月31日

社会网秘密共享中的关键问题研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

46+阅读 · 2015年12月31日

基于非对称群体兴趣相关性并融合情境与群体信任的Web服务推荐研究

国家自然科学基金

1+阅读 · 2015年12月31日

复杂多元数据的半参数统计推断

国家自然科学基金

5+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

基于第三方的APP软件质量度量和评估方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

开放动态环境下在线机器学习理论与方法

国家自然科学基金

11+阅读 · 2013年12月31日

DeepSeek-V3 Technical Report

Arxiv

18+阅读 · 2024年12月27日

Is ChatGPT a Good Recommender? A Preliminary Study

Arxiv

175+阅读 · 2023年4月20日

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

Arxiv

42+阅读 · 2023年4月19日

A Comprehensive Survey on Deep Graph Representation Learning

Arxiv

109+阅读 · 2023年4月11日

A Survey of Large Language Models

A Survey of Large Language Models

Arxiv

499+阅读 · 2023年3月31日

Unleashing the Power of Edge-Cloud Generative AI in Mobile Networks: A Survey of AIGC Services

Arxiv

154+阅读 · 2023年3月29日

Knowledge Graphs: Opportunities and Challenges

Arxiv

181+阅读 · 2023年3月24日

Sparks of Artificial General Intelligence: Early experiments with GPT-4

Arxiv

51+阅读 · 2023年3月22日

A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?

Arxiv

88+阅读 · 2023年3月21日

Data-centric Artificial Intelligence: A Survey

Arxiv

27+阅读 · 2023年3月17日

VIP会员

文章信息

相关主题

相关VIP内容

Deep Research（深度研究）：系统性综述

Deep Research（深度研究）：系统性综述

专知会员服务

50+阅读 · 2025年12月3日

DeepSeek模型综述：V1 V2 V3 R1-Zero

DeepSeek模型综述：V1 V2 V3 R1-Zero

专知会员服务

116+阅读 · 2025年2月11日

【KDD2023】半监督图不平衡回归

【KDD2023】半监督图不平衡回归

专知会员服务

26+阅读 · 2023年5月24日

【KDD2021】半个性化用户冷启动推荐系统

专知会员服务

26+阅读 · 2021年6月9日

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

专知会员服务

195+阅读 · 2020年5月31日

【KDD2020-阿里巴巴】M2GRL-多任务多视角图表示学习的Web级推荐系统

【KDD2020-阿里巴巴】M2GRL-多任务多视角图表示学习的Web级推荐系统

专知会员服务

37+阅读 · 2020年5月22日

GeoffreyHinton-ICML2020投稿论文-偏转对抗攻击 Deflecting Adversarial Attacks

GeoffreyHinton-ICML2020投稿论文-偏转对抗攻击 Deflecting Adversarial Attacks

专知会员服务

24+阅读 · 2020年2月22日

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

专知会员服务

12+阅读 · 2020年1月7日

实时强化学习《Real-Time Reinforcement Learning》S Ramstedt, C Pal [Mila, Element AI] (2019)

实时强化学习《Real-Time Reinforcement Learning》S Ramstedt, C Pal [Mila, Element AI] (2019)

专知会员服务

13+阅读 · 2019年11月17日

【RecSys 2019报告】基于对话的推荐（Context Adaptation with Session‐based Recommenders）

【RecSys 2019报告】基于对话的推荐（Context Adaptation with Session‐based Recommenders）

专知会员服务

33+阅读 · 2019年9月20日

热门VIP内容

开通专知VIP会员享更多权益服务

论学习、公平性与复杂度

《整合杀伤链：一个用于边缘目标验证与战术推理的零样本框架》最新资料

2025中国人工智能学会系列白皮书⸺棋盘上的人工智能|附下载

通用智能体评估的逻辑架构

相关资讯

Kaggle知识点：伪标签Pseudo Label

Kaggle知识点：伪标签Pseudo Label

AINLP

40+阅读 · 2020年8月9日

【CVPR 2020 Oral】小样本类增量学习

【CVPR 2020 Oral】小样本类增量学习

专知

20+阅读 · 2020年6月26日

iOS如何区分App和SDK内部crash

iOS如何区分App和SDK内部crash

CocoaChina

11+阅读 · 2019年4月17日

干货——图像分类（下）

干货——图像分类（下）

计算机视觉战队

14+阅读 · 2018年8月28日

论文笔记之Feature Selective Networks for Object Detection

论文笔记之Feature Selective Networks for Object Detection

统计学习与视觉计算组

21+阅读 · 2018年7月26日

误差反向传播——CNN

误差反向传播——CNN

统计学习与视觉计算组

31+阅读 · 2018年7月12日

CosFace: Large Margin Cosine Loss for Deep Face Recognition论文笔记

CosFace: Large Margin Cosine Loss for Deep Face Recognition论文笔记

统计学习与视觉计算组

44+阅读 · 2018年4月25日

读论文Discriminative Deep Metric Learning for Face and KV

读论文Discriminative Deep Metric Learning for Face and KV

统计学习与视觉计算组

12+阅读 · 2018年4月6日

论文浅尝 | Know-Evolve: Deep Temporal Reasoning for Dynamic KG

论文浅尝 | Know-Evolve: Deep Temporal Reasoning for Dynamic KG

开放知识图谱

36+阅读 · 2018年3月30日

LibRec 每周算法：DeepFM

LibRec 每周算法：DeepFM

LibRec智能推荐

14+阅读 · 2017年11月6日

相关论文

DeepSeek-V3 Technical Report

Arxiv

18+阅读 · 2024年12月27日

Is ChatGPT a Good Recommender? A Preliminary Study

Arxiv

175+阅读 · 2023年4月20日

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

Arxiv

42+阅读 · 2023年4月19日

A Comprehensive Survey on Deep Graph Representation Learning

Arxiv

109+阅读 · 2023年4月11日

A Survey of Large Language Models

A Survey of Large Language Models

Arxiv

499+阅读 · 2023年3月31日

Unleashing the Power of Edge-Cloud Generative AI in Mobile Networks: A Survey of AIGC Services

Arxiv

154+阅读 · 2023年3月29日

Knowledge Graphs: Opportunities and Challenges

Arxiv

181+阅读 · 2023年3月24日

Sparks of Artificial General Intelligence: Early experiments with GPT-4

Arxiv

51+阅读 · 2023年3月22日

A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?

Arxiv

88+阅读 · 2023年3月21日

Data-centric Artificial Intelligence: A Survey

Arxiv

27+阅读 · 2023年3月17日

相关基金

结合知识图谱的概率话题模型研究

国家自然科学基金

10+阅读 · 2015年12月31日

分布式有监督学习的学习理论

国家自然科学基金

17+阅读 · 2015年12月31日

基于高效蒙特卡罗策略的最优化方法及应用研究

国家自然科学基金

9+阅读 · 2015年12月31日

社会网秘密共享中的关键问题研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

46+阅读 · 2015年12月31日

基于非对称群体兴趣相关性并融合情境与群体信任的Web服务推荐研究

国家自然科学基金

1+阅读 · 2015年12月31日

复杂多元数据的半参数统计推断

国家自然科学基金

5+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

基于第三方的APP软件质量度量和评估方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

开放动态环境下在线机器学习理论与方法

国家自然科学基金

11+阅读 · 2013年12月31日

微信扫码咨询专知VIP会员