MMGRid: Navigating Temporal-aware and Cross-domain Generative Recommendation via Model Merging

Model merging (MM) offers an efficient mechanism for integrating multiple specialized models without access to the original training data or costly retraining. While MM has demonstrated success in domains such as computer vision, its role in recommender systems (RSs) remains largely unexplored. Recently, Generative Recommendation (GR) has emerged as a new paradigm in RSs, characterized by rapidly growing model scales and substantial computational costs, which makes MM particularly appealing for cost-sensitive deployment scenarios. In this work, we present the first systematic study of MM in GR through a contextual lens. We focus on a fundamental yet underexplored real-world challenge: how to merge generative recommenders specialized for different contexts, which arise from temporally evolving user behaviors and heterogeneous application domains. To this end, we propose MMGRid, a unified framework built on a structured contextual grid of GR checkpoints that organizes models trained under diverse contexts induced by temporal evolution and domain diversity. All checkpoints are derived from a shared base LLM but fine-tuned on context-specific data, forming a realistic and controlled model space for systematically analyzing MM across GR paradigms and merging algorithms. Our investigation reveals several key insights. First, training GR models from LLMs can introduce parameter conflicts during merging due to token distribution shifts and objective disparities; such conflicts can be alleviated by disentangling task-aware and context-specific parameter changes via base model replacement. Second, incremental training across contexts induces recency bias, which can be effectively balanced through weighted contextual merging. Notably, we observe that optimal merging weights correlate with context-dependent interaction characteristics, offering practical guidance for weight selection in real-world deployments.
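To make the two mechanisms named above more concrete, the following is a minimal sketch of weighted merging over parameter deltas, not the paper's implementation. It assumes a task-arithmetic style merge in which each context-specific checkpoint contributes its difference from a chosen base, scaled by a per-context weight. The names `task_vector` and `weighted_merge`, the toy parameter dictionaries, and the weights are all hypothetical. Under this reading, "base model replacement" corresponds to choosing a different `base` argument: for example, a recommendation-tuned checkpoint instead of the original LLM, so that the deltas would capture mostly context-specific change.

```python
from typing import Dict, List, Sequence

# Toy stand-in for a model state dict: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def task_vector(finetuned: StateDict, base: StateDict) -> StateDict:
    """Per-parameter difference between a fine-tuned checkpoint and a base."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def weighted_merge(
    base: StateDict,
    checkpoints: Sequence[StateDict],
    weights: Sequence[float],
) -> StateDict:
    """Add the weighted deltas of each context checkpoint onto the base.

    Assumption: a task-arithmetic style merge. Which checkpoint serves as
    `base` is left to the caller; swapping it is one possible reading of
    "base model replacement".
    """
    if len(checkpoints) != len(weights):
        raise ValueError("need exactly one weight per checkpoint")
    merged = {name: list(values) for name, values in base.items()}
    for checkpoint, weight in zip(checkpoints, weights):
        delta = task_vector(checkpoint, base)
        for name in merged:
            merged[name] = [
                m + weight * d for m, d in zip(merged[name], delta[name])
            ]
    return merged


# Hypothetical two-context example (e.g. an older and a newer time slice).
base = {"w": [0.0, 0.0]}
older_context = {"w": [1.0, 0.0]}
newer_context = {"w": [0.0, 2.0]}
merged = weighted_merge(base, [older_context, newer_context], [0.5, 0.5])
print(merged)  # {'w': [0.5, 1.0]}
```

In this sketch, raising the weight of the older context relative to the newer one is how recency bias from incremental training would be counterbalanced; how the weights should actually be chosen is the empirical question the study addresses.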
