Cold-start recommendation remains a central challenge on dynamic, open-world platforms: models must recommend to newly registered users (user cold-start) and recommend newly introduced items to existing users (item cold-start) when interaction signals are sparse or missing. Recent generative recommenders built on pre-trained language models (PLMs) are often expected to mitigate cold-start by exploiting item semantic information (e.g., titles and descriptions) and by conditioning on limited user context at test time. However, existing studies rarely treat cold-start as a primary evaluation setting, and reported gains are difficult to interpret because key design choices, such as model scale, identifier design, and training strategy, are frequently changed together. In this work, we present a systematic reproducibility study of generative recommendation under a unified suite of cold-start protocols.