Collaborative filtering (CF) recommender systems struggle to make predictions for unseen, or 'cold', items. Systems designed to address this challenge are often trained with supervision from warm CF models in order to leverage the collaborative and content information in the available interaction data. However, because they learn to replicate the behavior of CF methods, cold-start models may also imitate their predictive biases. In this paper, we show that cold-start systems can inherit popularity bias, a common cause of recommender system unfairness that arises when CF models overfit to popular items, maximizing user-oriented accuracy while neglecting rarer items. We demonstrate that cold-start recommenders not only mirror the popularity biases of warm models, but are in fact affected more severely: because they cannot infer popularity from interaction data, they instead attempt to estimate it from content features alone. This leads to significant over-prediction of cold items whose content resembles that of popular warm items, even when their ground-truth popularity is very low. Through experiments on three multimedia datasets, we analyze the impact of this behavior on three generative cold-start methods. We then describe a simple post-processing bias mitigation method that, by using embedding magnitude as a proxy for predicted popularity, can produce more balanced recommendations with limited harm to user-oriented cold-start accuracy.
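As a rough illustration of the post-processing idea mentioned above, the sketch below re-ranks a user's cold-item candidates by subtracting a penalty proportional to each item's embedding norm, treating that norm as the model's implicit popularity estimate. This is a minimal sketch under stated assumptions, not the paper's exact procedure: the dot-product scoring, the `gamma` penalty weight, and the `rerank_with_norm_penalty` helper are all illustrative choices.

```python
import numpy as np

def rerank_with_norm_penalty(user_emb, item_embs, gamma=0.5, k=10):
    """Re-rank cold items by penalizing large embedding magnitudes,
    using the norm as a proxy for the model's predicted popularity."""
    # Raw relevance scores; here assumed to be user-item dot products.
    scores = item_embs @ user_emb
    # Embedding magnitude as a popularity proxy (assumption from the abstract).
    norms = np.linalg.norm(item_embs, axis=1)
    # Subtract a penalty so content-popular items are pushed down the ranking;
    # gamma controls the strength of the debiasing.
    adjusted = scores - gamma * norms
    return np.argsort(-adjusted)[:k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    user_emb = rng.normal(size=32)
    item_embs = rng.normal(size=(100, 32))
    print(rerank_with_norm_penalty(user_emb, item_embs))
```

In such a scheme, gamma trades off user-oriented accuracy against popularity balance: gamma = 0 recovers the original ranking, while larger values increasingly favor items with smaller embedding magnitudes.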