In recommender systems, collecting, storing, and processing large-scale interaction data is increasingly costly in terms of time, energy, and computation, yet it remains unclear when additional data stops providing meaningful gains. This paper investigates how offline recommendation performance evolves as the size of the training dataset increases and whether a saturation point can be observed. We implemented a reproducible Python evaluation workflow with two established toolkits, LensKit and RecBole, included 11 large public datasets with at least 7 million interactions, and evaluated 10 tool-algorithm combinations. Using absolute stratified user sampling, we trained models on nine sample sizes from 100,000 to 100,000,000 interactions and measured NDCG@10. Overall, raw NDCG usually increased with sample size, with no observable saturation point. To make result groups comparable, we applied min-max normalization within each group, revealing a clear positive trend in which around 75% of the points at the largest completed sample size also achieved the group's best observed performance. A late-stage slope analysis over the final 10-30% of each group further supported this upward trend: the interquartile range remained entirely non-negative with a median near 1.0. In summary, for traditional recommender systems on typical user-item interaction data, incorporating more training data remains primarily beneficial, while weaker scaling behavior is concentrated in atypical dataset cases and in the algorithmic outlier RecBole BPR under our setup.
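The two analysis steps named above, per-group min-max normalization and a late-stage slope, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the use of a log10 sample-size axis rescaled to [0, 1], the reading of the "final 10-30%" as a tail of that rescaled axis, and the fallback to the last two points are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale one result group's values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Flat group: there is no spread to rescale. Returning zeros is an
        # assumed convention, not something the abstract specifies.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def least_squares_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Ordinary least-squares slope of ys against xs (needs 2+ distinct xs)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def late_stage_slope(
    sizes: Sequence[float],
    scores: Sequence[float],
    tail_fraction: float = 0.3,
) -> float:
    """Slope of normalized score over the last part of the normalized size axis.

    Assumptions (hypothetical, for illustration only):
    - sizes are sorted in ascending order and contain 2+ distinct values,
    - the x axis is log10(sample size), min-max normalized to [0, 1],
    - the "late stage" is every point with normalized x >= 1 - tail_fraction.
    """
    x = min_max_normalize([math.log10(s) for s in sizes])
    y = min_max_normalize(scores)
    tail = [(xi, yi) for xi, yi in zip(x, y) if xi >= 1.0 - tail_fraction]
    if len(tail) < 2:
        # Too few points in the tail to fit a line: fall back to the last two.
        tail = list(zip(x, y))[-2:]
    tail_x, tail_y = zip(*tail)
    return least_squares_slope(tail_x, tail_y)
```

With both axes rescaled to [0, 1], a group whose score keeps rising steadily up to the largest sample size yields a late-stage slope close to 1, while a group that has plateaued yields a slope close to 0. Under these assumptions, that is one way to read the reported median near 1.0.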