Numerous toolkits have been developed in Recommender Systems (RecSys) research to promote fair evaluation and reproducibility. However, recent critical examinations of RecSys evaluation protocols have raised concerns about the validity of existing evaluation pipelines. In this demonstration, we present RecNextEval, a reference implementation of an evaluation framework designed specifically for next-batch recommendation. RecNextEval uses a time-window data split so that models are evaluated along a global timeline, minimizing data leakage. Our implementation highlights the inherent complexities of RecSys evaluation and encourages a shift toward model development that more accurately simulates production environments. The RecNextEval library and its accompanying GUI are open source and publicly accessible.
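To make the idea of a time-window split along a global timeline concrete, the following is a minimal sketch in plain Python. It is not the RecNextEval API: the `Interaction` record, the `time_window_split` function, and its `start`, `window`, and `n_windows` parameters are hypothetical names chosen for illustration. The sketch assumes integer timestamps and that, for each evaluation window, a model may be trained on every interaction that occurred strictly before the window opens.

```python
from typing import List, NamedTuple, Tuple


class Interaction(NamedTuple):
    """One user-item event on the global timeline (hypothetical record type)."""
    user: str
    item: str
    timestamp: int


Split = Tuple[List[Interaction], List[Interaction]]


def time_window_split(
    interactions: List[Interaction],
    start: int,
    window: int,
    n_windows: int,
) -> List[Split]:
    """Cut the log into consecutive windows of equal length.

    For window k covering [lo, hi), the training set holds every
    interaction with timestamp < lo and the test set holds those with
    lo <= timestamp < hi. No test-window or later event can therefore
    appear in the training data for that window.
    """
    ordered = sorted(interactions, key=lambda x: x.timestamp)
    splits: List[Split] = []
    for k in range(n_windows):
        lo = start + k * window
        hi = lo + window
        train = [x for x in ordered if x.timestamp < lo]
        test = [x for x in ordered if lo <= x.timestamp < hi]
        splits.append((train, test))
    return splits


# Toy interaction log; all values are made up for illustration.
log = [
    Interaction("u1", "a", 1),
    Interaction("u2", "b", 3),
    Interaction("u1", "c", 5),
    Interaction("u2", "a", 6),
    Interaction("u1", "b", 9),
]

# Two windows of length 3 starting at t = 4: [4, 7) and [7, 10).
splits = time_window_split(log, start=4, window=3, n_windows=2)
```

Because the cut points are shared by all users, the training data for each window grows as the timeline advances, which mirrors a model being periodically retrained and then asked to recommend for the next batch of traffic. A per-user leave-last-out split, by contrast, can place one user's training interactions later in time than another user's test interactions.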