Calibration in recommender systems is an important performance criterion that ensures consistency between the distribution of user preference categories and the distribution of recommendations generated by the system. Standard methods for mitigating miscalibration typically assume that user preference profiles are static, and they measure calibration relative to a user's full interaction history, including possibly outdated and stale preference categories. We conjecture that this approach can lead to recommendations that, while appearing calibrated, in fact distort users' true preferences. In this paper, we conduct a preliminary investigation of recommendation calibration at a more granular level, taking evolving user preferences into account. By analyzing training time windows of different sizes, ranging from the most recent interactions to the oldest, we identify the most relevant segment of a user's preference history that optimizes the calibration metric. We perform an exploratory analysis on datasets from different domains, each with distinctive user-interaction characteristics. We demonstrate how the evolving nature of user preferences affects recommendation calibration, and how this effect manifests differently depending on the characteristics of the data in a given domain. Datasets, code, and more detailed experimental results are available at: https://github.com/nicolelin13/DynamicCalibrationUMAP.
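To make the notion of calibration concrete, below is a minimal sketch of the KL-divergence-based miscalibration measure commonly used in this line of work (following Steck's calibrated recommendations), evaluated against a recent time window of a user's history rather than the full profile. The helper names, the smoothing parameter, and the toy data are illustrative assumptions, not the paper's released code; see the repository linked above for the actual implementation.

```python
# Sketch: KL-based miscalibration between a user's preference-category
# distribution p and a recommendation-category distribution q.
# All names and toy data below are illustrative assumptions.
import numpy as np

def category_distribution(items, item_categories, categories):
    """Distribution over preference categories induced by a list of items."""
    index = {c: i for i, c in enumerate(categories)}
    counts = np.zeros(len(categories))
    for item in items:
        # Split an item's weight evenly across its categories.
        for c in item_categories[item]:
            counts[index[c]] += 1.0 / len(item_categories[item])
    total = counts.sum()
    return counts / total if total > 0 else counts

def miscalibration(p, q, alpha=0.01):
    """KL(p || q~) with q~ = (1 - alpha) * q + alpha * p, smoothed so the
    divergence stays finite when q assigns zero mass to a category."""
    q_tilde = (1 - alpha) * q + alpha * p
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q_tilde[mask])))

# Illustrative use: measure calibration against a recent time window of the
# user's history instead of the full (possibly stale) profile.
categories = ["drama", "comedy", "action"]
item_categories = {1: ["drama"], 2: ["comedy"], 3: ["action", "comedy"]}
full_history = [1, 1, 2, 3]        # interactions ordered oldest to newest
recent_window = full_history[-2:]  # e.g., only the two most recent
recommended = [2, 3, 3]

p_recent = category_distribution(recent_window, item_categories, categories)
q_rec = category_distribution(recommended, item_categories, categories)
print(miscalibration(p_recent, q_rec))  # lower values indicate better calibration
```

Varying the size of `recent_window`, from a short recent slice up to the full history, corresponds to the differently sized training time windows analyzed in the paper.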