The effectiveness of Recommender Systems (RS) is closely tied to the quality and distinctiveness of user profiles, yet despite many advances in raw performance, the sensitivity of RS to user profile quality remains under-researched. This paper introduces two novel information-theoretic measures for understanding recommender systems: a "surprise" measure that quantifies how far a user's choices deviate from popular ones, and a "conditional surprise" measure that captures the coherence of a user's interactions. We evaluate 7 recommendation algorithms across 9 datasets and relate our measures to standard performance metrics. Using a rigorous statistical framework, we quantify how much user profile density and the information measures affect algorithm performance across domains. By segmenting users based on these measures, we achieve improved performance with reduced data and show that simpler algorithms can match complex ones for low-coherence users. We also use the measures to analyze how well different recommendation algorithms preserve the coherence and diversity of user preferences in their predictions, which provides insight into algorithm behavior. This work advances the theoretical understanding of user behavior and offers practical heuristics for personalized recommendation systems, promoting more efficient and adaptive architectures.
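The abstract does not state how the two measures are computed, so the following is only an illustrative sketch under assumed definitions, not the authors' formulation. It assumes "surprise" is the mean self-information, -log2 p(i), of the items in a user's profile, with p(i) estimated from item popularity. It also assumes "conditional surprise" is the mean of -log2 p(i | j) over ordered item pairs in the profile, with p(i | j) estimated from co-occurrence across users. All function names and the toy `profiles` data are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def item_probabilities(profiles):
    """Estimate p(i) as item i's share of all user-item interactions (assumed definition)."""
    counts = Counter(item for items in profiles.values() for item in set(items))
    total = sum(counts.values())
    return {item: c / total for item, c in counts.items()}


def surprise(items, probs):
    """Mean self-information of a profile: higher means less popular choices (assumed definition)."""
    items = set(items)
    return sum(-math.log2(probs[i]) for i in items) / len(items)


def conditional_surprise(items, profiles):
    """Mean of -log2 p(i | j) over ordered item pairs in a profile (assumed definition).

    p(i | j) is estimated as the fraction of users holding j who also hold i.
    Lower values suggest a more coherent profile. Profiles with fewer than
    two distinct items have no pairs and return 0.0.
    """
    users_with = defaultdict(set)
    for user, user_items in profiles.items():
        for item in user_items:
            users_with[item].add(user)

    pairs = list(permutations(set(items), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        p_i_given_j = len(users_with[i] & users_with[j]) / len(users_with[j])
        total += -math.log2(p_i_given_j)
    return total / len(pairs)


# Hypothetical toy data: two users share a popular pair, one holds a rare item.
profiles = {
    "u1": ["a", "b"],
    "u2": ["a", "b"],
    "u3": ["a", "c"],
}
probs = item_probabilities(profiles)
scores = {
    user: (surprise(items, probs), conditional_surprise(items, profiles))
    for user, items in profiles.items()
}
```

In this toy data, `u3` scores higher than `u1` on both measures because item `c` is rare and seldom co-occurs with `a`. Under these assumed definitions, such scores could be used to segment users, in the spirit of the segmentation the abstract describes.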