Explanations can improve the transparency, persuasiveness, engagement, and trust that users experience with Recommender Systems (RSs) by connecting previously interacted items to recommended items through shared attributes. However, evaluating the effectiveness of explanation algorithms with respect to these goals in an offline setting remains challenging because of their subjectivity. This paper investigates the impact of user-level explanation properties, such as the diversity and popularity of attributes, on users' perception of explanation goals. In an offline setting, we used metrics adapted from ranking evaluation to characterize the explanations generated by three state-of-the-art post-hoc explanation algorithms, based on the items and properties used to form the explanation sentence, across six recommender systems. We then compared the offline metric results with those of an online user study. The findings highlight a trade-off between the goals of transparency and trust, which are associated with popular properties, and the goals of engagement and persuasiveness, which are associated with diversifying the properties displayed to users. The study thus contributes to the development of more robust evaluation methods for explanation algorithms in RSs.
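To make the notion of user-level explanation properties concrete, the sketch below computes two illustrative metrics, diversity and popularity, over the attributes shown in each user's explanation. The function name, inputs, and exact metric definitions are assumptions for illustration; they are not the paper's actual formulations.

```python
from collections import Counter

def explanation_metrics(explanations, attribute_counts):
    """Hypothetical user-level explanation metrics (illustrative only).

    explanations: list of attribute lists, one per user's explanation sentence.
    attribute_counts: Counter mapping each attribute to how often it occurs
        in the catalogue, used here as a simple proxy for popularity.
    """
    total = sum(attribute_counts.values())
    results = []
    for attrs in explanations:
        if not attrs:
            results.append({"diversity": 0.0, "popularity": 0.0})
            continue
        # Diversity: fraction of distinct attributes among those displayed.
        diversity = len(set(attrs)) / len(attrs)
        # Popularity: mean catalogue frequency of the displayed attributes.
        popularity = sum(attribute_counts[a] / total for a in attrs) / len(attrs)
        results.append({"diversity": diversity, "popularity": popularity})
    return results
```

Under these assumed definitions, an explanation that repeats a frequent attribute (e.g. a popular genre) scores high on popularity and low on diversity, mirroring the trade-off the study reports between transparency/trust and engagement/persuasiveness.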