Explainable recommendation systems (RSs) are designed to expose the rationale behind each recommendation, thereby improving the transparency and credibility of RSs. Previous methods often jointly predict ratings and generate explanations, but overlook the incoherence between these two objectives. To address this issue, we propose Curr-RLCER, a reinforcement learning framework for explanation-coherent recommendation with dynamic rating alignment. It employs curriculum learning, progressing from basic prediction tasks (i.e., click-through rating (CTR) and selection-based rating) to open-ended recommendation explanation generation. The rewards at each stage are designed to progressively enhance the stability of RSs. Furthermore, a coherence-driven reward mechanism is proposed to enforce coherence between generated explanations and predicted ratings, supported by a specifically designed evaluation scheme. Extensive experiments on three explainable recommendation datasets indicate that the proposed framework is effective. Code and datasets are available at https://github.com/pxcstart/Curr-RLCER.
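To make the idea of a staged, coherence-aware reward concrete, the following is a minimal sketch and not the authors' implementation. It assumes a 1 to 5 rating scale for every stage, hypothetical stage names (`ctr`, `selection`, `explanation`), hypothetical stage weights, and an `implied` rating that some external scorer (not shown) has extracted from the generated explanation. None of these details are given in the abstract.

```python
# Illustrative sketch only: all names, weights, and the rating scale are assumptions.
RATING_MIN, RATING_MAX = 1.0, 5.0  # assumed 1-5 rating scale


def rating_reward(predicted: float, target: float) -> float:
    """Reward in [0, 1]; equals 1 when the predicted rating matches the target."""
    span = RATING_MAX - RATING_MIN
    return 1.0 - min(abs(predicted - target), span) / span


def coherence_reward(predicted: float, implied: float) -> float:
    """Reward in [0, 1] for agreement between the predicted rating and the
    rating implied by the generated explanation (from a hypothetical scorer)."""
    return rating_reward(predicted, implied)


# Hypothetical curriculum: (rating weight, coherence weight) per stage.
# Early stages reward rating accuracy only; the last stage adds coherence.
STAGE_WEIGHTS = {
    "ctr": (1.0, 0.0),
    "selection": (1.0, 0.0),
    "explanation": (0.5, 0.5),
}


def stage_reward(stage, predicted, target, implied=None):
    """Combine the rating and coherence rewards using the stage's weights."""
    w_rating, w_coherence = STAGE_WEIGHTS[stage]
    reward = w_rating * rating_reward(predicted, target)
    if w_coherence > 0.0:
        if implied is None:
            raise ValueError("this stage needs the rating implied by the explanation")
        reward += w_coherence * coherence_reward(predicted, implied)
    return reward
```

Under these assumptions, `stage_reward("explanation", 4.0, 4.0, implied=2.0)` is lower than the same call with `implied=4.0`, which is the behavior a coherence-driven reward is meant to encourage: an accurate rating paired with an explanation that suggests a different rating is penalized.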