Recommendation systems have become integral to modern user experiences, yet their decision-making processes lack transparency. Existing explainable recommendation methods rely on a post-hoc paradigm in which explanation generators are trained independently of the underlying recommender models; this requires substantial human effort for data construction and raises concerns about explanation reliability. In this paper, we present ExpCTR, a novel framework that integrates large language model (LLM)-based explanation generation directly into the click-through rate (CTR) prediction process. Inspired by recent advances in reinforcement learning, we employ two carefully designed reward mechanisms: LC alignment, which ensures that explanations reflect user intentions, and IC alignment, which maintains consistency with traditional ID-based CTR models. Our approach combines an efficient LoRA-based training paradigm with a three-stage iterative process. ExpCTR circumvents the need for extensive explanation datasets while fostering synergy between CTR prediction and explanation generation. Experimental results on three real-world datasets demonstrate that ExpCTR significantly enhances both recommendation accuracy and interpretability.
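The dual-reward design described above can be illustrated with a minimal sketch. All names below (`combined_reward`, `lc_score`, `ic_score`, `alpha`) are hypothetical placeholders for exposition, not identifiers from the paper; the actual reward formulation in ExpCTR may differ.

```python
def combined_reward(lc_score: float, ic_score: float, alpha: float = 0.5) -> float:
    """Blend the two reward signals used to steer the LLM explainer.

    lc_score: LC-alignment reward (explanation reflects user intentions).
    ic_score: IC-alignment reward (consistency with the ID-based CTR model).
    alpha:    hypothetical mixing weight between the two signals.
    """
    return alpha * lc_score + (1.0 - alpha) * ic_score


# Example: an explanation that satisfies both criteria earns the full reward.
print(combined_reward(1.0, 1.0))  # 1.0
```

In a reinforcement-learning fine-tuning loop, a scalar reward of this form would be computed per generated explanation and used to update the LoRA adapter weights.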