LLM-based explainable recommenders can produce fluent, factually correct explanations that nonetheless justify items using attributes that conflict with a user's historical preferences. Such preference-inconsistent explanations yield logically valid but unconvincing reasoning and are largely missed by standard hallucination or faithfulness metrics. We formalize this failure mode and propose PURE, a preference-aware reasoning framework that follows a select-then-generate paradigm. Rather than only improving generation, PURE intervenes in evidence selection: it selects a compact set of multi-hop, item-centric reasoning paths that are both factually grounded and aligned with the user's preference structure. Selection is guided by user intent, specificity, and diversity, which suppresses generic, weakly personalized evidence. The selected evidence is then injected into LLM generation via structure-aware prompting that preserves relational constraints. To measure preference inconsistency, we introduce a feature-level, user-centric evaluation metric that reveals misalignment overlooked by factuality-based measures. Experiments on three real-world datasets show that PURE consistently reduces preference-inconsistent explanations and factual hallucinations while maintaining competitive recommendation accuracy, explanation quality, and inference efficiency. These results highlight that trustworthy explanations require not only factual correctness but also justifications aligned with user preferences.
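To make the select-then-generate idea concrete, the following is a minimal sketch of the selection step and of a feature-level inconsistency score. It is not PURE's actual method: the abstract does not give the scoring functions, so the `ReasoningPath` type, the weights, the hard filter on disliked features, the popularity-based specificity term, and the greedy redundancy penalty (in the style of maximal marginal relevance) are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReasoningPath:
    """Hypothetical multi-hop path ending at the recommended item."""
    hops: tuple          # e.g. ("user", "watched", "Alien", "director", "ridley-scott")
    features: frozenset  # item attributes an explanation built on this path would cite


def jaccard(a, b):
    """Set overlap in [0, 1]; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(path, liked, popularity, chosen, w_intent=1.0, w_spec=0.5, w_div=0.5):
    """Assumed scoring: intent match + specificity - redundancy with chosen paths."""
    if not path.features:
        return 0.0
    n = len(path.features)
    intent = len(path.features & liked) / n
    # Assumption: rarer features (low popularity) count as more specific.
    specificity = 1.0 - sum(popularity.get(f, 0.0) for f in path.features) / n
    redundancy = max((jaccard(path.features, c.features) for c in chosen), default=0.0)
    return w_intent * intent + w_spec * specificity - w_div * redundancy


def select_paths(candidates, liked, disliked, popularity, k=2):
    """Greedily pick up to k paths, skipping any that cite a disliked feature."""
    pool = [p for p in candidates if not (p.features & disliked)]
    chosen = []
    while pool and len(chosen) < k:
        best = max(pool, key=lambda p: score(p, liked, popularity, chosen))
        chosen.append(best)
        pool.remove(best)
    return chosen


def inconsistency_rate(explanation_features, disliked):
    """Share of cited features that conflict with the user's known dislikes."""
    if not explanation_features:
        return 0.0
    cited = set(explanation_features)
    return len(cited & set(disliked)) / len(cited)


# Toy data, invented for illustration only.
liked = {"sci-fi", "ridley-scott"}
disliked = {"horror"}
popularity = {"sci-fi": 0.8, "ridley-scott": 0.1, "horror": 0.5, "1980s": 0.4}

p_genre = ReasoningPath(("user", "liked", "sci-fi"), frozenset({"sci-fi"}))
p_director = ReasoningPath(("user", "liked", "ridley-scott"), frozenset({"ridley-scott"}))
p_horror = ReasoningPath(("item", "genre", "horror"), frozenset({"horror", "sci-fi"}))
p_era = ReasoningPath(("item", "era", "1980s"), frozenset({"sci-fi", "1980s"}))

picked = select_paths([p_genre, p_director, p_horror, p_era], liked, disliked, popularity)
print([sorted(p.features) for p in picked])
print(inconsistency_rate(["horror", "sci-fi"], disliked))
```

In this toy setup the horror path is factually valid but is excluded before scoring, which is the kind of evidence a factuality-only check would let through. The selected paths would then be serialized into the prompt for the generation step, which the sketch does not cover.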