Motivation-based recommendation systems aim to uncover the underlying drivers of user behavior. Motivation modeling is crucial for understanding decision-making and content preferences, and it helps explain how recommendations are generated. Existing methods often treat motivation as latent variables inferred from interaction data, neglecting heterogeneous information such as review text. Multimodal motivation fusion raises two challenges: 1) achieving stable cross-modal alignment in the presence of noise, and 2) identifying features that reflect the same underlying motivation across modalities. To address these, we propose LLM-driven Motivation-aware Multimodal Recommendation (LMMRec), a model-agnostic framework that leverages large language models for deep semantic priors and motivation understanding. LMMRec uses chain-of-thought prompting to extract fine-grained user and item motivations from text. A dual-encoder architecture models textual and interaction-based motivations separately for cross-modal alignment, while a Motivation Coordination Strategy and an Interaction-Text Correspondence Method mitigate noise and semantic drift through contrastive learning and momentum updates. Experiments on three datasets show that LMMRec achieves a performance improvement of up to 4.98%.
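The abstract names contrastive learning and momentum updates as the mechanisms for aligning interaction-based and textual motivations, but gives no formulation. The following is a minimal sketch of one common way such an alignment could be set up, not the LMMRec implementation: it assumes a symmetric InfoNCE loss in which row i of each embedding matrix describes the same user or item, and an exponential-moving-average (EMA) update of a target encoder's parameters. The function names (`info_nce`, `momentum_update`), the temperature, and the momentum coefficient are all illustrative assumptions.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def info_nce(inter_emb, text_emb, temperature=0.2):
    """Symmetric InfoNCE loss (assumed formulation, not taken from the paper).

    Row i of inter_emb and row i of text_emb form the positive pair;
    every other row in the batch serves as a negative.
    """
    z_i = l2_normalize(inter_emb)
    z_t = l2_normalize(text_emb)
    logits = z_i @ z_t.T / temperature          # (batch, batch) similarities
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits)[idx, idx].mean()    # interaction -> text
    loss_t2i = -log_softmax(logits.T)[idx, idx].mean()  # text -> interaction
    return 0.5 * (loss_i2t + loss_t2i)


def momentum_update(target, online, m=0.99):
    """EMA update: target <- m * target + (1 - m) * online."""
    return m * target + (1.0 - m) * online


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    interaction = rng.normal(size=(8, 16))   # hypothetical interaction-side embeddings
    text = interaction + 0.1 * rng.normal(size=(8, 16))  # noisy "textual" view
    print("aligned loss:  ", info_nce(interaction, text))
    print("unaligned loss:", info_nce(interaction, rng.normal(size=(8, 16))))

    target_params = np.zeros(4)              # hypothetical target-encoder weights
    online_params = np.ones(4)               # hypothetical online-encoder weights
    print("after EMA step:", momentum_update(target_params, online_params))
```

In this sketch the slowly moving target parameters provide a more stable alignment reference than the online encoder alone, which is the usual rationale for momentum updates under noisy supervision; how LMMRec actually combines the two is specified in the paper's method section, not here.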