In recent years, substantial research has integrated multimodal item metadata into recommender systems, often by using pre-trained multimodal foundation models to encode such data. Since these models are not originally trained for recommendation tasks, recent works adapt them efficiently via parameter-efficient fine-tuning (PEFT). However, even with PEFT, item embeddings from multimodal foundation models remain user-blind: they are not conditioned on user interests, even though users with diverse interests attend to different item aspects. To address this limitation, we propose PerPEFT, a personalized PEFT strategy for multimodal recommendation. Specifically, PerPEFT groups users by interest and assigns a distinct PEFT module to each group, enabling each module to capture the fine-grained item aspects most predictive of that group's purchase decisions. We further introduce a specialized training technique that strengthens this user-group conditioning. Notably, PerPEFT is PEFT-agnostic and can be paired with any PEFT method applicable to multimodal foundation models. Through extensive experiments, we show that (1) PerPEFT outperforms the strongest baseline by up to 15.3% in NDCG@20 and (2) delivers consistent gains across diverse PEFT variants. Even with personalization, PerPEFT remains lightweight, adding only 1.3% of the foundation model's parameter count. We provide our code and datasets at https://github.com/kswoo97/PerPEFT.
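The core idea above — a shared frozen encoder with one small adapter per user-interest group — can be sketched as follows. This is a minimal, hypothetical illustration using a LoRA-style low-rank adapter on a single linear layer; the actual PerPEFT architecture, grouping procedure, and training technique are described in the paper, and all names and shapes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n_groups = 16, 2, 3  # hidden dim, adapter rank, number of user-interest groups

# Frozen weight of the (hypothetical) multimodal foundation-model layer.
W_frozen = rng.standard_normal((d, d))

# One low-rank adapter (A, B) per user group; only these are trained.
# A is zero-initialized, so every group starts from the frozen encoder's output.
adapters = [
    (np.zeros((d, r)), rng.standard_normal((r, d)) * 0.01)
    for _ in range(n_groups)
]

def encode_item(x, group_id):
    """Encode an item embedding conditioned on the user's interest group."""
    A, B = adapters[group_id]
    return x @ (W_frozen + A @ B)  # frozen path + group-specific low-rank update

x = rng.standard_normal(d)                     # an item's input embedding
outs = [encode_item(x, g) for g in range(n_groups)]
```

During training, each group's `(A, B)` pair would be updated only on that group's interactions, so the same item can be encoded differently for users with different interests. The added parameters (`2 * d * r` per group) stay tiny relative to the frozen `d * d` weight, consistent with the lightweight-overhead claim in the abstract.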