Recommender systems are growing rapidly in importance as the volume of content generated each day increases exponentially. This surge presents unique challenges for designing effective recommender systems, chief among them the need to make effective use of the vast amounts of natural language data and images that represent user preferences. This paper presents a novel approach to enhancing recommender systems with Large Language Models (LLMs) and deep learning techniques. The proposed framework aims to improve the accuracy and relevance of recommendations by incorporating multi-modal information processing and a unified latent space representation. The study explores the potential of LLMs to better understand and use natural language data in recommendation contexts, addressing limitations of previous methods. The framework uses LLMs to extract and integrate text and image information efficiently, unifying the modalities in a single latent space to simplify learning for the ranking model. Experimental results show that the model's discriminative power improves when it uses multi-modal information. This research contributes to the evolving field of recommender systems by showing how LLMs and multi-modal data integration can produce more personalized and contextually relevant recommendations.
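The idea of unifying text and image information in one latent space before ranking can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the encoders, the projection, the fusion operator, or the scoring function. Here the text and image embeddings are toy vectors standing in for LLM outputs, `fuse` uses linear projections followed by averaging, and `rank` uses a plain dot product in place of the paper's ranking model.

```python
import math


def project(vec, weights):
    """Apply a linear map; `weights` holds one row per latent dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(text_emb, image_emb, w_text, w_image):
    """Map both modalities into a shared latent space and average them.

    Averaging is an assumed fusion operator, not the paper's method.
    """
    t = l2_normalize(project(text_emb, w_text))
    v = l2_normalize(project(image_emb, w_image))
    return l2_normalize([(a + b) / 2 for a, b in zip(t, v)])


def rank(user_vec, item_vecs):
    """Return item ids sorted by dot-product score, highest first."""
    scores = {
        item_id: sum(u * x for u, x in zip(user_vec, vec))
        for item_id, vec in item_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical projection matrices: text dim 3 -> latent dim 2, image dim 2 -> latent dim 2.
W_TEXT = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_IMAGE = [[1.0, 0.0], [0.0, 1.0]]

# Toy items; real embeddings would come from an LLM or vision encoder.
items = {
    "A": fuse([1.0, 0.0, 0.0], [1.0, 0.0], W_TEXT, W_IMAGE),
    "B": fuse([0.0, 1.0, 0.0], [0.0, 1.0], W_TEXT, W_IMAGE),
}

print(rank([0.9, 0.1], items))  # prints ['A', 'B']
```

Because every item ends up as a single vector in the same space, the ranking step only needs to compare vectors of one kind, which is one plausible reading of how a unified latent space could simplify learning for the ranking model.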