Text-based collaborative filtering (TCF) has become the mainstream approach for text and news recommendation, using text encoders, also known as language models (LMs), to represent items. However, existing TCF models primarily use small or medium-sized LMs. It remains unclear how recommendation performance would change if the item encoder were replaced with one of the largest and most powerful LMs, such as the 175-billion-parameter GPT-3 model. Can we expect unprecedented results? To answer this question, we conduct an extensive series of experiments exploring the performance limits of the TCF paradigm. Specifically, we increase the size of item encoders from one hundred million to one hundred billion parameters to reveal the scaling limits of the TCF paradigm. We then examine whether these extremely large LMs could enable a universal item representation for the recommendation task. Furthermore, we compare the TCF paradigm using the most powerful LMs against the currently dominant ID embedding-based paradigm and investigate the transferability of this TCF paradigm. Finally, we compare TCF with the recently popularized prompt-based recommendation using ChatGPT. Our findings include not only positive results but also some surprising and previously unknown negative ones, which can inspire deeper reflection and innovative thinking about text-based recommender systems. Code and datasets will be released for further research.