DRDT: Dynamic Reflection with Divergent Thinking for LLM-based Sequential Recommendation

The rise of Large Language Models (LLMs) has sparked interest in applying them to sequential recommendation, since they can provide supportive item information. However, sequential recommendation has inherent complexities, such as sequential patterns across datasets, noise within sequences, and the temporal evolution of user preferences, so existing LLM reasoning strategies, such as in-context learning and chain-of-thought, are not fully effective. To address these challenges, we introduce a novel reasoning principle: Dynamic Reflection with Divergent Thinking (DRDT) within a retriever-reranker framework. Our approach starts with a collaborative in-context demonstration retriever, which collects sequences exhibiting collaborative behaviors as in-context examples. We then abstract high-level user preferences across multiple aspects, which provides a more nuanced understanding of user interests and circumvents the noise within the raw sequences. The cornerstone of our methodology is dynamic reflection, a process that emulates human learning through probing, critiquing, and reflecting, and that uses user feedback to tailor the analysis to the target user in a temporal manner. We evaluate our approach on three datasets using six pre-trained LLMs. The superior performance observed across these models demonstrates the efficacy of our reasoning strategy, which is achieved without fine-tuning the LLMs. With our principle, 7b models (e.g., Vicuna-7b and Openchat-7b) outperform GPT-Turbo-3.5 on NDCG@10 on three datasets. This research highlights the potential of LLMs for enhancing sequential recommendation systems and underscores the importance of developing tailored reasoning strategies to fully harness their capabilities.
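The abstract reports results in NDCG@10 without defining it. As a reading aid, the sketch below shows one common binary-relevance form of the metric for a single user. It illustrates the standard definition and is not taken from the paper's evaluation code; the function name `ndcg_at_k`, its parameters, and the toy item IDs are assumptions made here for illustration.

```python
import math
from typing import Sequence, Set


def ndcg_at_k(ranked_items: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Binary-relevance NDCG@k for one user (hypothetical helper, not from the paper).

    ranked_items: recommended item IDs, best first.
    relevant: ground-truth item IDs (often just the single next item).
    """
    # DCG: each hit at 0-based position `pos` contributes 1 / log2(pos + 2).
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items (at most k) placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    # With no relevant items the metric is defined here as 0.0.
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: the true next item "a" is ranked second.
score = ndcg_at_k(["b", "a", "c"], {"a"}, k=10)  # 1 / log2(3), about 0.631
```

When there is a single ground-truth next item, as is typical in next-item prediction, this reduces to 1 / log2(rank + 1) if the item appears at 1-based position `rank` within the top 10, and 0 otherwise. A dataset-level score would normally be the mean over users.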
