Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations

User interest exploration is an important and challenging topic in recommender systems because it alleviates the closed-loop effect between recommendation models and user-item interactions. Contextual bandit (CB) algorithms strive to make a good trade-off between exploration and exploitation so that users' potential interests have a chance to be exposed. However, classical CB algorithms can only be applied to a small, sampled item set (usually hundreds of items), which limits typical applications in recommender systems to candidate post-ranking, homepage top item ranking, ad creative selection, or online model selection (A/B testing). In this paper, we introduce two simple but effective hierarchical CB algorithms that make a classical CB model (such as LinUCB or Thompson Sampling) capable of exploring users' interests in the entire item space rather than in a small item set. We first construct a hierarchical item tree via a bottom-up clustering algorithm to organize items in a coarse-to-fine manner. We then propose a hierarchical CB (HCB) algorithm to explore users' interests in this tree. HCB treats the exploration problem as a series of decision-making processes, where the goal is to find a path from the root to a leaf node, and the feedback is back-propagated to all the nodes on the path. We further propose a progressive hierarchical CB (pHCB) algorithm, which progressively extends visible nodes that reach a confidence level for exploration, to avoid misleading actions on upper-level nodes in the sequential decision-making process. Extensive experiments on two public recommendation datasets demonstrate the effectiveness and flexibility of our methods.
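
The root-to-leaf selection and feedback back-propagation described for HCB can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a LinUCB-style score, a single context vector `x`, and separate statistics stored on every node. The names `Node`, `select_path`, `backpropagate`, and the exploration weight `alpha` are hypothetical and chosen only for this illustration.

```python
import numpy as np


class Node:
    """Tree node with its own LinUCB statistics (hypothetical layout)."""

    def __init__(self, name, dim, children=None):
        self.name = name
        self.children = children or []
        self.A = np.eye(dim)    # ridge-regularised design matrix
        self.b = np.zeros(dim)  # sum of reward-weighted contexts

    def ucb(self, x, alpha):
        """Estimated reward plus an exploration bonus for context x."""
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        return float(theta @ x + alpha * np.sqrt(x @ A_inv @ x))

    def update(self, x, reward):
        """Fold one observed (context, reward) pair into the statistics."""
        self.A += np.outer(x, x)
        self.b += reward * x


def select_path(root, x, alpha=1.0):
    """Walk from the root to a leaf, taking the highest-UCB child at each level."""
    path = [root]
    node = root
    while node.children:
        node = max(node.children, key=lambda child: child.ucb(x, alpha))
        path.append(node)
    return path


def backpropagate(path, x, reward):
    """Apply the observed feedback to every node on the chosen path."""
    for node in path:
        node.update(x, reward)
```

In use, one round would call `select_path` to pick a leaf (a fine-grained item cluster), recommend an item from it, observe the reward, and pass the same path to `backpropagate`. Because upper-level nodes share in every update, a poor early reward on a coarse cluster can steer later rounds away from its whole subtree, which is the kind of misleading upper-level action that pHCB is described as addressing.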
