Emerging mobility systems are increasingly capable of recommending options to their users, guiding them towards personalized yet sustainable system outcomes. Minimizing regret is even more crucial here than in a typical recommendation system, because 1) the mobility options directly affect the lives of the users, and 2) the sustainability of the system relies on sufficient user participation. In this study, we accelerate user preference learning by exploiting a low-dimensional latent space that captures the mobility preferences of users. We introduce a hierarchical contextual bandit framework named Expert with Clustering (EWC), which integrates clustering techniques with prediction with expert advice. EWC efficiently utilizes hierarchical user information and incorporates a novel Loss-guided Distance metric, which helps generate more representative cluster centroids. In a recommendation scenario with $N$ users, $T$ rounds per user, and $K$ options, our algorithm achieves a regret bound of $O(N\sqrt{T\log K} + NT)$. This bound consists of two parts: the first term is the regret from the Hedge algorithm, and the second term depends on the average loss from clustering. The algorithm achieves low regret, especially when a latent hierarchical structure exists among users. The regret bound and our experiments together support the efficacy of EWC, particularly in scenarios that demand rapid learning and adaptation. Experimentally, EWC reduces regret by 27.57% compared to the LinUCB baseline. Our work offers a data-efficient approach to capturing both individual and collective behaviors, making it well suited to contexts with hierarchical structures. We expect the algorithm to be applicable to other settings with layered user preferences and information.
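To make the two-level idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each user has a preference vector, that plain K-means (standing in for the Loss-guided Distance metric, which is not specified here) produces the cluster centroids, that each centroid acts as an expert recommending the option with the highest inner-product score, and that the loss is 0-1 against the option the user actually chose. The names `kmeans`, `HedgeOverCentroids`, and the learning rate `eta` are hypothetical.

```python
import numpy as np


def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's K-means (an assumed stand-in for the paper's clustering)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every point to every centroid.
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids


class HedgeOverCentroids:
    """Hedge (exponential weights) for one user, with centroids as experts."""

    def __init__(self, centroids, eta):
        self.centroids = np.asarray(centroids, dtype=float)
        self.eta = eta
        # Log-weights are kept for numerical stability.
        self.log_w = np.zeros(len(self.centroids))

    def probabilities(self):
        w = np.exp(self.log_w - self.log_w.max())
        return w / w.sum()

    def recommend(self, contexts, rng):
        """Sample an expert, then return the option that expert scores highest.

        contexts has shape (num_options, dim): one feature row per option.
        """
        expert = rng.choice(len(self.centroids), p=self.probabilities())
        return int(np.argmax(contexts @ self.centroids[expert]))

    def update(self, contexts, chosen):
        """Penalize every expert whose advice differs from the user's choice."""
        advice = (contexts @ self.centroids.T).argmax(axis=0)
        losses = (advice != chosen).astype(float)  # assumed 0-1 loss
        self.log_w -= self.eta * losses
```

In use, the centroids would be fitted once on preference vectors from previously observed users, and a fresh `HedgeOverCentroids` instance would be created for each new user. After a few rounds the weights concentrate on the centroid that best predicts that user's choices, which is where the faster learning comes from when users really do fall into clusters.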