As e-commerce expands, delivering real-time personalized recommendations from vast catalogs poses a critical challenge for retail platforms. Maximizing revenue requires accounting for both individual customer characteristics and available item features when optimizing assortments over time. In this paper, we consider the dynamic assortment problem with dual contexts: user features and item features. In high-dimensional scenarios, the quadratic growth of the parameter dimension complicates both computation and estimation. To tackle this challenge, we introduce a new low-rank dynamic assortment model that reduces the problem to a manageable scale. We then propose an efficient algorithm that estimates the intrinsic subspaces and uses the upper confidence bound approach to address the exploration-exploitation trade-off in online decision making. Theoretically, we establish a regret bound of $\tilde{O}((d_1+d_2)r\sqrt{T})$, where $d_1$ and $d_2$ are the dimensions of the user and item features, respectively, $r$ is the rank of the parameter matrix, and $T$ is the time horizon. By exploiting the low-rank structure, this bound substantially improves on those in the prior literature. Extensive simulations and an application to the Expedia hotel recommendation dataset further demonstrate the advantages of our proposed method.
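To make the dimension reduction concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a bilinear multinomial logit (MNL) utility $x^\top \Theta y$ with $\Theta = UV^\top$ of rank $r$; the abstract does not spell out this form. The names `low_rank_param_count` and `mnl_choice_probs`, and the sizes used, are hypothetical and chosen only for illustration. The sketch compares the $d_1 d_2$ entries of a full parameter matrix with the $(d_1+d_2)r$ entries of its factors, and computes choice probabilities over a small assortment.

```python
import numpy as np

def low_rank_param_count(d1, d2, r):
    """Number of entries in the factors U (d1 x r) and V (d2 x r)."""
    return (d1 + d2) * r

def mnl_choice_probs(user, items, U, V):
    """Choice probabilities under an ASSUMED bilinear MNL model.

    The utility of item j is user^T (U V^T) items[j]. The returned vector is
    ordered as [no-purchase, item_1, ..., item_K].
    """
    user_latent = U.T @ user               # project user features, shape (r,)
    item_latent = items @ V                # project item features, shape (K, r)
    utilities = item_latent @ user_latent  # one utility per item, shape (K,)
    # The no-purchase option gets utility 0; subtract the max for stability.
    logits = np.concatenate(([0.0], utilities))
    logits -= logits.max()
    weights = np.exp(logits)
    return weights / weights.sum()

# Hypothetical sizes, for illustration only.
rng = np.random.default_rng(0)
d1, d2, r, K = 50, 40, 3, 5
U = rng.normal(size=(d1, r)) / np.sqrt(d1)
V = rng.normal(size=(d2, r)) / np.sqrt(d2)
user = rng.normal(size=d1)
items = rng.normal(size=(K, d2))

probs = mnl_choice_probs(user, items, U, V)
full_count = d1 * d2                               # 2000 entries in Theta
low_rank_count = low_rank_param_count(d1, d2, r)   # 270 entries in U and V
print(full_count, low_rank_count)
```

Under these assumed sizes the factorization has 270 entries instead of 2000, which mirrors the replacement of a $d_1 d_2$ dependence by $(d_1+d_2)r$ in the stated regret bound. Working in the $r$-dimensional latent space also avoids forming the full $d_1 \times d_2$ matrix when scoring items.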