Key challenges in running a retail business include selecting which products to present to consumers (the assortment problem) and pricing those products (the pricing problem) so as to maximize revenue or profit. Instead of considering these problems in isolation, we propose a joint approach to assortment-pricing based on contextual bandits. Our model is doubly high-dimensional, in that both context vectors and actions are allowed to take values in high-dimensional spaces. To circumvent the curse of dimensionality, we propose a simple yet flexible model that captures the interactions between covariates and actions via a (near) low-rank representation matrix. The resulting class of models is reasonably expressive while remaining interpretable through latent factors, and it includes various structured linear bandit and pricing models as particular cases. We propose a computationally tractable procedure that combines an exploration/exploitation protocol with an efficient low-rank matrix estimator, and we prove bounds on its regret. Simulation results show that this method attains lower regret than state-of-the-art methods applied to various standard bandit and pricing models. Real-world case studies on the assortment-pricing problem, ranging from an industry-leading instant noodles company to an emerging beauty start-up, underscore the gains achievable with our method. In each case, our bandit method yields at least three-fold gains in revenue or profit, and the learned latent factor models are interpretable.
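To make the low-rank idea concrete, the following is a minimal sketch, not the procedure analyzed in the paper. It assumes a bilinear reward of the form r = xᵀΘa + noise, where x is a context vector, a is an action vector (for example, encoding assortment and prices), and Θ is a low-rank matrix. The exploration phase draws random Gaussian actions, the estimator is ordinary least squares followed by a truncated SVD (a stand-in for whatever low-rank estimator one prefers), and the exploitation step picks the best action from a finite candidate set. All function names, dimensions, and noise levels are hypothetical choices made for illustration.

```python
import numpy as np


def truncate_rank(M, rank):
    """Best rank-`rank` approximation of M in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


def estimate_theta(X, A, y, rank):
    """Least-squares fit of y_i ~ x_i^T Theta a_i, then projection to low rank.

    X: (n, dx) contexts, A: (n, da) actions, y: (n,) observed rewards.
    """
    n, dx = X.shape
    da = A.shape[1]
    # Row i holds the flattened outer product x_i a_i^T (row-major),
    # so the regression coefficients reshape directly into Theta.
    feats = np.einsum("ij,ik->ijk", X, A).reshape(n, dx * da)
    coef, *_ = np.linalg.lstsq(feats, y, rcond=None)
    return truncate_rank(coef.reshape(dx, da), rank)


def greedy_action(theta_hat, x, candidates):
    """Index of the candidate action maximizing the estimated reward x^T Theta a."""
    scores = candidates @ (theta_hat.T @ x)
    return int(np.argmax(scores))


def run_demo(seed=0, dx=8, da=6, rank=2, n_explore=400, noise=0.1):
    """Simulate an exploration phase and return (theta, theta_hat, relative error)."""
    rng = np.random.default_rng(seed)
    # Hypothetical ground truth: an exactly rank-`rank` interaction matrix.
    theta = rng.normal(size=(dx, rank)) @ rng.normal(size=(rank, da))
    X = rng.normal(size=(n_explore, dx))   # observed contexts
    A = rng.normal(size=(n_explore, da))   # random exploratory actions
    y = np.einsum("ij,jk,ik->i", X, theta, A) + noise * rng.normal(size=n_explore)
    theta_hat = estimate_theta(X, A, y, rank)
    rel_err = np.linalg.norm(theta_hat - theta) / np.linalg.norm(theta)
    return theta, theta_hat, rel_err


if __name__ == "__main__":
    theta, theta_hat, rel_err = run_demo()
    print(f"relative estimation error: {rel_err:.4f}")
    x_new = np.ones(theta.shape[0])
    candidates = np.eye(theta.shape[1])  # toy action set: one-hot actions
    print("greedy action index:", greedy_action(theta_hat, x_new, candidates))
```

This sketch separates exploration from exploitation for simplicity. The full least-squares step ignores the low-rank structure until the final projection, so it only works when the number of exploration rounds exceeds the product of the two dimensions; a genuinely high-dimensional setting would need an estimator that exploits the low rank directly.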