In this paper, we consider a multi-stage dynamic assortment optimization problem under the multinomial logit (MNL) choice model with resource knapsack constraints. In each period, the retailer chooses an assortment given the current resource inventory levels, with the goal of maximizing the total profit from purchases. Because the exact optimal dynamic assortment policy is computationally intractable, a practical strategy is the re-solving technique, which periodically re-optimizes the deterministic linear programs (LPs) arising from a fluid approximation. However, the fractional structure of MNL makes the fluid approximation in assortment optimization highly non-linear, which poses a new technical challenge. To address it, we propose a new epoch-based re-solving algorithm that effectively moves the denominator of the objective into the constraints. Theoretically, we prove that the regret (i.e., the gap between the re-solving policy's objective and the optimal objective of the fluid approximation) scales logarithmically with the length of the time horizon and the resource capacities.
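To make the "fractional structure" of MNL concrete, the following minimal sketch computes purchase probabilities and the expected single-period profit of one assortment. It is an illustration only, not the paper's algorithm: the function names, the preference weights, and the normalization of the no-purchase weight to 1 are assumptions introduced here.

```python
from typing import Dict, Iterable


def mnl_choice_probabilities(
    assortment: Iterable[str], weights: Dict[str, float]
) -> Dict[str, float]:
    """Purchase probability of each offered product under MNL.

    Assumption: the no-purchase option has preference weight 1, so the
    denominator is 1 plus the total weight of the offered products.
    """
    offered = list(assortment)
    denominator = 1.0 + sum(weights[i] for i in offered)
    return {i: weights[i] / denominator for i in offered}


def expected_profit(
    assortment: Iterable[str],
    weights: Dict[str, float],
    profits: Dict[str, float],
) -> float:
    """Expected single-period profit: a ratio whose denominator depends on the assortment."""
    probabilities = mnl_choice_probabilities(assortment, weights)
    return sum(profits[i] * p for i, p in probabilities.items())


# Hypothetical data: two products with preference weights and unit profits.
example_weights = {"a": 1.0, "b": 2.0}
example_profits = {"a": 4.0, "b": 2.0}
example_probs = mnl_choice_probabilities(["a", "b"], example_weights)
example_value = expected_profit(["a", "b"], example_weights, example_profits)
```

Because the denominator changes with the offered set, the expected profit is a ratio rather than a linear function of the assortment decision, which is the source of the non-linearity in the fluid approximation described above.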