The Network Revenue Management (NRM) problem is a well-known challenge in dynamic decision-making under uncertainty. In this problem, fixed resources must be allocated to serve customers over a finite horizon, while customers arrive according to a stochastic process. The typical NRM model assumes that customer arrivals are independent over time. In this paper, we explore a more general setting in which customer arrivals in different periods can be correlated. We propose a model that assumes the existence of a system state, which determines customer arrivals for the current period. This system state evolves over time according to a time-inhomogeneous Markov chain. We show that our model can be used to represent correlation in various settings. To solve the NRM problem under our correlated model, we derive a new linear programming (LP) approximation of the optimal policy. Our approximation provides an upper bound on the total expected value collected by the optimal policy. We use our LP to develop a new bid price policy, which computes bid prices for each system state and time period by backward induction. The decision is then made by comparing the reward of the customer against the associated bid prices. Our policy is guaranteed to collect at least a $1/(1+L)$ fraction of the total reward collected by the optimal policy, where $L$ denotes the maximum number of resources required by a customer. In summary, our work presents a Markovian model for correlated customer arrivals in the NRM problem and provides a new LP approximation for solving the problem under this model. We derive a new bid price policy and provide a theoretical guarantee on its performance.
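To make the modeling idea concrete, the following is a minimal sketch of a state-driven arrival process and a bid-price acceptance check. It is an illustration under assumptions, not the paper's algorithm: the function names, the data layout (`transitions`, `arrival_probs`, `bid_prices`), and the rule that each required resource consumes one unit of capacity are all hypothetical, and the bid prices are taken as given inputs rather than computed from the LP by backward induction.

```python
import random


def simulate_states(initial_state, transitions, rng):
    """Sample a state path from a time-inhomogeneous Markov chain.

    transitions[t][s] is the (assumed) distribution over next states
    when the system is in state s at period t.
    """
    states = [initial_state]
    for period_matrix in transitions:
        row = period_matrix[states[-1]]
        next_state = rng.choices(range(len(row)), weights=row)[0]
        states.append(next_state)
    return states


def sample_arrival(state, arrival_probs, rng):
    """Sample a customer type given the current system state.

    arrival_probs[state][j] is the (assumed) probability that a customer
    of type j arrives; the remaining mass means no arrival (None).
    """
    probs = list(arrival_probs[state])
    no_arrival = max(0.0, 1.0 - sum(probs))
    outcomes = list(range(len(probs))) + [None]
    return rng.choices(outcomes, weights=probs + [no_arrival])[0]


def accept(reward, resources_needed, bid_prices, capacity):
    """Bid-price check: serve the customer only if every required resource
    has capacity left and the reward covers the sum of the bid prices.

    bid_prices holds the prices for the current state and period
    (hypothetical inputs here).
    """
    if any(capacity[i] < 1 for i in resources_needed):
        return False
    return reward >= sum(bid_prices[i] for i in resources_needed)
```

Because the arrival distribution depends on the current state and the state persists or shifts according to the chain, arrivals in nearby periods are correlated even though each period's arrival is sampled independently given the state.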