Online Resource Allocation under Horizon Uncertainty

We study stochastic online resource allocation: a decision maker needs to allocate limited resources to stochastically generated, sequentially arriving requests in order to maximize reward. At each time step, requests are drawn independently from a distribution that is unknown to the decision maker. Online resource allocation and its special cases have been studied extensively, but prior results rely crucially and universally on the strong assumption that the total number of requests (the horizon) is known to the decision maker in advance. In many applications, such as revenue management and online advertising, the number of requests can vary widely because of fluctuations in demand or user traffic intensity. In this work, we develop online algorithms that are robust to horizon uncertainty. In sharp contrast to the known-horizon setting, no algorithm can achieve even a constant asymptotic competitive ratio that is independent of the horizon uncertainty. We introduce a novel generalization of dual mirror descent that allows the decision maker to specify a schedule of time-varying target consumption rates, and we prove corresponding performance guarantees. We then give a fast algorithm for computing a schedule of target consumption rates that leads to near-optimal performance in the unknown-horizon setting. In particular, our competitive ratio attains the optimal rate of growth (up to logarithmic factors) as the horizon uncertainty grows large. Finally, we provide a way to incorporate machine-learned predictions about the horizon, which interpolates between the known-horizon and unknown-horizon settings.
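To make the idea of a schedule of target consumption rates concrete, the following is a minimal sketch, not the algorithm analyzed in the paper. It assumes the simplest special case of dual mirror descent (a Euclidean mirror map, i.e. projected dual subgradient descent) and binary accept/reject decisions. The function name `dual_descent_with_targets` and all parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np

def dual_descent_with_targets(rewards, costs, budget, targets, step_size):
    """Illustrative accept/reject policy driven by dual prices.

    rewards: shape (T,), reward of each request if accepted.
    costs: shape (T, m), resource consumption of each request if accepted.
    budget: shape (m,), total amount available of each resource.
    targets: shape (T, m), target consumption for each time step
             (the schedule; an assumption supplied by the caller).
    step_size: positive step size for the dual update.
    """
    rewards = np.asarray(rewards, dtype=float)
    costs = np.asarray(costs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    remaining = np.asarray(budget, dtype=float).copy()
    mu = np.zeros_like(remaining)  # dual prices, one per resource
    decisions = np.zeros(len(rewards), dtype=bool)
    total_reward = 0.0

    for t in range(len(rewards)):
        # Accept only if the reward beats the priced cost and the budget allows it.
        profitable = rewards[t] > mu @ costs[t]
        feasible = np.all(costs[t] <= remaining)
        used = costs[t] if (profitable and feasible) else np.zeros_like(remaining)
        if profitable and feasible:
            remaining -= used
            total_reward += rewards[t]
            decisions[t] = True
        # Dual step: raise prices when consumption exceeds the target for
        # this step, lower them otherwise, and project onto mu >= 0.
        mu = np.maximum(0.0, mu - step_size * (targets[t] - used))

    return decisions, total_reward, remaining
```

Under these assumptions, setting every row of `targets` to `budget / T` gives the familiar constant pacing used when the horizon `T` is known, while a non-uniform schedule (for example, one that spends faster early on) is the kind of input the generalization described above allows the decision maker to specify.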
