Online linear programming (OLP) has found broad applications in revenue management and resource allocation. State-of-the-art OLP algorithms achieve low regret by repeatedly solving linear programming (LP) subproblems that incorporate updated resource information. However, LP-based methods are computationally expensive and often inefficient for large-scale applications. In contrast, recent first-order OLP algorithms are more computationally efficient but typically suffer from worse regret guarantees. To address these shortcomings, we propose a new algorithm that combines the strengths of LP-based and first-order OLP methods. The algorithm re-solves the LP subproblems periodically at a predefined frequency $f$ and uses the latest dual prices to guide online decision-making. In addition, a first-order method runs in parallel during each interval between LP re-solves, smoothing resource consumption. Our algorithm achieves $\mathscr{O}(\log (T/f) + \sqrt{f})$ regret, delivering a "wait-less" online decision-making process that balances the computational efficiency of first-order methods and the superior regret guarantee of LP-based methods.
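The interplay between periodic LP re-solves and first-order updates can be illustrated with a minimal sketch. It is not the authors' implementation but one possible reading of the description above, under loudly stated assumptions: a single resource (so the LP dual price can be computed exactly by a greedy ratio sort instead of a general LP solver), $f$ interpreted as the number of arrivals between re-solves (consistent with the bound, which reduces to $\mathscr{O}(\log T)$ at $f = 1$ and $\mathscr{O}(\sqrt{T})$ at $f = T$), a projected subgradient step as the first-order method, and a step size of $1/\sqrt{f}$. The names `lp_dual_price`, `run_olp`, `step`, and `target` are hypothetical.

```python
import math
import random


def lp_dual_price(rewards, costs, capacity):
    """Dual price of the single-resource LP (assumes all costs > 0):
    max sum r_j x_j  s.t.  sum a_j x_j <= capacity, 0 <= x_j <= 1."""
    # Fractional-knapsack argument: fill by descending reward/cost ratio;
    # the ratio of the item that exhausts capacity is an optimal dual price.
    order = sorted(range(len(rewards)),
                   key=lambda j: rewards[j] / costs[j], reverse=True)
    used = 0.0
    for j in order:
        used += costs[j]
        if used >= capacity:
            return rewards[j] / costs[j]
    return 0.0  # capacity never binds, so the resource has zero price


def run_olp(rewards, costs, budget, f, step=None):
    """Hybrid sketch: LP re-solve every f arrivals, subgradient steps in between."""
    T = len(rewards)
    step = 1.0 / math.sqrt(f) if step is None else step  # assumed step size
    remaining, price, total, resolves = budget, 0.0, 0.0, 0
    target = budget / T  # average per-period budget the dual update tracks
    for t in range(T):
        if t > 0 and t % f == 0:
            # Re-solve on the arrivals seen so far, with capacity scaled to
            # the updated average remaining budget (assumed LP subproblem).
            target = remaining / (T - t)
            price = lp_dual_price(rewards[:t], costs[:t], t * target)
            resolves += 1
        r, a = rewards[t], costs[t]
        # Accept when the reward beats the priced cost and the budget allows.
        accept = r > a * price and a <= remaining
        if accept:
            remaining -= a
            total += r
        # First-order step: raise the price when consumption exceeds the
        # target rate, lower it otherwise, projecting onto price >= 0.
        price = max(0.0, price + step * ((a if accept else 0.0) - target))
    return total, remaining, resolves


if __name__ == "__main__":
    rng = random.Random(0)
    T = 1000
    rewards = [rng.random() for _ in range(T)]
    costs = [0.5 + rng.random() for _ in range(T)]
    print(run_olp(rewards, costs, budget=0.25 * T, f=50))
```

In this sketch, a smaller `f` means more re-solves (higher cost, with the price anchored more often to an LP solution), while a larger `f` leans more on the cheap subgradient steps, mirroring the $\log(T/f)$ versus $\sqrt{f}$ trade-off in the stated regret bound.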