Decision-makers often have access to a machine-learned prediction of demand, referred to as advice, which can be used in online decision-making for resource allocation. Exploiting such advice is challenging, however, because it may be inaccurate. To address this issue, we propose a framework that augments online resource allocation decisions with potentially unreliable machine-learned (ML) advice. We assume that the advice is represented by a general convex uncertainty set for the demand vector. We introduce a parameterized class of Pareto optimal online resource allocation algorithms that trade off the consistent ratio against the robust ratio. The consistent ratio measures the algorithm's performance (relative to the optimal hindsight solution) when the ML advice is accurate, while the robust ratio captures its performance under an adversarial demand process when the advice is inaccurate. Specifically, in the C-Pareto optimal setting, we maximize the robust ratio subject to the consistent ratio being at least C. Our proposed C-Pareto optimal algorithm is an adaptive protection level algorithm, which extends the classical fixed protection level algorithm introduced in Littlewood (2005) and Ball and Queyranne (2009). The adaptive protection levels are characterized by the solution to a complex non-convex continuous optimization problem. To complement our algorithms, we present a simple method for computing the maximum achievable consistent ratio, which serves as an estimate of the maximum value of the ML advice. Additionally, we present numerical studies that evaluate the performance of our algorithm against benchmark algorithms. The results demonstrate that, by adjusting the parameter C, our algorithms effectively balance worst-case and average performance and outperform the benchmark algorithms.
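For readers unfamiliar with protection levels, the following is a minimal sketch of a classical fixed protection level rule for two fare classes, the kind of policy the adaptive algorithm extends. It is not the adaptive algorithm proposed here. The function names, the fares (100 and 60), the capacity (10), and the protection level (4) are all illustrative assumptions and do not come from the paper.

```python
def fixed_protection_level(requests, capacity, protection):
    """Revenue of a fixed protection level policy (illustrative sketch).

    requests   : list of (fare_class, fare) pairs in arrival order,
                 with fare_class either "high" or "low"
    capacity   : total number of units available
    protection : units reserved for high-fare requests (assumed given)
    """
    remaining = capacity
    revenue = 0.0
    for fare_class, fare in requests:
        if remaining == 0:
            break
        # High-fare requests are accepted whenever capacity remains.
        # Low-fare requests are accepted only while the remaining
        # capacity strictly exceeds the protection level.
        if fare_class == "high" or remaining > protection:
            remaining -= 1
            revenue += fare
    return revenue


def hindsight_optimum(requests, capacity):
    """Revenue of the optimal hindsight solution: keep the highest fares."""
    fares = sorted((fare for _, fare in requests), reverse=True)
    return float(sum(fares[:capacity]))


# Hypothetical demand sequences: low-fare requests arrive first.
mixed = [("low", 60.0)] * 10 + [("high", 100.0)] * 5
low_only = [("low", 60.0)] * 10

ratio_mixed = fixed_protection_level(mixed, 10, 4) / hindsight_optimum(mixed, 10)
ratio_low_only = fixed_protection_level(low_only, 10, 4) / hindsight_optimum(low_only, 10)
```

On these hypothetical sequences the policy earns 760 of a possible 800 when high-fare demand materializes (ratio 0.95), but only 360 of a possible 600 when it does not (ratio 0.6). This gap between performance under favorable and adversarial demand is the kind of trade-off that the consistent and robust ratios formalize.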