Online optimization with memory costs arises in many real-world applications, where actions must be chosen sequentially without knowledge of future inputs. The memory cost couples the actions over time, which adds substantial challenges. Conventionally, this problem has been approached with expert-designed online algorithms that aim for bounded worst-case competitive ratios, but their average performance is often unsatisfactory. On the other hand, emerging machine learning (ML) based optimizers can improve the average performance, but they lack worst-case robustness. In this paper, we propose a novel expert-robustified learning (ERL) approach that achieves both good average performance and robustness. More concretely, for robustness, ERL introduces a novel projection operator that robustifies ML actions by utilizing an expert online algorithm; for average performance, ERL trains an ML optimizer with a recurrent architecture while explicitly accounting for the downstream expert robustification. We prove that, for any $\lambda\geq1$, ERL is $\lambda$-competitive against the expert algorithm and $\lambda\cdot C$-competitive against the optimal offline algorithm (where $C$ is the expert's competitive ratio). Additionally, we extend our analysis to a novel setting of multi-step memory costs. Finally, we support our analysis with empirical experiments on an energy scheduling application.
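To make the robustification idea concrete, the following is a minimal sketch and not the projection operator defined in the paper. It assumes scalar actions, a quadratic hitting cost $(x - y_t)^2$, and an absolute-value one-step switching cost $|x_t - x_{t-1}|$; the function names `robustify` and `run_erl_sketch`, the reservation term $|x_t - x_t^{\pi}|$, and the toy expert are all hypothetical choices made for illustration. Under these assumptions, the triangle inequality ensures that the expert's action always satisfies the constraint, so the cumulative cost stays within $\lambda$ times the expert's cost at every step.

```python
from typing import Callable, Sequence, Tuple


def robustify(x_ml: float, x_exp: float, x_prev: float,
              cost_so_far: float, exp_cost_so_far: float,
              hit: Callable[[float], float], lam: float,
              iters: int = 60) -> float:
    """Return the point closest to the ML action x_ml that keeps
    (own cumulative cost + distance to expert) <= lam * expert cumulative cost.

    exp_cost_so_far must already include the expert's cost at the current step.
    The distance term is an assumed reservation that pays for switching back
    to the expert at the next step.
    """
    def feasible(x: float) -> bool:
        step = hit(x) + abs(x - x_prev)  # hitting + switching cost
        return cost_so_far + step + abs(x - x_exp) <= lam * exp_cost_so_far + 1e-12

    if feasible(x_ml):
        return x_ml
    # The constraint is convex in x and holds at x_exp, so the feasible
    # fractions of the way from x_exp to x_ml form an interval [0, a*].
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if feasible(x_exp + mid * (x_ml - x_exp)):
            lo = mid
        else:
            hi = mid
    return x_exp + lo * (x_ml - x_exp)


def run_erl_sketch(targets: Sequence[float], ml_actions: Sequence[float],
                   exp_actions: Sequence[float], lam: float,
                   x0: float = 0.0) -> Tuple[float, float]:
    """Run the robustified policy; return (its total cost, expert total cost)."""
    x_prev = xe_prev = x0
    cost = exp_cost = 0.0
    for y, x_ml, x_exp in zip(targets, ml_actions, exp_actions):
        hit = lambda x, y=y: (x - y) ** 2  # assumed quadratic hitting cost
        exp_cost += hit(x_exp) + abs(x_exp - xe_prev)
        x = robustify(x_ml, x_exp, x_prev, cost, exp_cost, hit, lam)
        cost += hit(x) + abs(x - x_prev)
        x_prev, xe_prev = x, x_exp
    return cost, exp_cost


if __name__ == "__main__":
    targets = [1.0, 3.0, 2.0, 5.0, 0.0]
    # Toy expert (hypothetical): move halfway toward each new target.
    exp_actions, xe = [], 0.0
    for y in targets:
        xe = 0.5 * (xe + y)
        exp_actions.append(xe)
    ml_actions = [-10.0] * len(targets)  # deliberately poor ML predictions
    print(run_erl_sketch(targets, ml_actions, exp_actions, lam=1.5))
```

In this sketch, a larger $\lambda$ leaves more room to follow the ML action, while $\lambda = 1$ forces the cumulative cost to be no worse than the expert's.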