Improved Regret Guarantees for Online Mirror Descent using a Portfolio of Mirror Maps

Online mirror descent (OMD) and its variants provide a flexible framework for online convex optimization (OCO), in which performance depends crucially on the choice of mirror map. While the geometries underlying online projected gradient descent (OPGD) and online exponentiated gradient (OEG), both special cases of OMD, are well understood, how to construct an optimal mirror map for a given constraint set and a general family of loss functions, e.g., sparse losses, remains a challenging open question. Motivated by the goal of parameterizing a near-optimal set of mirror maps, we consider a simpler question: can mirror maps for geometries that interpolate between $L_1$ and $L_2$ yield polynomial gains in regret that may not be achievable using only OEG ($L_1$) or OPGD ($L_2$)? Our main result answers this question positively. We show that mirror maps based on block norms adapt better to the sparsity of the loss functions than previous $L_p$ interpolations (for $p \in [1, 2]$). In particular, we construct a family of OCO instances in $\mathbb{R}^d$ on which block norm-based mirror maps achieve a provable polynomial (in $d$) improvement in regret over OEG and OPGD for sparse loss functions. We then turn to the setting in which the sparsity level of the loss functions is unknown. In this case, the choice of geometry itself becomes an online decision problem. We first show that naively switching between OEG and OPGD can incur linear regret, highlighting the intrinsic difficulty of geometry selection. To overcome this issue, we propose a meta-algorithm based on multiplicative weights that dynamically selects among a family of uniform block norms. We show that this approach effectively tunes OMD to the sparsity of the losses, yielding adaptive regret guarantees. Overall, our results demonstrate that online mirror-map selection can significantly enhance the ability of OMD to exploit sparsity in online convex optimization.
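To make the two ingredients named above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's construction. It assumes a uniform block norm is the sum of Euclidean norms over equal-size contiguous blocks (so $d$ blocks recover $L_1$ and one block recovers $L_2$), and it shows a generic multiplicative-weights step over a portfolio of base learners. The names `block_norm`, `hedge_update`, and `combine` are hypothetical, and playing the weighted average of the base learners' points is one common way to use the weights, which may differ from the selection rule the paper analyzes.

```python
import numpy as np


def block_norm(x, n_blocks):
    """Sum over blocks of the L2 norm within each block.

    Assumption: blocks are contiguous and equal-size, so len(x) must be
    divisible by n_blocks. n_blocks == len(x) gives the L1 norm and
    n_blocks == 1 gives the L2 norm.
    """
    blocks = np.split(np.asarray(x, dtype=float), n_blocks)
    return float(sum(np.linalg.norm(b) for b in blocks))


def hedge_update(weights, expert_losses, eta):
    """One multiplicative-weights step over the portfolio.

    Each base learner (for example, OMD with one particular block norm)
    is down-weighted exponentially in the loss it just incurred.
    """
    w = np.asarray(weights, dtype=float)
    w = w * np.exp(-eta * np.asarray(expert_losses, dtype=float))
    return w / w.sum()


def combine(weights, expert_points):
    """Weighted average of the base learners' proposed points.

    For convex losses, the loss at this point is at most the weighted
    average of the base learners' losses (Jensen's inequality).
    """
    return np.asarray(weights, dtype=float) @ np.asarray(expert_points, dtype=float)


# Usage: the same vector measured under three block structures.
x = [3.0, 4.0, 1.0, 0.0]
norms = {n: block_norm(x, n) for n in (1, 2, 4)}

# Usage: two base learners, the second of which just incurred loss 1.
w = hedge_update([0.5, 0.5], [0.0, 1.0], eta=np.log(2.0))
point = combine(w, [[1.0, 0.0], [0.0, 1.0]])
```

In a full online loop, each base learner would run its own OMD update on the observed loss every round, while `hedge_update` tracks which geometry has performed best so far.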
