While traditional distributionally robust optimization (DRO) aims to minimize the maximal risk over a set of distributions, Agarwal and Zhang (2022) recently proposed a variant that replaces risk with excess risk. Compared to DRO, the new formulation, minimax excess risk optimization (MERO), has the advantage of suppressing the effect of heterogeneous noise across distributions. However, the choice of excess risk leads to a very challenging minimax optimization problem, and currently there exists only an inefficient algorithm for empirical MERO. In this paper, we develop efficient stochastic approximation approaches that directly target MERO. Specifically, we leverage techniques from stochastic convex optimization to estimate the minimal risk of every distribution, and solve MERO as a stochastic convex-concave optimization (SCCO) problem with biased gradients. The presence of bias makes existing theoretical guarantees for SCCO inapplicable; fortunately, we demonstrate that the bias, caused by the estimation error of the minimal risk, is under control. Thus, MERO can still be optimized with a nearly optimal convergence rate. Moreover, we investigate a practical scenario in which the number of samples drawn from each distribution may differ, and propose a stochastic approach that delivers distribution-dependent convergence rates.
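To make the two-stage approach concrete, the following is a minimal sketch, not the paper's algorithm: it first estimates each distribution's minimal risk by plain SGD (a stand-in for the stochastic-convex-optimization estimator), then runs stochastic gradient descent-ascent on the resulting biased convex-concave saddle problem, with exponentiated-gradient (mirror) ascent over the simplex. The toy setup (linear regression tasks with heterogeneous noise levels) and all step sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: m distributions, each a linear-regression task with its own
# noise level -- the heterogeneous-noise setting MERO is designed for.
m, d = 3, 5
true_w = [rng.normal(size=d) for _ in range(m)]
noise = [0.1, 0.5, 1.0]

def sample(i, n=1):
    """Draw n (x, y) pairs from distribution P_i."""
    x = rng.normal(size=(n, d))
    y = x @ true_w[i] + noise[i] * rng.normal(size=n)
    return x, y

def loss_grad(w, x, y):
    """Squared loss and its gradient on a mini-batch."""
    r = x @ w - y
    return 0.5 * np.mean(r ** 2), x.T @ r / len(y)

# Stage 1: estimate the minimal risk r_hat[i] of each distribution by SGD,
# then evaluate the learned model on a fresh large sample.
r_hat = np.zeros(m)
for i in range(m):
    w = np.zeros(d)
    for t in range(1, 2001):
        x, y = sample(i, 8)
        _, g = loss_grad(w, x, y)
        w -= 0.5 / np.sqrt(t) * g
    xs, ys = sample(i, 4000)
    r_hat[i], _ = loss_grad(w, xs, ys)

# Stage 2: stochastic descent-ascent on the (biased) SCCO problem
#   min_w max_{q in simplex} sum_i q_i * (R_i(w) - r_hat[i]).
# The bias comes from r_hat approximating the true minimal risks.
w = np.zeros(d)
q = np.full(m, 1.0 / m)
for t in range(1, 3001):
    losses, grads = np.zeros(m), np.zeros((m, d))
    for i in range(m):
        x, y = sample(i, 8)
        losses[i], grads[i] = loss_grad(w, x, y)
    w -= 0.5 / np.sqrt(t) * (q @ grads)    # gradient descent in w
    q *= np.exp(0.1 * (losses - r_hat))    # exponentiated-gradient ascent in q
    q /= q.sum()                            # project back onto the simplex
```

The weights `q` concentrate on the distributions with the largest estimated excess risk, rather than the largest raw risk, which is how the formulation discounts purely noise-driven risk differences.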