This paper investigates group distributionally robust optimization (GDRO), whose goal is to learn a model that performs well over $m$ different distributions. First, we formulate GDRO as a stochastic convex-concave saddle-point problem and solve it by stochastic mirror descent (SMD) with $m$ samples per iteration, attaining a nearly optimal sample complexity. To reduce the number of samples required in each round from $m$ to $1$, we cast GDRO as a two-player game in which one player runs SMD and the other executes an online algorithm for non-oblivious multi-armed bandits, while maintaining the same sample complexity. Next, we extend GDRO to scenarios with imbalanced data and heterogeneous distributions. In the first scenario, we introduce a weighted variant of GDRO that enables distribution-dependent convergence rates, which scale with the number of samples available from each distribution. We design two strategies to meet the sample budget: one integrates non-uniform sampling into SMD, and the other employs the stochastic mirror-prox algorithm with mini-batches; both deliver faster rates for distributions with more samples. In the second scenario, we propose optimizing the average top-$k$ risk instead of the maximum risk, thereby mitigating the impact of outlier distributions. As with vanilla GDRO, we develop two stochastic approaches: one uses $m$ samples per iteration via SMD, and the other consumes $k$ samples per iteration through an online algorithm for non-oblivious combinatorial semi-bandits.
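To make the first approach concrete, the sketch below illustrates SMD on the GDRO saddle-point problem $\min_x \max_{q \in \Delta_m} \sum_i q_i R_i(x)$, drawing one stochastic sample from each of the $m$ distributions per round. This is only a minimal illustration, not the paper's algorithm: the group risks $R_i(x) = \tfrac{1}{2}\|x - c_i\|^2$, the fixed step sizes, and the Gaussian sampling noise are all stand-in assumptions. The model variable $x$ takes a Euclidean gradient step, while the distribution weights $q$ take an entropic (exponentiated-gradient) ascent step on the simplex, and the averaged iterate is returned.

```python
import numpy as np

def gdro_smd(centers, T=2000, eta_x=0.05, eta_q=0.05, noise=0.1, seed=0):
    """Stochastic mirror descent for min_x max_{q in simplex} sum_i q_i R_i(x).

    Illustrative setup (an assumption, not from the paper): the i-th group
    risk is R_i(x) = 0.5 * ||x - c_i||^2, and each round observes one noisy
    sample c_i + xi per distribution, i.e., m samples per iteration.
    """
    rng = np.random.default_rng(seed)
    m, d = centers.shape
    x = np.zeros(d)                  # model iterate (Euclidean mirror map)
    q = np.full(m, 1.0 / m)          # simplex iterate (entropic mirror map)
    x_sum = np.zeros(d)
    for _ in range(T):
        # one stochastic sample per distribution -> m samples this round
        noisy = centers + noise * rng.standard_normal((m, d))
        risks = 0.5 * np.sum((x - noisy) ** 2, axis=1)  # stochastic R_i(x)
        grads = x - noisy                # unbiased gradient of each R_i at x
        x = x - eta_x * grads.T @ q      # descent step on the q-weighted risk
        q = q * np.exp(eta_q * risks)    # exponentiated-gradient ascent on q
        q /= q.sum()                     # project back onto the simplex
        x_sum += x
    return x_sum / T, q                  # averaged iterate, final weights

# Toy instance: three groups whose robust solution balances all risks.
centers = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
x_bar, q = gdro_smd(centers)
```

In this toy instance the saddle point sits at the origin, where all three group risks are equal, so the averaged iterate `x_bar` drifts toward $(0, 0)$ as $T$ grows; the paper's two-player-game variant would replace the full risk vector `risks` with a single bandit-style observation per round.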