Wasserstein distributionally robust optimization (\textsf{WDRO}) is a popular model for enhancing the robustness of machine learning with ambiguous data. However, the complexity of \textsf{WDRO} can be prohibitive in practice, since solving its ``minimax'' formulation requires a large amount of computation. Recently, several fast \textsf{WDRO} training algorithms have been developed for specific machine learning tasks (e.g., logistic regression). However, to the best of our knowledge, research on efficient algorithms for general large-scale \textsf{WDRO} problems is still quite limited. A \textit{coreset} is an important tool for compressing large datasets, and it has therefore been widely applied to reduce the computational complexity of many optimization problems. In this paper, we introduce a unified framework for constructing an $\epsilon$-coreset for general \textsf{WDRO} problems. Although it is challenging to obtain a conventional coreset for \textsf{WDRO} because of the uncertainty of the ambiguous data, we show that a ``dual coreset'' can be computed by using the strong duality property of \textsf{WDRO}. Moreover, the error that the dual coreset introduces into the original \textsf{WDRO} objective is theoretically guaranteed. To construct the dual coreset, we propose a novel grid sampling approach that is particularly suitable for the dual formulation of \textsf{WDRO}. Finally, we implement our coreset approach and illustrate its effectiveness on several \textsf{WDRO} problems in experiments.
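For orientation, the ``minimax'' formulation and the strong duality referred to in the abstract are commonly written as follows in the \textsf{WDRO} literature. This is only a sketch under assumed notation that the abstract itself does not fix: $\ell(\theta;\xi)$ is a loss with parameter $\theta$, $\mathbb{P}_n$ is the empirical distribution of the samples $\xi_1,\dots,\xi_n$, $W_p$ is the $p$-Wasserstein distance induced by a ground metric $d$, and $\sigma \ge 0$ is the radius of the ambiguity set. The paper's own formulation and regularity conditions may differ.

\[
\min_{\theta}\ \sup_{\mathbb{P}\,:\,W_p(\mathbb{P},\mathbb{P}_n)\le\sigma}\ \mathbb{E}_{\xi\sim\mathbb{P}}\bigl[\ell(\theta;\xi)\bigr]
\]

\[
\sup_{\mathbb{P}\,:\,W_p(\mathbb{P},\mathbb{P}_n)\le\sigma}\ \mathbb{E}_{\xi\sim\mathbb{P}}\bigl[\ell(\theta;\xi)\bigr]
\;=\;
\inf_{\lambda\ge 0}\ \Bigl\{\lambda\sigma^{p}+\frac{1}{n}\sum_{i=1}^{n}\ \sup_{\xi}\bigl[\ell(\theta;\xi)-\lambda\,d(\xi,\xi_i)^{p}\bigr]\Bigr\}
\]

The second identity holds under suitable regularity conditions on $\ell$ and $d$. Its right-hand side is a finite sum over the $n$ samples, and an $\epsilon$-coreset in the usual sense would be a small weighted subset of the samples whose weighted sum approximates this sum within a factor of $1\pm\epsilon$ over the relevant range of $\theta$ and $\lambda$.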