Wasserstein distributionally robust optimization has recently emerged as a powerful framework for robust estimation, enjoying good out-of-sample performance guarantees, well-understood regularization effects, and computationally tractable reformulations. In this framework, the estimator is obtained by minimizing the worst-case expected loss over all probability distributions that are close, in a Wasserstein sense, to the empirical distribution. In this paper, we propose a Wasserstein distributionally robust estimation framework to estimate an unknown parameter from noisy linear measurements, and we focus on analyzing the squared error performance of such estimators. Our study is carried out in the modern high-dimensional proportional regime, where both the ambient dimension and the number of samples go to infinity at a proportional rate that encodes the under/over-parametrization of the problem. Under an isotropic Gaussian features assumption, we show that the squared error can be recovered as the solution of a convex-concave optimization problem which, surprisingly, involves at most four scalar variables. Importantly, the precise quantification of the squared error makes it possible to compare different ambiguity radii accurately and efficiently, and to understand the effect of the under/over-parametrization on the estimation error. We conclude the paper with a list of exciting research directions enabled by our results.
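For orientation, the setup described above can be sketched in symbols. The notation is an assumption made here for illustration and is not taken from the paper: $\theta_0 \in \mathbb{R}^d$ is the unknown parameter, $(x_i, y_i)_{i=1}^n$ are the measurements, $\widehat{P}_n$ is their empirical distribution, $\ell$ is a generic loss, $W_p$ is a Wasserstein distance of unspecified order $p$, $\varepsilon$ is the ambiguity radius, and $\rho$ is the limiting ratio of dimension to sample size. The scaling of the Gaussian features and any normalization of the squared error are likewise left unspecified.

```latex
% Noisy linear measurements with isotropic Gaussian features (scaling assumed)
y_i = \theta_0^\top x_i + z_i, \qquad x_i \sim \mathcal{N}(0, I_d), \qquad i = 1, \dots, n

% Wasserstein distributionally robust estimator: minimize the worst-case
% expected loss over the Wasserstein ball of radius \varepsilon around \widehat{P}_n
\widehat{\theta} \in \arg\min_{\theta \in \mathbb{R}^d} \;
  \sup_{Q \,:\, W_p(Q, \widehat{P}_n) \le \varepsilon} \;
  \mathbb{E}_{(x, y) \sim Q} \big[ \ell\big(y - \theta^\top x\big) \big]

% Proportional regime and the quantity of interest
d, n \to \infty \quad \text{with} \quad \frac{d}{n} \to \rho \in (0, \infty),
\qquad \text{squared error: } \big\| \widehat{\theta} - \theta_0 \big\|_2^2
```

Under this reading, $\rho > 1$ would correspond to the over-parametrized case and $\rho < 1$ to the under-parametrized one, and comparing ambiguity radii amounts to comparing the limiting squared error across values of $\varepsilon$.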