We review distributionally robust optimization (DRO), a principled approach for constructing statistical estimators that hedge against the impact of deviations in the expected loss between the training and deployment environments. Many well-known estimators in statistics and machine learning (e.g., AdaBoost, LASSO, ridge regression, and dropout training) are distributionally robust in a precise sense. We hope that, by discussing the DRO interpretation of these well-known estimators, statisticians who are less familiar with DRO can access the DRO literature through the bridge between classical results and their equivalent DRO formulations. On the other hand, robustness in statistics has a rich tradition associated with removing the impact of contamination. Thus, another objective of this paper is to clarify the difference between DRO and classical statistical robustness. As we will see, these are two fundamentally different philosophies leading to completely different types of estimators. In DRO, the statistician hedges against an environment shift that occurs after the decision is made; thus DRO estimators tend to be pessimistic in an adversarial setting, leading to a min-max type formulation. In classical robust statistics, the statistician seeks to correct contamination that occurred before the decision is made; thus robust statistical estimators tend to be optimistic, leading to a min-min type formulation.