Approximation Algorithms for Distributionally Robust Stochastic Optimization with Black-Box Distributions

Two-stage stochastic optimization is a framework for modeling uncertainty in which we are given a probability distribution over possible realizations of the data, called scenarios, and decisions are taken in two stages: we make first-stage decisions knowing only the underlying distribution, before a scenario is realized, and may take additional second-stage recourse actions after a scenario is realized. The goal is typically to minimize the total expected cost. A criticism of this model is that the underlying probability distribution is itself often imprecise. To address this, a versatile approach that has been proposed is the {\em distributionally robust 2-stage model}: given a collection of probability distributions, the goal is to minimize the maximum expected total cost with respect to a distribution in this collection. We provide a framework for designing approximation algorithms in such settings when the collection is a ball around a central distribution and the central distribution is accessed {\em only via a sampling black box}. We first show that one can utilize the {\em sample average approximation} (SAA) method to reduce the problem to the case where the central distribution has {\em polynomial-size} support. We then show how to approximately solve a fractional relaxation of the SAA (i.e., polynomial-scenario central-distribution) problem. By complementing this with LP-rounding algorithms that provide {\em local} (i.e., per-scenario) approximation guarantees, we obtain the {\em first} approximation algorithms for the distributionally robust versions of a variety of discrete-optimization problems including set cover, vertex cover, edge cover, facility location, and Steiner tree, with guarantees that are, except for set cover, within $O(1)$ factors of the guarantees known for the deterministic versions of these problems.
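
As a rough sketch of the model described above, the objective can be written out explicitly. The notation here is an assumption introduced only for illustration and is not taken from the source: $x$ denotes the first-stage decisions with cost $c(x)$, $g(x, A)$ denotes the optimal second-stage (recourse) cost in scenario $A$ given $x$, $\hat{p}$ is the central distribution, and $d$ is some metric on distributions with radius $r \ge 0$ (the choice of $d$ is left unspecified here). Under these assumptions, the distributionally robust 2-stage problem reads

\[
\min_{x} \Bigl( c(x) + \max_{q \,:\, d(\hat{p}, q) \le r} \mathbb{E}_{A \sim q}\bigl[ g(x, A) \bigr] \Bigr).
\]

In the same assumed notation, the SAA problem replaces $\hat{p}$ by the empirical distribution $\tilde{p}$ of $N$ independent samples $A_1, \dots, A_N$ drawn from the black box,

\[
\tilde{p}(A) = \frac{\lvert \{\, i : A_i = A \,\} \rvert}{N},
\]

so the central distribution has support of size at most $N$. Setting $r = 0$ collapses the ball to $\{\hat{p}\}$ and recovers the standard 2-stage model of minimizing the total expected cost.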
