We present $\texttt{FlowDRO}$, a computationally efficient framework for solving flow-based distributionally robust optimization (DRO) problems with Wasserstein uncertainty sets, which aims to find a continuous worst-case distribution (also called the Least Favorable Distribution, LFD) and to sample from it. Requiring the LFD to be continuous lets the algorithm scale to problems with larger sample sizes and gives the induced robust algorithms better generalization capability. To tackle the computationally challenging infinite-dimensional optimization problem, we leverage flow-based models and continuous-time invertible transport maps between the data distribution and the target distribution, and develop a Wasserstein proximal gradient flow type algorithm. In theory, we establish the equivalence of the solution obtained via the optimal transport map to the original formulation, as well as the dual form of the problem, through Wasserstein calculus and Brenier's theorem. In practice, we parameterize the transport maps by a sequence of neural networks progressively trained in blocks by gradient descent. We demonstrate its usage in adversarial learning, distributionally robust hypothesis testing, and a new mechanism for data-driven distribution perturbation differential privacy, where the proposed method gives strong empirical performance on high-dimensional real data.
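To make the block-wise training idea concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that each block solves a penalized problem of the form $\max_T \mathbb{E}_{x \sim P}\left[r(T(x)) - \frac{1}{2\gamma}\|T(x) - x\|^2\right]$, with a hypothetical one-dimensional risk $r(y) = \frac{1}{2}y^2$ and an affine map $T(x) = ax + b$ standing in for a neural network. The names `risk_grad`, `train_block`, and `flow`, the step size, and the sample values are all illustrative assumptions.

```python
# Toy sketch (assumptions throughout): block-wise proximal steps with an
# affine transport map T(x) = a*x + b in place of a neural network.

def risk_grad(y):
    """Gradient of the hypothetical toy risk r(y) = 0.5 * y**2."""
    return y

def train_block(xs, gamma, lr=0.05, steps=1000):
    """Fit one block by gradient ascent on
    mean[ r(T(x)) - (T(x) - x)**2 / (2 * gamma) ]."""
    a, b = 1.0, 0.0  # start from the identity map
    n = len(xs)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for x in xs:
            y = a * x + b
            # derivative of the per-sample objective with respect to y
            g = risk_grad(y) - (y - x) / gamma
            grad_a += g * x / n  # chain rule: dy/da = x
            grad_b += g / n      # chain rule: dy/db = 1
        a += lr * grad_a
        b += lr * grad_b
    return a, b

def flow(xs, gamma, num_blocks):
    """Train blocks one after another; each block takes the previous
    block's output samples as its input distribution."""
    blocks = []
    for _ in range(num_blocks):
        a, b = train_block(xs, gamma)
        blocks.append((a, b))
        xs = [a * x + b for x in xs]
    return blocks, xs

samples = [-1.0, 0.5, 2.0, 3.0]
blocks, worst_case = flow(samples, gamma=0.5, num_blocks=2)
```

For this toy risk and $\gamma < 1$, the per-block maximizer has the closed form $T(x) = x/(1-\gamma)$, so with $\gamma = 0.5$ each trained block should approach $a = 2$, $b = 0$, pushing samples away from the minimizer of $r$. Because each block is an explicit map rather than a set of moved particles, it can also be applied to fresh samples, which is the property the continuity requirement on the LFD is meant to provide.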