Matching a source probability measure to a target one is often done by instantiating a linear optimal transport (OT) problem, parameterized by a ground cost function that quantifies the discrepancy between points. When both measures live in the same metric space, the ground cost often defaults to that space's distance. When the problem is instantiated across two different spaces, however, choosing that cost in the absence of aligned data is a conundrum. As a result, practitioners often resort to solving a quadratic Gromov-Wasserstein (GW) problem instead. In this work, we exploit a parallel between GW and cost-regularized OT, the regularized minimization of a linear OT objective parameterized by a ground cost. We use this cost-regularized formulation to match measures across two different Euclidean spaces, where the cost is evaluated between transformed source points and target points. We show that several quadratic OT problems fall into this category, and consider enforcing structure in the linear transform (e.g. sparsity) by introducing structure-inducing regularizers. We provide a proximal algorithm to extract such transforms from unaligned data, and demonstrate its applicability to single-cell spatial transcriptomics/multiomics matching tasks.
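To make the idea of cost-regularized OT with a structured linear transform concrete, the following is a minimal sketch and not the algorithm of this work. It assumes an inner-product cost c_A(x, y) = -<A x, y>, a regularizer 0.5 * ||A||_F^2 + lam * ||A||_1 on the transform A, uniform weights on both point clouds, and an entropic (Sinkhorn) solver for the transport step. All function names and parameters (`soft_threshold`, `sinkhorn_log`, `fit_sparse_transform`, `lam`, `eps`) are hypothetical. Under these assumptions, the update of A for a fixed coupling P has a closed form given by the proximal operator of the l1 norm (soft-thresholding).

```python
import numpy as np


def soft_threshold(M, tau):
    """Proximal operator of tau * ||.||_1 (entrywise soft-thresholding)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)


def sinkhorn_log(C, a, b, eps=0.1, n_iter=200):
    """Entropic OT coupling for cost matrix C and marginals a, b (log domain)."""
    f, g = np.zeros_like(a), np.zeros_like(b)
    log_a, log_b = np.log(a), np.log(b)
    for _ in range(n_iter):
        # Alternate dual updates so rows, then columns, match their marginals.
        f = eps * (log_a - np.logaddexp.reduce((g[None, :] - C) / eps, axis=1))
        g = eps * (log_b - np.logaddexp.reduce((f[:, None] - C) / eps, axis=0))
    return np.exp((f[:, None] + g[None, :] - C) / eps)


def fit_sparse_transform(X, Y, lam=0.05, eps=0.1, n_outer=20, seed=0):
    """Alternate between a coupling P and a sparse linear map A (illustrative).

    Assumed objective (an illustration, not taken from the source):
        sum_ij P_ij * (-<A x_i, y_j>) + 0.5 * ||A||_F^2 + lam * ||A||_1
    """
    n, m = X.shape[0], Y.shape[0]
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # uniform weights
    rng = np.random.default_rng(seed)
    # Random init: with centered data, A = 0 would be a degenerate fixed point.
    A = 0.1 * rng.standard_normal((Y.shape[1], X.shape[1]))
    for _ in range(n_outer):
        C = -(X @ A.T) @ Y.T                 # cost between A x_i and y_j
        P = sinkhorn_log(C, a, b, eps=eps)   # transport step for fixed A
        M = Y.T @ P.T @ X                    # cross-correlation under P
        A = soft_threshold(M, lam)           # closed-form prox step for fixed P
    return A, P  # P is the coupling computed before the last update of A


# Toy usage on synthetic, unaligned point clouds in R^5 and R^3.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))
Y = rng.standard_normal((40, 3))
A, P = fit_sparse_transform(X, Y)
```

Increasing `lam` zeroes out more entries of `A`, which is the kind of structure a sparsity-inducing regularizer is meant to enforce. The scheme is non-convex, so the result depends on the initialization, and no convergence guarantee is claimed for this sketch.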