Matching a source probability measure to a target one is often done by instantiating a linear optimal transport (OT) problem, parameterized by a ground cost function that quantifies the discrepancy between points. When both measures live in the same metric space, the ground cost usually defaults to that space's distance. When the measures live in two different spaces, however, choosing that cost in the absence of aligned data is a conundrum. As a result, practitioners often resort to solving a quadratic Gromov-Wasserstein (GW) problem instead. In this work, we exploit a parallel between GW and cost-regularized OT, the regularized minimization of a linear OT objective parameterized by a ground cost. We use this cost-regularized formulation to match measures across two different Euclidean spaces, where the cost is evaluated between transformed source points and target points. We show that several quadratic OT problems fall into this category, and consider enforcing structure (e.g. sparsity) in the linear transform by introducing structure-inducing regularizers. We provide a proximal algorithm to extract such transforms from unaligned data, and demonstrate its applicability to single-cell spatial transcriptomics/multiomics matching tasks.
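To make the idea of extracting a structured linear transform from unaligned data concrete, the following is a minimal sketch and not the algorithm of this work. It assumes an inner-product ground cost $c_M(x, y) = -\langle Mx, y\rangle$, a regularizer $\tfrac{1}{2}\|M\|_F^2 + \lambda\|M\|_1$, an entropic (Sinkhorn) solver for the inner OT problem, and uniform weights on both point clouds. All function names and parameter values (`fit_sparse_transform`, `lam`, `eps`, and so on) are hypothetical. Under these assumptions, for a fixed coupling $P$ the update of $M$ reduces to soft-thresholding the cross-covariance $Y^\top P^\top X$, which is the proximal step of the $\ell_1$ term.

```python
import numpy as np


def soft_threshold(G, lam):
    """Proximal operator of lam * ||.||_1 (entrywise soft-thresholding)."""
    return np.sign(G) * np.maximum(np.abs(G) - lam, 0.0)


def logsumexp(Z, axis):
    """Numerically stable log-sum-exp along one axis."""
    z_max = Z.max(axis=axis, keepdims=True)
    return np.squeeze(z_max, axis=axis) + np.log(np.exp(Z - z_max).sum(axis=axis))


def sinkhorn(C, a, b, eps=0.1, n_iter=100):
    """Entropic OT plan for cost C and marginals a, b (log-domain updates)."""
    log_a, log_b = np.log(a), np.log(b)
    f, g = np.zeros_like(a), np.zeros_like(b)
    for _ in range(n_iter):
        f = -eps * logsumexp((g[None, :] - C) / eps + log_b[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + log_a[:, None], axis=0)
    return np.exp((f[:, None] + g[None, :] - C) / eps + log_a[:, None] + log_b[None, :])


def fit_sparse_transform(X, Y, lam=0.05, eps=0.1, n_outer=10, seed=0):
    """Alternate between an OT plan P and a sparse linear map M (assumed setup).

    X: (n, dx) source points, Y: (m, dy) target points, M: (dy, dx).
    """
    n, m = X.shape[0], Y.shape[0]
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # assumed uniform weights
    rng = np.random.default_rng(seed)
    # Random initialization: M = 0 would give a constant cost and a trivial plan.
    M = rng.standard_normal((Y.shape[1], X.shape[1]))
    for _ in range(n_outer):
        C = -(X @ M.T) @ Y.T           # C[i, j] = -<M x_i, y_j>
        P = sinkhorn(C, a, b, eps)     # plan for the current cost
        G = Y.T @ P.T @ X              # minimizer of the smooth part in M
        M = soft_threshold(G, lam)     # proximal step for the l1 penalty
    return M, P                        # P is the plan computed before the last M update


# Synthetic, unaligned point clouds in R^5 and R^3 (illustrative data only).
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))
Y = rng.standard_normal((40, 3))
M, P = fit_sparse_transform(X, Y)
```

The returned `M` maps source points into the target space, and `P` is a soft matching between the two point clouds. Larger values of `lam` zero out more entries of `M`. The alternating scheme is a heuristic on a non-convex problem, so the result depends on the initialization.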