The Gromov-Wasserstein (GW) distance is frequently used in machine learning to compare distributions across distinct metric spaces. Despite its utility, it remains computationally intensive, especially for large-scale problems. Recently, a novel Wasserstein distance specifically tailored for Gaussian mixture models (GMMs), known as MW (mixture Wasserstein), has been introduced by several authors. In scenarios where data exhibit clustering, this approach simplifies to a small-scale discrete optimal transport problem whose complexity depends solely on the number of Gaussian components in the GMMs. This paper aims to extend MW by introducing new Gromov-type distances. These distances are designed to be isometry-invariant in Euclidean spaces and are applicable for comparing GMMs across spaces of different dimensions. Our first contribution is the Mixture Gromov Wasserstein distance (MGW), which can be viewed as a Gromovized version of MW. This new distance has a straightforward discrete formulation, making it highly efficient for estimating distances between GMMs in practical applications. To facilitate the derivation of a transport plan between GMMs, we present a second distance, the Embedded Wasserstein distance (EW). This distance turns out to be closely related to several recent alternatives to Gromov-Wasserstein. We show that EW can be adapted to derive a distance as well as optimal transport plans between GMMs. We demonstrate the efficiency of these newly proposed distances on medium to large-scale problems, including shape matching and hyperspectral image color transfer.
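To make the "small-scale discrete optimal transport problem" behind MW concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard formulation of MW between two GMMs living in the same space: the cost between components is the closed-form squared 2-Wasserstein distance between Gaussians, and the transport plan couples the two weight vectors. The function names (`psd_sqrt`, `gaussian_w2_squared`, `mixture_wasserstein_squared`) are illustrative choices, and the linear program is solved with SciPy's general-purpose `linprog` rather than a dedicated optimal transport solver.

```python
import numpy as np
from scipy.optimize import linprog


def psd_sqrt(mat):
    """Square root of a symmetric positive semi-definite matrix (via eigh)."""
    vals, vecs = np.linalg.eigh(mat)
    # Clip tiny negative eigenvalues caused by round-off before the sqrt.
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T


def gaussian_w2_squared(m0, cov0, m1, cov1):
    """Closed-form squared 2-Wasserstein distance between two Gaussians."""
    m0, m1 = np.asarray(m0, float), np.asarray(m1, float)
    cov0, cov1 = np.asarray(cov0, float), np.asarray(cov1, float)
    root0 = psd_sqrt(cov0)
    cross = psd_sqrt(root0 @ cov1 @ root0)
    bures = np.trace(cov0 + cov1 - 2.0 * cross)
    return float(np.sum((m0 - m1) ** 2) + max(bures, 0.0))


def mixture_wasserstein_squared(w0, means0, covs0, w1, means1, covs1):
    """Squared MW between two GMMs; returns (value, K x L transport plan).

    Assumption: both mixtures live in the same Euclidean space, and each
    weight vector sums to one.
    """
    K, L = len(w0), len(w1)
    # K x L cost matrix: pairwise squared W2 between Gaussian components.
    cost = np.array([
        [gaussian_w2_squared(means0[k], covs0[k], means1[l], covs1[l])
         for l in range(L)]
        for k in range(K)
    ])
    # Marginal constraints on the plan, flattened row-major (index k*L + l).
    A_eq = np.zeros((K + L, K * L))
    for k in range(K):
        A_eq[k, k * L:(k + 1) * L] = 1.0   # row k sums to w0[k]
    for l in range(L):
        A_eq[K + l, l::L] = 1.0            # column l sums to w1[l]
    b_eq = np.concatenate([np.asarray(w0, float), np.asarray(w1, float)])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(K, L)
```

The linear program has only K x L unknowns, where K and L are the numbers of components, so its size does not depend on the number of data points used to fit the mixtures. The Gromov-type distances introduced in the paper replace this linear objective with one that compares within-mixture distances, which is what allows the two GMMs to live in spaces of different dimensions; that part is not sketched here.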