We combine standard persistent homology with image persistent homology to define a novel way of characterizing shapes and the interactions between them. In particular, we introduce: (1) a mixup barcode, which captures geometric-topological interactions (mixup) between two point sets in arbitrary dimension; (2) two simple summary statistics, total mixup and total percentage mixup, each of which quantifies the complexity of the interactions as a single number; (3) a software tool for working with these constructions. As a proof of concept, we apply this tool to a problem arising in machine learning: we study the disentanglement of different classes in embeddings. The results suggest that topological mixup is a useful method for characterizing interactions in both low- and high-dimensional data. Compared to the typical use of persistent homology, the new tool is sensitive to the geometric locations of the topological features, which is often desirable.
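To make the comparison of standard and image persistent homology concrete, the following is a minimal sketch restricted to degree 0 (connected components) of Vietoris-Rips filtrations. It is not the authors' software, and the abstract does not state the definitions of the mixup barcode or its summary statistics, so the function names `h0_deaths` and `h0_persistence_drop` and the choice of "drop in total persistence" as a summary are illustrative assumptions. The sketch records when components of a point set A die on their own, when they die once a second set B is added, and how much total persistence is lost.

```python
import math
from itertools import combinations


def h0_deaths(points, is_source):
    """Finite death times of degree-0 classes carried by the source points.

    Kruskal-style sweep over the Vietoris-Rips edges of `points` (edge
    length used as the scale). A death is recorded whenever two components
    that both contain a source point merge. With every point marked as a
    source, this is the usual degree-0 barcode; with only A marked inside
    A ∪ B, it gives the deaths of the image of H0(A) in H0(A ∪ B).
    """
    parent = list(range(len(points)))
    has_source = list(is_source)

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue
        if has_source[ri] and has_source[rj]:
            deaths.append(length)
        parent[ri] = rj
        has_source[rj] = has_source[rj] or has_source[ri]
    return deaths


def h0_persistence_drop(A, B):
    """Hypothetical summary: total H0 persistence of A alone, of A inside
    A ∪ B, and the difference between the two."""
    alone = sum(h0_deaths(A, [True] * len(A)))
    inside = sum(h0_deaths(A + B, [True] * len(A) + [False] * len(B)))
    return alone, inside, alone - inside


# A point of B lying between two points of A merges them earlier.
print(h0_persistence_drop([(0.0, 0.0), (4.0, 0.0)], [(2.0, 0.0)]))
```

In this toy example, a point of B placed midway between the two points of A halves the scale at which they connect, whereas a point of B placed far away would leave the deaths unchanged. This dependence on where B sits relative to A is the kind of location sensitivity the abstract describes; higher-degree features would require a full image persistence computation rather than this union-find shortcut.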