We combine standard persistent homology with image persistent homology to define a novel way of characterizing shapes and the interactions between them. Specifically, we introduce: (1) a mixup barcode, which captures geometric-topological interactions (mixup) between two point sets in arbitrary dimension; (2) two simple summary statistics, total mixup and total percentage mixup, each of which quantifies the complexity of these interactions as a single number; (3) a software tool for experimenting with the above. As a proof of concept, we apply this tool to a problem arising in machine learning: we study the disentanglement in embeddings of different classes. The results suggest that topological mixup is a useful method for characterizing interactions in both low- and high-dimensional data. Compared to the typical usage of persistent homology, the new tool is sensitive to the geometric locations of the topological features, which is often desirable.