We combine standard persistent homology with image persistent homology to define a novel way of characterizing shapes and the interactions between them. Specifically, we introduce: (1) a mixup barcode, which captures geometric-topological interactions (mixup) between two point sets in arbitrary dimension; (2) two simple summary statistics, total mixup and total percentage mixup, each of which quantifies the complexity of these interactions as a single number; and (3) a software tool for experimenting with these constructions. As a proof of concept, we apply this tool to a problem arising in machine learning: we study the disentanglement of different classes in embeddings. The results suggest that topological mixup is a useful method for characterizing interactions in both low- and high-dimensional data. Compared to the typical usage of persistent homology, the new tool is sensitive to the geometric locations of the topological features, which is often desirable.