This paper presents a computational framework for the concise encoding of an ensemble of persistence diagrams, in the form of weighted Wasserstein barycenters [100], [102] of a dictionary of atom diagrams. We introduce a multi-scale gradient descent approach for the efficient resolution of the corresponding minimization problem, which interleaves the optimization of the barycenter weights with the optimization of the atom diagrams. Our approach leverages the analytic expressions of the gradients of both sub-problems to ensure fast iterations, and it additionally exploits shared-memory parallelism. Extensive experiments on public ensembles demonstrate the efficiency of our approach, with Wasserstein dictionary computations on the order of minutes for the largest examples. We show the utility of our contributions in two applications. First, we apply Wasserstein dictionaries to data reduction and reliably compress persistence diagrams by concisely representing them with their weights in the dictionary. Second, we present a dimensionality reduction framework based on a Wasserstein dictionary defined with a small number of atoms (typically three) and encode the dictionary as a low-dimensional simplex embedded in a visual space (typically in 2D). In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used to reproduce our results.
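To illustrate the second application, the following is a minimal Python sketch of how barycentric weights over three atoms could be mapped to a point in a 2D triangle. It is not taken from the C++ implementation mentioned above. The function names `triangle_from_distances` and `simplex_to_2d` are hypothetical, and the choice of setting the triangle's edge lengths to pairwise distances between atoms (for instance Wasserstein distances) is an assumption made for illustration only.

```python
import math


def triangle_from_distances(d01, d02, d12):
    """Place three atoms in the plane so that their pairwise Euclidean
    distances equal the given values (assumed here to be distances
    between atom diagrams; this layout rule is an illustrative assumption).
    """
    if min(d01, d02, d12) <= 0:
        raise ValueError("distances must be strictly positive")
    # Atom 0 at the origin, atom 1 on the x-axis, atom 2 from the law of cosines.
    x2 = (d01 ** 2 + d02 ** 2 - d12 ** 2) / (2.0 * d01)
    y2_squared = d02 ** 2 - x2 ** 2
    if y2_squared < 0:
        raise ValueError("distances violate the triangle inequality")
    return [(0.0, 0.0), (d01, 0.0), (x2, math.sqrt(y2_squared))]


def simplex_to_2d(weights, vertices):
    """Map non-negative weights to the convex combination of the vertices.

    Weights are normalized to sum to one, so the returned point always
    lies inside (or on the boundary of) the triangle.
    """
    if len(weights) != len(vertices):
        raise ValueError("need one weight per vertex")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    x = sum(w * vx for w, (vx, _) in zip(weights, vertices)) / total
    y = sum(w * vy for w, (_, vy) in zip(weights, vertices)) / total
    return (x, y)


if __name__ == "__main__":
    # Hypothetical pairwise distances between three atoms.
    atoms_2d = triangle_from_distances(3.0, 4.0, 5.0)
    # Hypothetical weights of one diagram in the dictionary.
    print(simplex_to_2d([0.2, 0.5, 0.3], atoms_2d))
```

Under these assumptions, a diagram whose weight is concentrated on a single atom lands on the corresponding triangle vertex, and diagrams with mixed weights land in the interior, so the weights alone determine the 2D layout.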