Topological Data Analysis (TDA) provides a pipeline for extracting quantitative topological descriptors from structured objects. This makes it possible to define topological loss functions, which assess the extent to which a given object exhibits some topological property. These losses can then be used to perform topological optimization via gradient descent routines. While theoretically sound, topological optimization faces an important challenge: gradients tend to be extremely sparse, in the sense that the loss function typically depends on only a few coordinates of the input object, which makes optimization extremely slow in practice. Focusing on the central case of topological optimization for point clouds, we propose in this work to overcome this limitation using diffeomorphic interpolation, which turns sparse gradients into smooth vector fields defined on the whole space, with quantifiable Lipschitz constants. In particular, we show that our approach combines efficiently with the subsampling techniques routinely used in TDA: the diffeomorphism derived from the gradient computed on a subsample can be used to update the coordinates of the full input object, allowing us to perform topological optimization on point clouds at an unprecedented scale. Finally, we showcase the relevance of our approach for black-box autoencoder (AE) regularization, where we aim to enforce topological priors on the latent spaces associated with fixed, pre-trained, black-box AE models. In this setting, we show that a diffeomorphic flow can be learned once and then re-applied to new data in linear time, whereas vanilla topological optimization has to be re-run from scratch. Moreover, inverting the flow allows us to generate data by sampling the topologically optimized latent space directly, yielding better interpretability of the model.
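To make the idea of turning a sparse gradient into a smooth vector field more concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a Gaussian kernel with a hand-picked bandwidth `sigma`, a small regularization term `jitter`, and a placeholder sparse gradient written by hand instead of one obtained from an actual topological loss. The function name `smooth_vector_field` and all parameters are hypothetical. The sketch also illustrates the subsampling point: the gradient lives on a subsample, but the interpolated field is evaluated on the full point cloud.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Pairwise Gaussian kernel values exp(-|a_i - b_j|^2 / (2 sigma^2))."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def smooth_vector_field(anchors, anchor_grads, queries, sigma=0.5, jitter=1e-10):
    """Evaluate at `queries` a smooth field that interpolates `anchor_grads`.

    The field has the form v(x) = sum_i k(x, anchors[i]) * coeffs[i], with
    coeffs chosen so that v(anchors[i]) matches anchor_grads[i].
    `jitter` is a small stabilizing term (an assumption of this sketch).
    """
    gram = gaussian_kernel(anchors, anchors, sigma)
    coeffs = np.linalg.solve(gram + jitter * np.eye(len(anchors)), anchor_grads)
    return gaussian_kernel(queries, anchors, sigma) @ coeffs


rng = np.random.default_rng(0)
full_cloud = rng.uniform(-1.0, 1.0, size=(200, 2))

# Subsample on which a (hypothetical) topological loss would be evaluated.
sub_idx = rng.choice(len(full_cloud), size=20, replace=False)

# Placeholder sparse gradient: only two points of the subsample are affected.
sub_grad = np.zeros((len(sub_idx), 2))
sub_grad[0] = [1.0, 0.0]
sub_grad[5] = [0.0, -1.0]

# Keep only the points where the gradient is nonzero as interpolation anchors.
support = np.linalg.norm(sub_grad, axis=1) > 0
anchors = full_cloud[sub_idx][support]
anchor_grads = sub_grad[support]

# Extend the sparse gradient to the whole cloud and take one descent step.
field = smooth_vector_field(anchors, anchor_grads, full_cloud, sigma=0.5)
learning_rate = 0.1
updated_cloud = full_cloud - learning_rate * field

moved = int((np.linalg.norm(field, axis=1) > 0).sum())
print(f"points with a nonzero update: {moved} (sparse gradient touched {int(support.sum())})")
```

In this sketch, a plain gradient step would move only the two points carrying a nonzero gradient, whereas the interpolated field yields an update for every point of the full cloud while still agreeing with the original gradient at those two points. The bandwidth `sigma` controls how far the influence of each gradient entry spreads; how to choose it, and how to guarantee that the resulting map is a diffeomorphism, is not addressed here.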