We introduce Hades, an unsupervised algorithm for detecting singularities in data. Because the algorithm relies on a kernel goodness-of-fit test, it is much faster and far more scalable than existing topology-based alternatives. Using tools from differential geometry and optimal transport theory, we prove that Hades correctly detects singularities with high probability when the data sample lies on a transverse intersection of equidimensional manifolds. In computational experiments, Hades recovers singularities in synthetically generated data, branching points in road network data, intersection rings in molecular conformation space, and anomalies in image data.
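To make the idea of a local kernel goodness-of-fit test concrete, the following Python sketch scores a point by how far its rescaled neighborhood is from a flat disk. It is not the authors' implementation. The local PCA projection, the Gaussian kernel and its bandwidth, the uniform-disk reference sample, the biased maximum mean discrepancy (MMD) estimate, and the function names `uniform_ball`, `mmd2`, and `singularity_score` are all assumptions made for this illustration.

```python
import numpy as np


def uniform_ball(n, d, rng):
    """Draw n points uniformly from the unit d-dimensional ball."""
    g = rng.standard_normal((n, d))
    g /= np.linalg.norm(g, axis=1, keepdims=True)   # uniform directions
    r = rng.random(n) ** (1.0 / d)                  # radii for uniform volume
    return g * r[:, None]


def mmd2(x, y, bandwidth=0.5):
    """Biased estimate of the squared MMD under a Gaussian kernel (assumed choice)."""
    def gram(a, b):
        sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2.0 * bandwidth ** 2))
    return float(gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean())


def singularity_score(data, center, radius, d, rng, n_ref=500):
    """Hypothetical score: discrepancy between a neighborhood and a flat d-disk."""
    nbrs = data[np.linalg.norm(data - center, axis=1) <= radius]
    local = (nbrs - center) / radius                # rescale into the unit ball
    # Local PCA: project onto the top-d principal directions of the neighborhood.
    _, _, vt = np.linalg.svd(local - local.mean(axis=0), full_matrices=False)
    proj = local @ vt[:d].T
    ref = uniform_ball(n_ref, d, rng)               # reference: uniform unit d-disk
    return mmd2(proj, ref)


# Toy data: two lines in the plane crossing transversely at the origin.
rng = np.random.default_rng(0)
t = rng.uniform(-2.0, 2.0, size=1000)
s = rng.uniform(-2.0, 2.0, size=1000)
cross = np.vstack([
    np.column_stack([t, np.zeros_like(t)]),         # points on the x-axis
    np.column_stack([np.zeros_like(s), s]),         # points on the y-axis
])

score_singular = singularity_score(cross, np.array([0.0, 0.0]), 0.5, 1, rng)
score_smooth = singularity_score(cross, np.array([1.0, 0.0]), 0.5, 1, rng)
```

On this toy example one would expect `score_singular` to exceed `score_smooth`, since the neighborhood of the crossing point does not look like a single interval after projection. Turning such a score into a decision would require a threshold or a calibrated test, which this sketch leaves out.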