Classifying the topology of closed curves is a central problem in low-dimensional topology, with applications beyond mathematics spanning protein folding, polymer physics and even magnetohydrodynamics. The key question is how to determine whether two embeddings of a closed curve are equivalent under ambient isotopy. Given the striking ability of neural networks to solve complex classification tasks, it is natural to ask whether the knot classification problem can be tackled using Machine Learning (ML). In this paper, we investigate generic shortcut methods employed by ML to solve the knot classification challenge. Specifically, we discover hidden non-topological features in training data generated through Molecular Dynamics simulations of polygonal knots, which ML models use to arrive at positive classification results. We then provide a rigorous foundation for future attempts to tackle the knot classification challenge using ML by developing a publicly available (i) dataset that aims to remove the potential for non-topological feature classification and (ii) code that can generate knot embeddings that faithfully explore a chosen geometric state space with fixed knot topology. We expect that our work will accelerate the development of ML models that can solve complex geometric knot classification challenges.