We specialize techniques from topological data analysis to the problem of characterizing the topological complexity (as defined in the body of the paper) of a multi-class data set. As a by-product, we define a topological classifier that uses an open sub-covering of the data set. This sub-covering can be used to construct a simplicial complex whose topological features (e.g., Betti numbers) provide information about the classification problem. We use these topological constructs to study the impact of topological complexity on learning in feedforward deep neural networks (DNNs). We hypothesize that topological complexity is negatively correlated with the ability of a fully connected feedforward DNN to learn to classify data correctly. We evaluate our topological classification algorithm on multiple constructed and open-source data sets. We also validate our hypothesis regarding the relationship between topological complexity and learning in DNNs on multiple data sets.