Understanding how real data is distributed in high-dimensional spaces is key to many tasks in machine learning. We aim to provide a natural geometric structure on the space of data by employing a ReLU neural network trained as a classifier. Through the Data Information Matrix (DIM), a variation of the Fisher information matrix, the model discerns a singular foliation structure on the space of data. We show that the singular points of this foliation are contained in a measure-zero set, and that a local regular foliation exists almost everywhere. Experiments show that the data is correlated with the leaves of this foliation. Moreover, we show the potential of our approach for knowledge transfer by analyzing the spectrum of the DIM to measure distances between datasets.
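To make the DIM concrete, the following is a minimal sketch, not the authors' implementation. It assumes the DIM at a point x is the Fisher-style matrix D(x) = sum_y p(y|x) g_y g_y^T, where g_y is the gradient of log p(y|x) with respect to the input x rather than the weights; the abstract does not state the definition, so this form is an assumption. The one-hidden-layer architecture and the function names are likewise hypothetical and chosen only for illustration.

```python
import math


def relu_net_logits(x, W1, b1, W2, b2):
    """One hidden ReLU layer (illustrative architecture).

    Returns the logits and the 0/1 activation mask of the hidden units.
    """
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    mask = [1.0 if h > 0 else 0.0 for h in pre]
    hidden = [max(h, 0.0) for h in pre]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(W2, b2)]
    return logits, mask


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def data_information_matrix(x, W1, b1, W2, b2):
    """Assumed DIM: D(x) = sum_y p_y * g_y g_y^T with g_y = grad_x log p_y."""
    logits, mask = relu_net_logits(x, W1, b1, W2, b2)
    p = softmax(logits)
    n, c, h = len(x), len(p), len(mask)

    # Away from the ReLU kinks the network is locally affine, so the
    # Jacobian of the logits w.r.t. x is W2 * diag(mask) * W1.
    J = [[sum(W2[k][j] * mask[j] * W1[j][i] for j in range(h))
          for i in range(n)] for k in range(c)]

    # grad_x log softmax_y = J[y] - sum_k p_k * J[k]
    mean = [sum(p[k] * J[k][i] for k in range(c)) for i in range(n)]

    D = [[0.0] * n for _ in range(n)]
    for y in range(c):
        g = [J[y][i] - mean[i] for i in range(n)]
        for i in range(n):
            for j in range(n):
                D[i][j] += p[y] * g[i] * g[j]
    return D


if __name__ == "__main__":
    # Arbitrary illustrative weights: 3 inputs, 4 hidden units, 2 classes.
    W1 = [[1.0, -2.0, 0.5], [0.3, 0.8, -1.0], [-0.7, 0.2, 0.4], [1.5, 1.0, 1.0]]
    b1 = [0.1, -0.2, 0.0, 0.3]
    W2 = [[0.5, -1.0, 0.25, 0.75], [-0.5, 0.4, 1.0, -0.2]]
    b2 = [0.0, 0.1]
    for row in data_information_matrix([0.2, -0.4, 0.9], W1, b1, W2, b2):
        print(row)
```

Under this assumed definition, the probability-weighted gradients sum to zero, so for C classes the matrix has rank at most C - 1. It is therefore highly degenerate whenever the input dimension exceeds the number of classes, which is consistent with the abstract's picture of low-dimensional directions of variation and leaves through each regular point.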