How can we tell whether two neural networks are using the same internal processes for a particular computation? This question is pertinent to multiple subfields of both neuroscience and machine learning, including neuroAI, mechanistic interpretability, and brain-machine interfaces. Standard approaches for comparing neural networks focus on the spatial geometry of latent states. Yet in recurrent networks, computations are implemented at the level of neural dynamics, which do not have a simple one-to-one mapping with geometry. To bridge this gap, we introduce a novel similarity metric that compares two systems at the level of their dynamics. Our method has two components. First, using recent advances in data-driven dynamical systems theory, we learn a high-dimensional linear system that accurately captures core features of the original nonlinear dynamics. Second, we compare these linear approximations via a novel extension of Procrustes Analysis that accounts for how vector fields change under orthogonal transformation. Via four case studies, we demonstrate that our method effectively identifies and distinguishes dynamic structure in recurrent neural networks (RNNs), whereas geometric methods fall short. We additionally show that our method can distinguish learning rules in an unsupervised manner. Our method therefore opens the door to novel data-driven analyses of the temporal structure of neural computation, and to more rigorous testing of RNNs as models of the brain.
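The two components described above can be illustrated with a toy sketch. It is not the authors' implementation and rests on several assumptions. The systems are already linear and two-dimensional, so no high-dimensional embedding is learned. The linear operator is fit by ordinary least squares. The comparison is assumed to take the form of minimizing the Frobenius norm of A - C B C^T over orthogonal matrices C, which follows from the fact that a linear vector field x -> Bx becomes C B C^T under the change of coordinates x -> Cx. The minimization is done by brute-force grid search over 2-D rotations and reflections. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def frobenius(a, b):
    """Frobenius norm of a - b (the purely 'geometric' comparison)."""
    return math.sqrt(sum((a[i][j] - b[i][j]) ** 2
                         for i in range(2) for j in range(2)))

def simulate(a, x0, steps):
    """Iterate x[t+1] = a x[t] and return the list of visited states."""
    traj = [x0]
    for _ in range(steps):
        x = traj[-1]
        traj.append((a[0][0] * x[0] + a[0][1] * x[1],
                     a[1][0] * x[0] + a[1][1] * x[1]))
    return traj

def fit_linear_operator(traj):
    """Step 1 (simplified): least-squares fit of A in x[t+1] ~ A x[t]."""
    xs, ys = traj[:-1], traj[1:]
    # Normal equations: A = (Y X^T)(X X^T)^{-1}
    g = [[sum(x[i] * x[j] for x in xs) for j in range(2)] for i in range(2)]
    h = [[sum(y[i] * x[j] for x, y in zip(xs, ys)) for j in range(2)]
         for i in range(2)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    return matmul(h, g_inv)

REFLECT = [[1.0, 0.0], [0.0, -1.0]]

def dynamical_distance(a, b, n_grid=3600):
    """Step 2 (simplified): min over orthogonal C of ||a - C b C^T||_F.

    Assumption: a grid search over O(2) stands in for a real optimizer.
    Conjugation by a rotation has period pi, so [0, pi) suffices.
    """
    best = float("inf")
    for k in range(n_grid):
        rot = rotation(math.pi * k / n_grid)
        for c in (rot, matmul(rot, REFLECT)):  # rotations and reflections
            conj = matmul(matmul(c, b), transpose(c))
            best = min(best, frobenius(a, conj))
    return best

# Two systems with the same dynamics in rotated coordinates, plus a third
# system whose dynamics genuinely differ.
A = [[0.9, 0.5], [-0.2, 0.8]]          # stable spiral
Q = rotation(math.pi / 6)
B = matmul(matmul(Q, A), transpose(Q)) # same dynamics, rotated coordinates
D = [[0.5, 0.0], [0.0, 0.9]]           # different dynamics (no rotation)

A_fit = fit_linear_operator(simulate(A, (1.0, 0.0), 50))
B_fit = fit_linear_operator(simulate(B, (0.3, -1.0), 50))
D_fit = fit_linear_operator(simulate(D, (1.0, 1.0), 50))

print("naive distance A vs B:    ", frobenius(A_fit, B_fit))
print("dynamical distance A vs B:", dynamical_distance(A_fit, B_fit))
print("dynamical distance A vs D:", dynamical_distance(A_fit, D_fit))
```

In this toy setting the entrywise comparison of the fitted operators treats the two rotated copies of one system as different, while the distance that minimizes over orthogonal changes of coordinates does not. A realistic version would need a higher-dimensional embedding of nonlinear data and a proper optimizer over the orthogonal group, neither of which is shown here.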