Methods for analyzing representations in neural systems have become a popular tool in both neuroscience and mechanistic interpretability. Measures that compare how similar neural activations are across conditions, architectures, and species give us a scalable way to learn how information is transformed within different neural networks. Countering this trend, recent investigations have revealed that some metrics respond to spurious signals and hence give misleading results. To identify the most reliable metrics and to understand how such measures could be improved, it will be important to establish specific test cases that can serve as benchmarks. Here we propose that the phenomenon of compositional learning in recurrent neural networks (RNNs) allows us to build a test case for dynamical representation alignment metrics. Implementing this case, we show that it enables us to test whether metrics can identify representations that develop gradually throughout learning, and to probe whether the representations identified by metrics are relevant to the computations executed by networks. By building both an attractor-based and an RNN-based test case, we show that the novel Dynamical Similarity Analysis (DSA) is more robust to noise and identifies behaviorally relevant representations more reliably than prior metrics (Procrustes, CKA). We also show how test cases can be used beyond evaluating metrics to study new architectures. Specifically, applying DSA to modern (Mamba) state space models suggests that, in contrast to RNNs, these models may not exhibit changes in their recurrent dynamics due to their expressiveness. Overall, by developing test cases, we demonstrate DSA's exceptional ability to detect compositional dynamical motifs, thereby enhancing our understanding of how computations unfold in RNNs.
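To illustrate the kind of representational similarity metric compared above, here is a minimal sketch of linear CKA (one of the baseline metrics mentioned), written in Python. The function name and matrix shapes are illustrative choices of ours, not taken from the paper's codebase; the formula is the standard centered linear-CKA definition.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation matrices.

    X, Y: arrays of shape (n_samples, n_units). The number of units may
    differ between X and Y; only n_samples must match.
    Returns a similarity in [0, 1], invariant to orthogonal transforms
    and isotropic scaling of either representation.
    """
    # Center each unit's activations over samples.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by
    # the norms of the two self-covariances.
    num = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return num / den
```

Because linear CKA is blind to orthogonal rotations of state space, `linear_cka(X, X @ Q)` equals 1 for any orthogonal `Q`; metrics like DSA instead compare the dynamics generating the trajectories, which is what the test cases in the paper are designed to probe.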