Within the framework of deep learning, we show that the singular value decomposition (SVD) of the weight matrix emerges as a tool for interpreting neural networks (NNs) when combined with the descrambling transformation, a recently developed technique for addressing interpretability in noisy parameter estimation neural networks \cite{amey2021neural}. By considering the averaging effect of the data passed to the descrambling minimization problem, we show that, in the large-data limit, descrambling transformations can be expressed in terms of the SVD of the NN weights and the input autocorrelation matrix. Using this fact, we show that, within the class of noisy parameter estimation problems, the SVD may be the structure through which trained networks encode a signal model. We substantiate our theoretical findings with empirical evidence from both linear and non-linear signal models. Our results also illuminate the connections between a mathematical theory of semantic development \cite{saxe2019mathematical} and neural network interpretability.