Recurrent Neural Networks (RNNs) are general-purpose parallel-sequential computers. The program of an RNN is its weight matrix. How can we learn useful representations of RNN weights that facilitate RNN analysis as well as downstream tasks? While the mechanistic approach directly looks at some RNN's weights to predict its behavior, the functionalist approach analyzes its overall functionality, specifically its input-output mapping. We consider several mechanistic approaches for RNN weights and adapt the permutation equivariant Deep Weight Space layer for RNNs. Our two novel functionalist approaches extract information from RNN weights by 'interrogating' the RNN through probing inputs. We develop a theoretical framework that demonstrates conditions under which the functionalist approach can generate rich representations that help determine RNN behavior. We release the first two 'model zoo' datasets for RNN weight representation learning. One consists of generative models of a class of formal languages, and the other of classifiers of sequentially processed MNIST digits. With the help of an emulation-based self-supervised learning technique, we compare and evaluate the different RNN weight encoding techniques on multiple downstream applications. On the most challenging one, namely predicting which exact task the RNN was trained on, functionalist approaches show clear superiority.
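As a concrete illustration of the functionalist idea, the following minimal sketch (PyTorch; all names, shapes, and the fixed-probe design are illustrative assumptions, not the paper's implementation) encodes an RNN purely through its input-output behavior: it runs a fixed batch of probing sequences through the network and concatenates the responses into one feature vector.

    import torch
    import torch.nn as nn

    def probing_representation(rnn: nn.RNN, readout: nn.Linear,
                               probes: torch.Tensor) -> torch.Tensor:
        # Functionalist encoding: characterize the RNN by its responses
        # to fixed probing inputs rather than by its raw weights.
        # probes: (num_probes, seq_len, input_dim); rnn is assumed to be
        # built with batch_first=True.
        with torch.no_grad():
            hidden, _ = rnn(probes)             # (num_probes, seq_len, hidden_dim)
            responses = readout(hidden[:, -1])  # per-probe output logits
        return responses.flatten()              # flat functionalist embedding

    # Hypothetical usage:
    rnn = nn.RNN(input_size=8, hidden_size=32, batch_first=True)
    readout = nn.Linear(32, 10)
    probes = torch.randn(16, 20, 8)  # 16 probing sequences of length 20
    z = probing_representation(rnn, readout, probes)  # shape: (160,)

Note the design consequence: two RNNs with identical input-output behavior, e.g. differing only by a permutation of hidden units, map to the same embedding, an invariance that the mechanistic Deep Weight Space layer must instead enforce structurally on the weights themselves.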