Can we identify the weights of a neural network by probing its input-output mapping? At first glance, this problem seems to have many solutions because of permutation, overparameterisation and activation function symmetries. Yet, we show that the incoming weight vector of each neuron is identifiable up to sign or scaling, depending on the activation function. Our novel method `Expand-and-Cluster' can identify the layer sizes and weights of a target network for all commonly used activation functions. Expand-and-Cluster consists of two phases: (i) to relax the non-convex optimisation problem, we train multiple overparameterised student networks to best imitate the target function; (ii) to reverse-engineer the target network's weights, we employ an ad hoc clustering procedure that reveals the learnt weight vectors shared among the students -- these correspond to the target weight vectors. We demonstrate successful recovery of the weights and sizes of trained shallow and deep networks with less than 10\% overhead in layer size, and describe an `ease-of-identifiability' axis by analysing 150 synthetic problems of variable difficulty.
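The clustering phase described in the abstract can be illustrated with a toy sketch: we simulate several overparameterised students whose hidden weight vectors contain noisy, sign-flipped copies of the target vectors (the sign symmetry of odd activations) plus superfluous neurons, then group vectors that agree up to sign and keep only clusters shared by every student. This is an illustrative sketch, not the paper's implementation; the sign-invariant distance, the threshold, and all names here are assumptions.

```python
# Toy sketch of the clustering phase (assumed setup, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)

d, m, n_students = 5, 3, 4          # input dim, target layer size, number of students
targets = rng.normal(size=(m, d))   # ground-truth incoming weight vectors

# Each student learns noisy copies of the target vectors, each with an
# arbitrary sign (odd-activation symmetry), plus redundant extra neurons.
students = []
for _ in range(n_students):
    signs = rng.choice([-1.0, 1.0], size=(m, 1))
    copies = signs * targets + 0.01 * rng.normal(size=(m, d))
    extras = rng.normal(size=(2, d))          # superfluous neurons, not shared
    students.append(np.vstack([copies, extras]))
W = np.vstack(students)

def sign_invariant_dist(a, b):
    """Distance treating w and -w as the same neuron."""
    return min(np.linalg.norm(a - b), np.linalg.norm(a + b))

# Greedy clustering; keep only clusters populated by (at least) every student,
# since only target neurons reappear in all students.
unassigned = list(range(len(W)))
recovered = []
while unassigned:
    i = unassigned.pop(0)
    members = [i] + [j for j in unassigned
                     if sign_invariant_dist(W[j], W[i]) < 0.1]
    unassigned = [j for j in unassigned if j not in members]
    if len(members) >= n_students:
        # Align signs to the first member before averaging the cluster.
        aligned = [W[j] if np.linalg.norm(W[j] - W[i]) <= np.linalg.norm(W[j] + W[i])
                   else -W[j]
                   for j in members]
        recovered.append(np.mean(aligned, axis=0))

print(len(recovered))  # clusters shared by all students, i.e. recovered neurons
```

With well-separated random targets and small training noise, the shared clusters' centroids approximate the target weight vectors up to sign, while the redundant neurons form small clusters that are discarded.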