To better understand feature learning in neural networks, we propose a framework for analyzing linear models in tangent feature space where the features are allowed to be transformed during training. We consider linear transformations of features, resulting in a joint optimization over parameters and transformations with a bilinear interpolation constraint. We show that this optimization problem has an equivalent linearly constrained optimization with structured regularization that encourages approximately low-rank solutions. Specializing to neural network structure, we gain insights into how the features, and thus the kernel function, change, providing additional nuance to the phenomenon of kernel alignment when the target function is poorly represented using tangent features. We verify our theoretical observations in the kernel alignment of real neural networks.
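As a rough illustration of the kernel alignment quantity mentioned above, the following sketch computes the alignment between a feature kernel and the target kernel y yᵀ, before and after a linear transformation of the features. Everything here is an assumption made for illustration: the random matrix `Phi` stands in for tangent features, the function name `kernel_alignment` is hypothetical, and the rank-one transformation `M` is chosen by hand from a least-squares fit rather than obtained from the joint optimization studied in this work.

```python
import numpy as np

def kernel_alignment(K, y):
    """Alignment <K, y y^T>_F / (||K||_F ||y y^T||_F) between K and the target kernel."""
    y = np.asarray(y, dtype=float).reshape(-1)
    target = np.outer(y, y)
    num = np.sum(K * target)  # Frobenius inner product
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm
    return num / (np.linalg.norm(K) * np.linalg.norm(target))

rng = np.random.default_rng(0)
n, p = 20, 5
Phi = rng.normal(size=(n, p))  # assumed stand-in for tangent features
y = rng.normal(size=n)         # targets, generally not in the span of Phi

# Kernel induced by the untransformed features
K0 = Phi @ Phi.T

# Hand-picked rank-one transformation (an illustrative assumption):
# project the features onto the least-squares direction for y.
beta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
u = beta / np.linalg.norm(beta)
M = np.outer(u, u)

# Kernel induced by the linearly transformed features Phi @ M
Phi_M = Phi @ M
K1 = Phi_M @ Phi_M.T

a0 = kernel_alignment(K0, y)
a1 = kernel_alignment(K1, y)
print(f"alignment before transformation: {a0:.3f}")
print(f"alignment after transformation:  {a1:.3f}")
```

In this toy setting the low-rank transformation concentrates the kernel on the direction most useful for fitting the targets, so the alignment after the transformation is at least as large as before. It is meant only to make the notion of alignment under a linear feature transformation concrete, not to reproduce the structured regularization derived in the paper.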