Neural networks are known to develop latent representations that are $aligned$, namely structurally similar across networks trained with different architectures, training protocols, or training datasets. We study this phenomenon in a controlled setting, where we train an ensemble of networks on regression and classification tasks using training sets perturbed by independent realizations of a noise process. We show that the signal-to-noise ratio (SNR) and the training sample size influence the alignment in qualitatively similar ways in networks trained on real-world datasets and in an extremely simple $linear$ network with a single hidden layer, for which the alignment can be estimated analytically. Across linear and nonlinear networks, regression and classification tasks, and both synthetic and real-world data, we consistently observe that alignment varies monotonically with SNR but non-monotonically with training sample size. In particular, the alignment is minimized near the interpolation threshold, and stronger alignment does not necessarily correspond to lower generalization error. These findings reveal a non-trivial dependence of alignment on data quality and quantity that is decoupled from generalization performance.
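The controlled setting described above can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: alignment is measured with linear CKA, the noise process is additive Gaussian label noise on a shared set of inputs, the network is a one-hidden-layer linear model trained by plain gradient descent, and the names `linear_cka`, `train_linear_net`, and `ensemble_alignment` are hypothetical.

```python
import numpy as np


def linear_cka(h1, h2):
    """Linear CKA between two representation matrices of shape (samples, features).

    Assumption: CKA stands in for whatever alignment measure the paper uses.
    """
    h1 = h1 - h1.mean(axis=0)  # center each feature over the samples
    h2 = h2 - h2.mean(axis=0)
    cross = np.linalg.norm(h2.T @ h1, "fro") ** 2
    norm1 = np.linalg.norm(h1.T @ h1, "fro")
    norm2 = np.linalg.norm(h2.T @ h2, "fro")
    return cross / (norm1 * norm2)


def train_linear_net(x, y, hidden=5, lr=0.01, steps=3000, seed=0):
    """Fit y ~ a^T W x by gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w = 0.1 * rng.standard_normal((hidden, d))  # hidden-layer weights
    a = 0.1 * rng.standard_normal(hidden)       # readout weights
    for _ in range(steps):
        residual = x @ w.T @ a - y
        g = x.T @ residual / n   # gradient w.r.t. the effective map W^T a
        grad_a = w @ g
        grad_w = np.outer(a, g)
        a -= lr * grad_a
        w -= lr * grad_w
    return w, a


def ensemble_alignment(n_train, snr, n_nets=4, dim=10, seed=0):
    """Mean pairwise linear CKA of hidden representations across an ensemble.

    Each network sees the same inputs but an independent noise realization
    on the targets (an assumed form of the perturbation).
    """
    rng = np.random.default_rng(seed)
    w_true = rng.standard_normal(dim) / np.sqrt(dim)  # hypothetical teacher
    x_train = rng.standard_normal((n_train, dim))
    x_probe = rng.standard_normal((200, dim))         # held-out probe inputs
    clean = x_train @ w_true
    noise_std = np.sqrt(clean.var() / snr)            # SNR = signal var / noise var
    reps = []
    for k in range(n_nets):
        y = clean + noise_std * rng.standard_normal(n_train)
        w, _ = train_linear_net(x_train, y, seed=seed + 1 + k)
        reps.append(x_probe @ w.T)                    # hidden activations
    scores = [
        linear_cka(reps[i], reps[j])
        for i in range(n_nets)
        for j in range(i + 1, n_nets)
    ]
    return float(np.mean(scores))
```

Calling `ensemble_alignment(n_train=50, snr=4.0)` returns a single alignment score; sweeping `snr` or `n_train` gives a rough way to explore the trends discussed above. The fixed step size is not tuned, and may need to be reduced when `n_train` is close to `dim`, where the fitted weights can become large.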