Weak-to-strong generalization refers to the phenomenon in which a stronger model trained under the supervision of a weaker one can outperform its teacher. While prior studies aim to explain this effect, most theoretical insights are limited to abstract frameworks or linear/random feature models. In this paper, we provide a formal analysis of weak-to-strong generalization from a linear CNN (weak) to a two-layer ReLU CNN (strong). We consider structured data composed of label-dependent signals of varying difficulty and label-independent noise, and analyze the gradient descent dynamics of the strong model trained on data labeled by the pretrained weak model. Our analysis identifies two regimes, data-scarce and data-abundant, based on the signal-to-noise characteristics of the dataset, and reveals distinct mechanisms of weak-to-strong generalization. In the data-scarce regime, generalization succeeds via benign overfitting or fails via harmful overfitting, depending on the amount of data, and we characterize the transition boundary. In the data-abundant regime, generalization emerges in the early phase of training through label correction, but overtraining can subsequently degrade performance.
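To make the training pipeline above concrete, here is a minimal PyTorch sketch of the setup: a linear CNN teacher is pretrained on ground-truth labels, and a two-layer ReLU CNN student is then trained by gradient descent on data pseudo-labeled by the teacher. This is an illustration under simplifying assumptions, not the paper's construction: the data uses a single signal direction rather than signals of varying difficulty, and all dimensions, sample sizes, and step counts are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
d, n_weak, n_strong = 64, 200, 2000  # hypothetical sizes, not the paper's

# A single, fixed signal direction (the paper uses signals of varying
# difficulty; one suffices to illustrate the pipeline).
signal = torch.randn(d)
signal /= signal.norm()

def make_data(n, snr=1.0):
    """Each sample has a label-dependent signal patch and a noise patch."""
    y = torch.randint(0, 2, (n,)).float() * 2 - 1   # labels in {-1, +1}
    sig = snr * y[:, None] * signal                 # label-dependent signal patch
    noise = torch.randn(n, d)                       # label-independent noise patch
    return torch.stack([sig, noise], dim=1), y      # shape (n, 2, d)

class LinearCNN(nn.Module):
    """Weak teacher: one linear filter applied to each patch, then summed."""
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(0.01 * torch.randn(d))
    def forward(self, x):                           # x: (n, 2, d)
        return (x @ self.w).sum(dim=1)

class ReLUCNN(nn.Module):
    """Strong student: m ReLU filters per patch, pooled, fixed second layer."""
    def __init__(self, m=20):
        super().__init__()
        self.W = nn.Parameter(0.01 * torch.randn(m, d))
        self.a = torch.cat([torch.ones(m // 2), -torch.ones(m - m // 2)]) / m
    def forward(self, x):                           # x: (n, 2, d)
        return torch.relu(x @ self.W.T).sum(dim=1) @ self.a

def train(model, x, y, steps=500, lr=0.1):
    """Full-batch gradient descent on the logistic loss log(1 + e^{-y f(x)})."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        loss = F.softplus(-y * model(x)).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return model

# 1) Pretrain the weak model on ground-truth labels.
x_w, y_w = make_data(n_weak)
weak = train(LinearCNN(), x_w, y_w)

# 2) Train the strong model on fresh data labeled by the weak model.
x_s, _ = make_data(n_strong)
strong = train(ReLUCNN(), x_s, torch.sign(weak(x_s)).detach())

# 3) Weak-to-strong generalization: the student beats its teacher on test data.
x_t, y_t = make_data(5000)
with torch.no_grad():
    for name, m in [("weak teacher", weak), ("strong student", strong)]:
        print(name, (torch.sign(m(x_t)) == y_t).float().mean().item())
```

In this toy setting, the number of gradient descent steps given to the student plays the role the abstract describes: too few samples relative to noise leads to memorization of noise patches, while with ample pseudo-labeled data the student can correct a fraction of the teacher's label errors early in training.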