In the era of large language and image generation models, "model collapse" refers to the phenomenon whereby a model trained recursively on data generated by previous generations of itself degrades over time, until it eventually becomes useless, i.e., the model collapses. In this work, we study this phenomenon in the setting of high-dimensional regression and obtain analytic formulae that quantitatively characterize it in a broad range of regimes. In the special case of polynomially decaying spectral and source conditions, we obtain modified scaling laws that exhibit new crossover phenomena from fast to slow rates. We also propose a simple strategy based on adaptive regularization to mitigate model collapse. Our theoretical results are validated by experiments.
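The recursive-training loop described above can be illustrated with a minimal toy simulation (not the paper's actual setup or constants): a ridge regression model is refit at each "generation" on labels produced by the previous generation's model plus fresh noise, and its parameter error relative to the ground truth is tracked.

```python
import numpy as np

# Toy sketch of model collapse in linear regression; all dimensions,
# noise levels, and the ridge penalty below are illustrative choices.
rng = np.random.default_rng(0)
d, n, sigma, lam = 50, 200, 0.5, 1e-3

w_true = rng.normal(size=d) / np.sqrt(d)  # ground-truth regressor

def ridge_fit(X, y, lam):
    # Closed-form ridge estimator: (X^T X + lam I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

w = w_true.copy()  # generation 0 labels come from the ground truth
errors = []
for gen in range(10):
    X = rng.normal(size=(n, d))
    y = X @ w + sigma * rng.normal(size=n)    # labels from current model
    w = ridge_fit(X, y, lam)                  # train the next generation
    errors.append(np.sum((w - w_true) ** 2))  # error w.r.t. ground truth
```

Because each generation's labels carry the previous generation's estimation error plus fresh noise, the error against the true regressor accumulates across generations rather than staying flat, which is the qualitative signature of model collapse.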