Double descent is a counterintuitive phenomenon in machine learning that has been observed across a variety of models and tasks. Although theoretical explanations have been proposed for specific settings, there is still no accepted theory that accounts for its occurrence in deep learning. In this study, we revisit double descent and demonstrate that its occurrence is strongly influenced by the presence of noisy data. Through a comprehensive analysis of the feature space of learned representations, we show that double descent arises in imperfect models trained with noisy data. We argue that double descent is a consequence of the model first fitting the noisy data until interpolation and then, through the implicit regularization introduced by over-parameterization, acquiring the capability to separate the information from the noise.
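As a rough illustration of the shape of the phenomenon, the sketch below fits a toy random-features regression to noisy labels and reports test error as the number of features grows. It is not the experimental setup of this study, which concerns deep networks and their learned feature spaces. The random ReLU features, the minimum-norm least-squares fit, the linear target, and all names and sizes (`error_curve`, `mean_test_errors`, `n_train`, `noise_std`, and so on) are assumptions chosen only for illustration.

```python
import numpy as np


def relu_features(X, W):
    """Fixed random ReLU features: max(XW, 0)."""
    return np.maximum(X @ W, 0.0)


def error_curve(noise_std, widths, n_train=40, n_test=500, d=10, seed=0):
    """Test MSE of a minimum-norm least-squares fit for each feature count.

    Hypothetical toy setup: a linear target observed with Gaussian label
    noise, fitted on random ReLU features (an imperfect model for this target).
    """
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d) / np.sqrt(d)
    X_train = rng.normal(size=(n_train, d))
    X_test = rng.normal(size=(n_test, d))
    # Only the training labels are noisy; test labels are clean.
    y_train = X_train @ w_true + noise_std * rng.normal(size=n_train)
    y_test = X_test @ w_true

    # Draw the widest feature map once and reuse nested prefixes of it.
    W_full = rng.normal(size=(d, max(widths))) / np.sqrt(d)
    errors = []
    for p in widths:
        Phi_train = relu_features(X_train, W_full[:, :p])
        Phi_test = relu_features(X_test, W_full[:, :p])
        # pinv gives the least-squares solution when p < n_train and the
        # minimum-norm interpolating solution when p >= n_train.
        coef = np.linalg.pinv(Phi_train) @ y_train
        errors.append(float(np.mean((Phi_test @ coef - y_test) ** 2)))
    return errors


def mean_test_errors(noise_std, widths, n_seeds=10, **kwargs):
    """Average the error curve over several random seeds."""
    curves = [error_curve(noise_std, widths, seed=s, **kwargs)
              for s in range(n_seeds)]
    return np.mean(np.array(curves), axis=0).tolist()


if __name__ == "__main__":
    widths = [5, 10, 20, 40, 80, 200, 1000]
    for p, err in zip(widths, mean_test_errors(0.5, widths)):
        print(f"features={p:5d}  test MSE={err:.3f}")
```

In this toy setting, the interpolation threshold sits where the number of features equals the number of training points (40 here). Test error would be expected to peak near that point and fall again as features are added. Varying `noise_std` is one way to probe how the height of the peak depends on label noise.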