Modern neural networks exhibit a striking property: basins of attraction in the loss landscape are often connected by low-loss paths, yet optimization dynamics generally remain confined to a single convex basin and rarely explore intermediate points. We resolve this paradox by identifying entropic barriers arising from the interplay between curvature variations along these paths and noise in optimization dynamics. Empirically, we find that curvature systematically rises away from minima, producing effective forces that bias noisy dynamics back toward the endpoints, even when the loss remains nearly flat. These barriers persist longer than energetic barriers, shaping the late-time localization of solutions in parameter space. Our results highlight the role of curvature-induced entropic forces in governing both connectivity and confinement in deep learning landscapes.
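The entropic mechanism described in the abstract can be illustrated with a minimal toy simulation. This is a sketch under assumed dynamics, not the paper's actual experiment: the loss `L(x, y) = 0.5 * k(x) * y**2` is taken to be exactly flat along a path coordinate `x` (at `y = 0`), but a hypothetical transverse curvature profile `k(x)` peaks mid-path. Overdamped Langevin dynamics at temperature `T` then avoid the high-curvature midpoint even though there is no loss barrier, because the marginal density of `x` scales as `1/sqrt(k(x))`.

```python
import numpy as np

# Hypothetical toy landscape (an illustration, not the paper's model):
# loss is exactly flat along the path coordinate x at y = 0, but the
# transverse curvature k(x) peaks mid-path (~10 at x=0, ~1 at the endpoints).
def k(x):
    return 1.0 + 9.0 * np.exp(-x**2 / 0.08)

def dk_dx(x):
    return 9.0 * np.exp(-x**2 / 0.08) * (-2.0 * x / 0.08)

rng = np.random.default_rng(0)
T, dt = 0.5, 2e-3              # noise temperature and step size (illustrative values)
x, y = 1.0, 0.0                # start at one "endpoint" of the path
burn, steps = 20_000, 200_000
xs = np.empty(steps)

for i in range(burn + steps):
    # overdamped Langevin dynamics on L(x, y) = 0.5 * k(x) * y**2
    fx = -0.5 * dk_dx(x) * y**2
    fy = -k(x) * y
    x += fx * dt + np.sqrt(2 * T * dt) * rng.standard_normal()
    y += fy * dt + np.sqrt(2 * T * dt) * rng.standard_normal()
    if abs(x) > 1.2:           # reflecting walls confine the path coordinate
        x = np.sign(x) * (2.4 - abs(x))
    if i >= burn:
        xs[i - burn] = x

# Despite zero loss barrier along y = 0, the stationary marginal of x goes as
# 1/sqrt(k(x)): the high-curvature midpoint is entropically avoided, so the
# dynamics localize near the endpoints.
frac_mid = np.mean(np.abs(xs) < 0.2)    # fraction of time near the midpoint
frac_ends = np.mean(np.abs(xs) > 0.8)   # fraction of time near the endpoints
print(frac_mid, frac_ends)
```

Marginalizing over the fast transverse coordinate gives an effective potential `(T/2) * log k(x)`, so the drift on `x` is a curvature-induced entropic force even though the loss along the path is identically zero.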