Despite extensive study, the significance of sharpness -- the trace of the loss Hessian at local minima -- remains unclear. We investigate an alternative perspective: how sharpness relates to the geometric structure of neural representations, specifically representation compression, defined as how strongly neural activations concentrate under local input perturbations. We introduce three measures -- Local Volumetric Ratio (LVR), Maximum Local Sensitivity (MLS), and Local Dimensionality -- and derive upper bounds showing that these measures are mathematically constrained by sharpness: flatter minima necessarily limit compression. We extend these bounds to reparametrization-invariant sharpness and introduce network-wide variants (NMLS, NVR) that provide tighter, more stable bounds than prior single-layer analyses. Empirically, we observe consistent positive correlations between sharpness and compression across feedforward, convolutional, and transformer architectures. Our results suggest that sharpness fundamentally quantifies representation compression, offering a principled resolution to contradictory findings on the sharpness-generalization relationship.
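The two central quantities can be illustrated numerically. The sketch below is a minimal toy example, not the paper's method: it takes sharpness as the trace of the loss Hessian (estimated with the standard Hutchinson trace estimator) and assumes MLS is the largest singular value of a representation's input Jacobian; for the linear model used here both are available in closed form, which makes the estimator easy to check.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear regression: loss L(w) = mean_i (w @ x_i - y_i)^2.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true
w = w_true.copy()  # sit at the minimum, where the Hessian defines sharpness

# Sharpness as the trace of the loss Hessian. For this quadratic loss the
# Hessian is exactly H = 2 X^T X / n, so we can compute tr(H) directly.
H = 2.0 * X.T @ X / n
sharpness = np.trace(H)

# Hutchinson's estimator of tr(H): average v^T H v over random Rademacher
# probes v. In deep nets H @ v would come from a Hessian-vector product;
# here we use the explicit H.
m = 2000
vs = rng.choice([-1.0, 1.0], size=(m, d))
hutch = np.mean(np.einsum('md,de,me->m', vs, H, vs))

# Maximum Local Sensitivity (assumed definition): the largest singular
# value of the representation's input Jacobian. For a linear feature map
# h(x) = W x the Jacobian is just W.
W = rng.normal(size=(3, d))
mls = float(np.linalg.svd(W, compute_uv=False)[0])

print(sharpness, hutch, mls)
```

With 2000 probes the Hutchinson estimate agrees closely with the exact trace; in practice the estimator matters because full Hessians of deep networks are never materialized, only Hessian-vector products.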