Local properties of neural networks through the lens of layer-wise Hessians

We introduce a methodology for analyzing neural networks through the lens of layer-wise Hessian matrices. The local Hessian of each functional block (layer) is defined as the matrix of second derivatives of a scalar function with respect to the parameters of that layer. This concept provides a formal tool for characterizing the local geometry of the parameter space. We show that the spectral properties of local Hessians, such as the distribution of eigenvalues, reveal quantitative patterns associated with overfitting, underparameterization, and expressivity in neural network architectures. We conduct an extensive empirical study involving 111 experiments across 37 datasets. The results demonstrate consistent structural regularities in the evolution of local Hessians during training and highlight correlations between their spectra and generalization performance. These findings establish a foundation for using local geometric analysis to guide the diagnosis and design of deep neural networks. The proposed framework connects optimization geometry with functional behavior and offers practical insight for improving network architectures and training stability.
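To make the definition concrete, here is a minimal sketch of a layer-wise Hessian and its eigenvalue spectrum. It is not the authors' implementation. The abstract does not specify the scalar function, the architecture, or how the Hessian is computed, so everything here is an assumption chosen for illustration: a mean-squared-error loss, a two-layer tanh network, random data, central finite differences, and the names `forward`, `loss`, and `layer_hessian`.

```python
import numpy as np


def forward(params, x):
    """Hypothetical two-layer tanh MLP; params maps layer name -> weight matrix."""
    hidden = np.tanh(x @ params["layer1"])
    return hidden @ params["layer2"]


def loss(params, x, y):
    """Mean squared error: the scalar function assumed for this sketch."""
    return float(np.mean((forward(params, x) - y) ** 2))


def layer_hessian(params, layer, x, y, eps=1e-3):
    """Central finite-difference Hessian of the loss w.r.t. one layer's weights.

    All other layers are held fixed, so the result is the diagonal block of
    the full Hessian that belongs to `layer`.
    """
    w0 = params[layer]
    n = w0.size
    base = w0.ravel().copy()

    def f(flat):
        trial = dict(params)  # shallow copy; only the chosen layer is replaced
        trial[layer] = flat.reshape(w0.shape)
        return loss(trial, x, y)

    hess = np.zeros((n, n))
    for i in range(n):
        ei = np.zeros(n)
        ei[i] = eps
        for j in range(i, n):
            ej = np.zeros(n)
            ej[j] = eps
            second = (
                f(base + ei + ej) - f(base + ei - ej)
                - f(base - ei + ej) + f(base - ei - ej)
            ) / (4.0 * eps ** 2)
            hess[i, j] = hess[j, i] = second  # Hessian is symmetric
    return hess


# Toy data and weights (assumptions, not from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 3))
y = rng.normal(size=(32, 1))
params = {
    "layer1": 0.5 * rng.normal(size=(3, 4)),
    "layer2": 0.5 * rng.normal(size=(4, 1)),
}

# Eigenvalue spectrum of each layer's local Hessian, in ascending order.
spectra = {
    name: np.linalg.eigvalsh(layer_hessian(params, name, x, y))
    for name in params
}
for name, eigenvalues in spectra.items():
    print(name, eigenvalues)
```

Finite differences keep the sketch dependency-light, but they cost on the order of n² loss evaluations for a layer with n parameters. For realistic layers, automatic differentiation or Hessian-vector products would likely be the practical choice.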


