This paper finds that neural networks with lower decision boundary (DB) variability generalize better. Two new notions, algorithm DB variability and $(\epsilon, \eta)$-data DB variability, are proposed to measure decision boundary variability from the algorithm and data perspectives, respectively. Extensive experiments show significant negative correlations between decision boundary variability and generalizability. On the theoretical side, we propose two lower bounds based on algorithm DB variability that do not explicitly depend on the sample size. We also prove an upper bound of order $\mathcal{O}\left(\frac{1}{\sqrt{m}}+\epsilon+\eta\log\frac{1}{\eta}\right)$ based on data DB variability. This bound is convenient to estimate because it requires no labels, and it does not explicitly depend on the network size, which is usually prohibitively large in deep learning.
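
To make the label-free flavor of these measures concrete, the following is a minimal sketch and not the paper's formal definition. It assumes that variability from the algorithm perspective can be approximated by how often independently trained networks disagree on the same unlabeled inputs. The function names `disagreement_rate` and `mean_pairwise_disagreement`, and the toy predictions, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def disagreement_rate(preds_a, preds_b):
    """Fraction of inputs on which two models predict different classes."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both prediction lists must cover the same inputs")
    if not preds_a:
        raise ValueError("at least one input is required")
    differing = sum(1 for a, b in zip(preds_a, preds_b) if a != b)
    return differing / len(preds_a)


def mean_pairwise_disagreement(all_preds):
    """Average disagreement rate over every pair of training runs.

    all_preds holds one list of predicted class labels per run (for
    example, per random seed), all evaluated on the same unlabeled
    inputs. This is an assumed proxy, not the paper's estimator.
    """
    if len(all_preds) < 2:
        raise ValueError("need predictions from at least two runs")
    rates = [disagreement_rate(a, b) for a, b in combinations(all_preds, 2)]
    return sum(rates) / len(rates)


# Hypothetical predictions of three runs on the same four unlabeled inputs.
runs = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
]
variability_proxy = mean_pairwise_disagreement(runs)
```

No ground-truth labels appear anywhere in the computation; only the predicted classes of the trained networks are compared. Under the assumption above, a smaller value would indicate a more stable decision boundary across training runs.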