Generalization is a critical property of data-driven models, particularly deep learning models deployed in safety-critical applications. Robustness-based generalization bounds have gained attention as a principled, often data-dependent way to link a model's robustness to its generalization performance. However, most existing bounds are vacuous in practical settings: the upper bounds they yield greatly exceed the actual error rates, which limits their usefulness for real-world evaluation. While this issue is often attributed to the uncertainty term, a substantial part of the problem originates from the robustness term itself, particularly for the 0-1 loss. Existing approaches typically treat the robustness term as a global measure, ignoring its variation across sub-regions of the input space. In this work, we propose a generalization bound that addresses this limitation by scaling the robustness term according to the number of stable and unstable samples within each sub-region. The bound incorporates both data- and model-dependent factors while remaining practically relevant, in that it yields tighter upper bounds on the true error. Experiments on models trained on the ImageNet dataset show that our bound remains consistently non-vacuous and gives the tightest estimates among existing methods, closely aligning with empirical performance across a range of robust deep neural networks.
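To make the "robustness term" and "uncertainty term" concrete, the sketch below evaluates the classical robustness-based bound as it is commonly stated (Xu and Mannor, 2012): with probability at least 1 - delta, the true error is at most the empirical error plus eps plus M * sqrt((2 K ln 2 + 2 ln(1/delta)) / n), where K is the number of cells partitioning the input space and M bounds the loss. This is background only, not the bound proposed in this work, whose exact form is not given here. The function names and all numbers in the usage note are illustrative assumptions.

```python
import math


def classical_robustness_bound(emp_error, eps, num_cells, n, delta=0.01, loss_max=1.0):
    """Classical robustness-based upper bound on the true error (background sketch).

    emp_error : empirical error on n i.i.d. samples
    eps       : global robustness term (max loss variation within a cell)
    num_cells : K, number of cells partitioning the input space
    delta     : the bound holds with probability at least 1 - delta
    loss_max  : M, upper bound on the loss (1.0 for the 0-1 loss)

    Returns (bound, uncertainty_term).
    """
    if n <= 0 or num_cells <= 0 or not (0.0 < delta < 1.0):
        raise ValueError("need n > 0, num_cells > 0 and 0 < delta < 1")
    uncertainty = loss_max * math.sqrt(
        (2 * num_cells * math.log(2) + 2 * math.log(1 / delta)) / n
    )
    return emp_error + eps + uncertainty, uncertainty


def is_vacuous(bound, loss_max=1.0):
    """A bound on the error is vacuous once it reaches the trivial maximum."""
    return bound >= loss_max


if __name__ == "__main__":
    # Hypothetical setting: 25% empirical error, K = 1000 cells, n = 50000 samples.
    for eps in (0.0, 1.0):
        bound, unc = classical_robustness_bound(0.25, eps, num_cells=1000, n=50000)
        print(f"eps={eps}: uncertainty={unc:.3f}, bound={bound:.3f}, "
              f"vacuous={is_vacuous(bound)}")
```

Under the 0-1 loss, the loss difference between two points in the same cell is either 0 or 1, so the smallest valid global eps is 0 or 1. In the hypothetical setting above the uncertainty term alone stays well below 1, yet a single cell containing both a correctly and an incorrectly classified point forces eps = 1 and makes the whole bound vacuous. This is the behavior that scaling the robustness term per sub-region is meant to avoid.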