$\varphi$曲线：基于范数容量控制视角下的泛化形态 (The $\varphi$ Curve: The Shape of Generalization through the Lens of Norm-based Capacity Control)

Understanding how the test risk scales with model complexity is a central question in machine learning. Classical theory is challenged by the learning curves observed for large over-parametrized deep networks. Capacity measures based on parameter count typically fail to account for these empirical observations. To tackle this challenge, we consider norm-based capacity measures and develop our study for random features based estimators, widely used as simplified theoretical models for more complex networks. In this context, we provide a precise characterization of how the estimator's norm concentrates and how it governs the associated test error. Our results show that the predicted learning curve admits a phase transition from under- to over-parameterization, but no double descent behavior. This confirms that more classical U-shaped behavior is recovered considering appropriate capacity measures based on models norms rather than size. From a technical point of view, we leverage deterministic equivalence as the key tool and further develop new deterministic quantities which are of independent interest.

翻译：理解测试风险如何随模型复杂度变化是机器学习的核心问题。经典理论难以解释大型过参数化深度网络所呈现的学习曲线。基于参数数量的容量度量通常无法解释这些经验观察结果。为应对这一挑战，我们考虑基于范数的容量度量，并以随机特征估计器（广泛用作复杂网络的简化理论模型）为研究对象展开分析。在此框架下，我们精确刻画了估计器范数的集中特性及其对测试误差的调控机制。研究结果表明：预测学习曲线呈现从欠参数化到过参数化的相变现象，但未出现双下降行为。这证实了基于模型范数（而非规模）的恰当容量度量能够恢复更经典的U形行为。从技术角度看，我们以确定性等价为核心工具，并进一步构建了具有独立价值的新型确定性量。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

31+阅读 · 2021年9月29日

Query2box: 使用盒嵌入对向量空间中的知识图谱进行推理，Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings

专知会员服务

46+阅读 · 2020年5月11日