We show that, in a precise sense, a broad class of feedforward neural networks is PAC learnable: every fixed finite feedforward architecture whose layers are definable in an o-minimal structure has finite sample complexity in the agnostic PAC setting, even when its parameters are unbounded. This covers standard fixed-size MLPs, CNNs, GNNs, and transformers with fixed sequence length, together with the operations and layers typically used in such architectures, including linear projections, residual connections, attention mechanisms, pooling layers, normalization layers, and admissible positional encodings. Hence, distribution-free learnability of modern non-recurrent architectures is not an exceptional property of particular activations, nor something that requires architecture-specific VC arguments, but a consequence of tame feedforward computation. Our results reposition finite-sample PAC learnability as a baseline rather than a differentiator: they shift the focus of architectural comparison toward inductive biases, symmetries and geometric priors, scalability, and optimization behaviour.