A universal linearized subspace refinement framework for neural networks

Neural networks are predominantly trained using gradient-based methods, yet in many applications their final predictions remain far from the accuracy attainable within the model's expressive capacity. We introduce Linearized Subspace Refinement (LSR), a general and architecture-agnostic framework that exploits the Jacobian-induced linear residual model at a fixed trained network state. By solving a reduced direct least-squares problem within this subspace, LSR computes a subspace-optimal solution of the linearized residual model, yielding a refined linear predictor with substantially improved accuracy over standard gradient-trained solutions, without modifying network architectures, loss formulations, or training procedures. Across supervised function approximation, data-driven operator learning, and physics-informed operator fine-tuning, we show that gradient-based training often fails to access this attainable accuracy, even when local linearization yields a convex problem. This observation indicates that loss-induced numerical ill-conditioning, rather than nonconvexity or model expressivity, can constitute a dominant practical bottleneck. In contrast, one-shot LSR systematically exposes accuracy levels not fully exploited by gradient-based training, frequently achieving order-of-magnitude error reductions. For operator-constrained problems with composite loss structures, we further introduce Iterative LSR, which alternates one-shot LSR with supervised nonlinear alignment, transforming ill-conditioned residual minimization into numerically benign fitting steps and yielding accelerated convergence and improved accuracy. By bridging nonlinear neural representations with reduced-order linear solvers at fixed linearization points, LSR provides a numerically grounded and broadly applicable refinement framework for supervised learning, operator learning, and scientific computing.
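As a rough illustration of the one-shot step described above, the following sketch linearizes a small network at a fixed parameter state, restricts the correction to a low-dimensional subspace, and solves the resulting least-squares problem directly. Everything concrete here is an assumption made for illustration and is not taken from the paper: the one-hidden-layer tanh `model`, the helper names `jacobian`, `one_shot_lsr`, and `refined_predict`, the choice of subspace (the `k` leading right singular vectors of the Jacobian), and the use of a random `theta` as a stand-in for a gradient-trained state.

```python
import numpy as np

def model(theta, x):
    """One-hidden-layer tanh network; theta packs (w, b, a), each of length m."""
    w, b, a = np.split(theta, 3)
    return np.tanh(np.outer(x, w) + b) @ a

def jacobian(theta, x):
    """Analytic Jacobian of model outputs w.r.t. theta, shape (len(x), len(theta))."""
    w, b, a = np.split(theta, 3)
    h = np.tanh(np.outer(x, w) + b)              # hidden activations, (n, m)
    dh = (1.0 - h**2) * a                        # a_j * tanh'(z_j), (n, m)
    return np.hstack([dh * x[:, None], dh, h])   # columns ordered as (w, b, a)

def one_shot_lsr(theta, x, y, k):
    """Least-squares solve of the linearized residual model in a k-dim subspace."""
    r = y - model(theta, x)                      # residual at the fixed state
    J = jacobian(theta, x)
    # Assumed subspace: the k leading right singular vectors of J.
    _, _, Vt = np.linalg.svd(J, full_matrices=False)
    V = Vt[:k].T                                 # (p, k) orthonormal basis
    c, *_ = np.linalg.lstsq(J @ V, r, rcond=None)
    return V @ c                                 # correction in parameter space

def refined_predict(theta, delta, x):
    """Refined linear predictor: f(x; theta) + J(x; theta) @ delta."""
    return model(theta, x) + jacobian(theta, x) @ delta

rng = np.random.default_rng(0)
m = 10
theta = rng.normal(size=3 * m)       # hypothetical stand-in for a trained state
x = np.linspace(-1.0, 1.0, 200)
y = np.sin(np.pi * x)                # toy regression target

delta = one_shot_lsr(theta, x, y, k=12)
err_before = np.linalg.norm(y - model(theta, x))
err_after = np.linalg.norm(y - refined_predict(theta, delta, x))
print(f"residual norm before: {err_before:.3e}, after: {err_after:.3e}")
```

Because a zero correction is always admissible, the refined residual on the fitting points cannot exceed the original one. How much it drops depends on the subspace dimension `k` and on the conditioning of the Jacobian. Note that the network parameters themselves are left unchanged: the refinement lives entirely in the linear predictor built around the fixed state.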
