The inability to understand and predict the behavior of deep learning systems makes it hard to decide which architecture and algorithm to use for a given problem. In science and engineering, modeling is a methodology for understanding complex systems whose internal processes are opaque: it replaces a complex system with a simpler, more interpretable surrogate. Drawing inspiration from this, we construct a class of surrogate models for neural networks using Gaussian processes. Rather than deriving kernels for infinite neural networks, we learn kernels empirically from the naturalistic behavior of finite neural networks. We demonstrate that our approach captures existing phenomena related to the spectral bias of neural networks, and then show that our surrogate models can be used to solve practical problems such as identifying which points most influence the behavior of specific neural networks and predicting which architectures and algorithms will generalize well for specific datasets.
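As a rough illustration of what learning a kernel empirically from observed function behavior can look like, the sketch below selects a Gaussian process kernel hyperparameter by maximizing the marginal likelihood summed over a collection of observed functions. Everything specific here is an assumption made for illustration and is not taken from the abstract: the squared-exponential kernel family, the single lengthscale hyperparameter, the grid search, and the helper names `rbf_kernel`, `log_marginal_likelihood`, and `fit_lengthscale`. The "network predictions" are synthetic stand-ins drawn from a known Gaussian process so that recovery of the lengthscale can be checked. In practice each row would instead hold the predictions of one trained finite network on a shared set of inputs.

```python
import numpy as np


def rbf_kernel(x, lengthscale, variance=1.0):
    """Squared-exponential kernel matrix for 1-D inputs x (assumed kernel family)."""
    diff = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def log_marginal_likelihood(K, y, noise=1e-2):
    """Log density of y under a zero-mean GP with covariance K + noise * I."""
    n = len(y)
    chol = np.linalg.cholesky(K + noise * np.eye(n))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    return (
        -0.5 * y @ alpha
        - np.log(np.diag(chol)).sum()
        - 0.5 * n * np.log(2.0 * np.pi)
    )


def fit_lengthscale(x, function_samples, candidates, noise=1e-2):
    """Grid search: pick the lengthscale with the highest summed log marginal likelihood."""
    scores = []
    for ls in candidates:
        K = rbf_kernel(x, ls)
        scores.append(sum(log_marginal_likelihood(K, y, noise) for y in function_samples))
    return candidates[int(np.argmax(scores))], scores


# Synthetic stand-in for network behavior: 20 functions evaluated on 40 shared inputs,
# drawn from a GP with a known lengthscale. Real usage would substitute the predictions
# of independently trained networks here.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 40)
true_lengthscale = 1.0
sample_chol = np.linalg.cholesky(rbf_kernel(x, true_lengthscale) + 1e-2 * np.eye(len(x)))
function_samples = [sample_chol @ rng.standard_normal(len(x)) for _ in range(20)]

candidates = np.array([0.1, 0.3, 1.0, 3.0])
best_lengthscale, scores = fit_lengthscale(x, function_samples, candidates)
print("selected lengthscale:", best_lengthscale)
```

A grid search keeps the sketch dependency-free. A gradient-based optimizer over a richer kernel family would be the more usual choice when the hyperparameter space is larger.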