As supported by abundant experimental evidence, neural networks are state-of-the-art for many approximation tasks in high-dimensional spaces. Still, we lack a rigorous theoretical understanding of what they can approximate, at what cost, and to what accuracy. A network architecture of practical use, especially for approximation tasks involving images, is the (residual) convolutional network. However, due to the locality of the linear operators involved in these networks, their analysis is more complicated than that of fully connected neural networks. This paper deals with the approximation of time sequences where each observation is a matrix. We show that with relatively small networks, we can represent exactly a class of numerical discretizations of PDEs based on the method of lines. We derive these results constructively by exploiting the connections between discrete convolution and finite difference operators. Our network architecture is inspired by those typically adopted in the approximation of time sequences. We support our theoretical results with numerical experiments simulating the linear advection, heat, and Fisher equations.
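The connection between discrete convolution and finite differences that the abstract alludes to can be illustrated concretely. The following minimal sketch (our own illustration, not code from the paper) shows that a 1D convolution with the kernel [1, -2, 1] / h² reproduces the standard second-order central finite-difference approximation of the second derivative:

```python
import numpy as np

h = 0.01
x = np.arange(0.0, 1.0 + h, h)
u = np.sin(2 * np.pi * x)  # sample a smooth test function

# Central finite-difference second derivative at interior points.
fd = (u[:-2] - 2 * u[1:-1] + u[2:]) / h**2

# The same operator written as a discrete convolution
# ('valid' mode keeps interior points only; the kernel is symmetric,
# so the flip performed by np.convolve has no effect).
kernel = np.array([1.0, -2.0, 1.0]) / h**2
conv = np.convolve(u, kernel, mode="valid")

assert np.allclose(fd, conv)  # identical up to floating-point round-off

# Both approximate u''(x) = -(2*pi)^2 * sin(2*pi*x) to second order in h.
err = np.max(np.abs(conv + (2 * np.pi) ** 2 * np.sin(2 * np.pi * x[1:-1])))
print(f"max approximation error: {err:.3e}")
```

In two dimensions the same correspondence holds between convolutional layers acting on matrices and finite-difference stencils on a grid, which is what makes convolutional networks a natural fit for method-of-lines discretizations of PDEs.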