We study an infinite-dimensional optimization problem that aims to identify the Nemytskii operator in the nonlinear part of a prototypical semilinear elliptic partial differential equation (PDE) that minimizes the distance between the PDE solution and a given desired state. In contrast to previous works, we consider this identification problem in a low-regularity regime in which the function inducing the Nemytskii operator is a priori only known to be an element of $H^1_{\mathrm{loc}}(\mathbb{R})$. This makes the studied problem class a suitable point of departure for the rigorous analysis of training problems for learning-informed PDEs in which an unknown superposition operator is approximated by a neural network with nonsmooth activation functions (ReLU, leaky ReLU, etc.). We establish that, despite the low regularity of the controls, it is possible to derive a classical stationarity system for local minimizers and to solve the considered problem by means of a gradient projection method. The convergence of the resulting algorithm is proven in the function space setting. It is further shown that the established first-order necessary optimality conditions imply that locally optimal superposition operators share various characteristic properties with commonly used activation functions: they are always sigmoidal, continuously differentiable away from the origin, and typically possess a distinct kink at zero. The paper concludes with numerical experiments that confirm the theoretical findings.
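To fix ideas, the gradient projection iteration mentioned above can be sketched on a finite-dimensional surrogate problem (a hypothetical toy example, not the paper's function-space setting): minimize $f(u) = \tfrac12\|u - b\|^2$ over the box $[0,1]^n$, whose unique minimizer is the componentwise projection of $b$ onto the box. The names `project`, `grad_f`, and `gradient_projection` are illustrative choices.

```python
# Minimal sketch of a gradient projection method on a finite-dimensional
# surrogate problem. The paper's actual problem is posed in function space;
# this toy example only illustrates the iteration u <- P(u - step * grad f(u)).

def project(u, lo=0.0, hi=1.0):
    """Project each component of u onto the interval [lo, hi]."""
    return [min(max(x, lo), hi) for x in u]

def grad_f(u, b):
    """Gradient of f(u) = 0.5 * ||u - b||^2, i.e., u - b."""
    return [ui - bi for ui, bi in zip(u, b)]

def gradient_projection(b, u0, step=0.5, max_iter=200, tol=1e-10):
    """Iterate u <- project(u - step * grad_f(u)) until the update is small."""
    u = list(u0)
    for _ in range(max_iter):
        g = grad_f(u, b)
        u_new = project([ui - step * gi for ui, gi in zip(u, g)])
        if max(abs(a - c) for a, c in zip(u, u_new)) < tol:
            return u_new
        u = u_new
    return u

# The iterates converge to the projection of b onto [0, 1]^3.
b = [1.5, -0.3, 0.7]
u = gradient_projection(b, [0.0, 0.0, 0.0])
```

For the simple quadratic objective used here, any fixed step size below $2$ yields geometric convergence; the paper proves convergence of the corresponding method in the infinite-dimensional setting.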